A Story of a 1000x Algorithmic Runtime Performance Improvement
We needed a fast optimization algorithm for our integrated logistics and order satisfaction planning enterprise solution, targeted specifically for the poultry products manufacturers. By algorithm, I don’t mean a piece of code like sorting or finding shortest distance here, but rather a collection of algorithms to solve a large problem which has various aspects.
The reason we needed a fast algorithm was due to the nature of the problem. As most of the real world optimization problems, the problem we tackle in this case is a combinatorial optimization problem, meaning that you need to try as many alternatives as possible to increase your chance of finding the best solution. This doesn’t mean a brute force approach though, far from it actually, as all the computers combined in the world cannot try all the combinations, even if they were to be run the entire time of the human history. And this is for only one day’s problem, you will get an entirely different scenario for the next day. So, you gotta be smart about it. In any case, having a fast algorithm which can solve as many cases as possible in a relatively short time would certainly help the overall success of the system.
To do that you will first need to come up with an algorithm with a low theoretical overall computational complexity, and then implement it efficiently. I will mainly talk about how to put your design into practice here.
At this point you have two alternatives,
Knowing that you need a high performance algorithm, you code proactively. You write your code paying attention to stuff which you think might cause you some performance loss, maybe you throw around some Maps here and there for indexing and fast search by key for example.
I don’t know which approach you liked as you read, but I can tell you that most of the people tend to choose the first approach to avoid future work or for being idealist or whatever reason. So what people usually do, they do their best and hope for the best result :)
However, beware that there are few problems associated with the first approach which might not be apparent immediately.
You are constantly trying to solve a problem which you predict you will have in the future. In other words you are doing tomorrow’s work today, to avoid the potential of doing it in the future. This is one of the most basic fallacies software developers and designers keep constantly doing, and I had my fair share in it too. That is, people constantly and proactively try to solve a problem or put a feature or extend the design to accommodate needs which might be needed in the future. These needs never arise! Even if they did, it usually is not exactly the way you predicted them to be. So here you are, doing the work again, which you did some extra work to avoid doing it in the first place. Reflecting back to our performance optimization case, the potential bottlenecks you predicted may not turn out to be real or impacting in any way.
The code you put here and there to prevent potential performance issues, may turn out to be useless, and therefore cluttering your code making it harder to understand and maintain. Any code piece which does not have clear objective or well defined function is hard to understand.
Taking the first approach does not automatically mean that you do not pay attention to stuff like object oriented patterns and design principles. However predictive problem solving usually causes the lean design to deviate to unnecessary dimensions and become, well, fat.
Having taken the first approach many times in the past and learning from my mistakes, I took a different approach this time, the second one :)
One fallacy that people may fall while applying this approach is, they interpret this as no need for an overall design or vision. This is absolutely not the case, and may lead you to wrong destination instead. The essence here is not to put excessive effort and get stuck in details which you anticipate as being important for improved performance. That being said, if you are taking this approach, you are accepting beforehand that you will make improvements and do some refactoring during or after you have your functioning algorithm. So what is the first step for improvement?
You need to measure in order to be able to make improvements. Otherwise you will not be able to tell if you made an improvement or not. It is wise to use some benchmarking tool to keep record of the overall or component wise performance of your algorithm. JMH will do the job if you are using Java for instance. Besides keeping track of the overall performance, you will need to measure the various aspects of the system like CPU and memory usages in detail. Profiling tools are extremely useful in this context. They can help you monitor things like the overall CPU and memory usage, garbage collector run times and method level CPU times and so on. They are crucial in identifying potential performance issues, which brings us to the second step of the improvement cycle.
The second step of an improvement cycle is to name things, that is define clear targets and objectives which can be acted upon. If you do not have a clear objective and target, you are likely to do unnecessary work, which does not serve your purpose as I explained before. You may use models like Theory of Constraints to identify problems. This was the way I went. To go down the same road, you first identify the most constraining part of the system, improve it, remeasure and repeat the cycle.
Refactoring is the name of the game at this step. You identified the source of a problem at the previous step and if you also identified a possible resolution for the issue, the only thing left to do is refactoring your existing code to incorporate the new approach to your existing code. You are most likely to introduce new test cases and modify the existing unit tests during this step. The key here is to only incorporate enough changes to resolve the issue and no more. Kind of an extreme programming (XP) approach. This is the way you keep your design and code lean.
By repeating this cycle; measure, identify and action; you keep improving your solution at every iteration, until it is no more feasible or meaningful to introduce new improvements. By meaningful, I mean the extra effort required for the next improvement is not worth or cannot be justified for the expected performance gain.
Following this recipe, I had the first working and fully functioning algorithm which took about 2 minutes to run with a real problem data. Hardly identifiable as a fast algorithm for combinatorial optimization problems. Keep in mind that I did not do any premature code optimization at this point. After the first improvement cycle, the algorithm took about 10 seconds to run with the same data, which means I had more than 10 times performance improvement after performing only one improvement cycle. After a few more cycles, I was able to gain another 10 times improvement which made a total of 100 times performance gain. At this stage the algorithm was able to run around 1 second. But there were still some potential. After many more iterations, I finally reached a point where the algorithm is able to run around 100 milliseconds, which means more than 1000 times faster than its inception. Now we are able to evaluate 1000s of alternatives in a second on a parallel processing environment (single machine), and propose the best production and logistics plan in minutes to our customers which is far more superior than any plan they ever could hope to make by hand.