Thursday, April 8, 2010

Performance is not magic

A few colleagues and I spent a good part of today examining programs, looking for performance problems. We have several C++ programs that run, but slower than we would like. They also run slower than the older generation of C++ programs, in some areas significantly slower.

It turns out that the performance issues are caused by STL. The previous generation of programs used "straight" C++ without STL. Our new programs make heavy use of the library. We like STL, since it makes programming much easier. The STL allows for easy use of strings and collections, and we don't have to worry about memory allocation...

And that's the cause of our problems.

The STL collections, from simple strings (which can be viewed as collections of 'char' elements) to vectors and maps, require overhead. It's not much, and the implementors have done a lot of work to make the STL efficient, but the difference remains. With our data set (hundreds of thousands of items) the small differences add up to noticeable performance problems.

It's like we wrote the program in Java or C#, with their built-in memory management and garbage collection. Actually, it's a bit worse, since STL containers keep their own copies of everything they hold and don't take advantage of copy-on-write or shared references. The library instantiates and copies data with wild abandon. (OK, perhaps not wild abandon. But it makes lots of copies.)
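To make the copying concrete, here is a minimal sketch that counts it. This is not our production code: `Item`, `copies_when_filling`, `total_length_by_value`, and `total_length_by_ref` are hypothetical names made up for illustration. `Item` declares only a copy constructor, so it has no implicit move constructor, and every transfer inside the vector shows up in the count.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record type, instrumented to count copy constructions.
// Declaring the copy constructor suppresses the implicit move constructor,
// so every transfer of an Item is counted as a copy.
struct Item {
    std::string name;
    static std::size_t copies;

    explicit Item(const std::string& n) : name(n) {}
    Item(const Item& other) : name(other.name) { ++copies; }
};

std::size_t Item::copies = 0;

// Fill a vector with n items and report how many Item copies that took.
// Without reserve(), the vector reallocates as it grows and re-copies
// the elements it already holds.
std::size_t copies_when_filling(std::size_t n, bool reserve_first) {
    Item::copies = 0;
    std::vector<Item> items;
    if (reserve_first) {
        items.reserve(n);
    }
    for (std::size_t i = 0; i < n; ++i) {
        items.push_back(Item("x"));  // the temporary is copied into the vector
    }
    assert(items.size() == n);
    return Item::copies;
}

// Pass by value: the caller's vector, and every Item in it, is copied.
std::size_t total_length_by_value(std::vector<Item> items) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        total += items[i].name.size();
    }
    return total;
}

// Pass by const reference: same result, no Items copied.
std::size_t total_length_by_ref(const std::vector<Item>& items) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        total += items[i].name.size();
    }
    return total;
}
```

With `reserve_first` set, filling the vector costs one copy per item. Without it, the count is higher by however many elements the vector re-copies while growing, which depends on the library's growth policy. The by-value function adds one more copy of every item on each call. None of this is a bug in the STL. It is simply what value semantics cost when nobody is watching.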

Our previous generation of programs was optimized for memory usage, at the cost of manually tracking every object in memory. Those programs are also significantly larger, with five times as many lines of code. They were fast to run but slow to write and slow to maintain. (Read "slow" as "expensive".)

The realization of STL performance problems raised and lowered our spirits. It raised them, since we finally understood the cause of the problem. It lowered them, as we also understood the solution.

As Heinlein said in "The Moon Is a Harsh Mistress": There Ain't No Such Thing As A Free Lunch. If we want shorter development cycles and understandable code, we can use STL, but the price is lower run-time performance. If we want fast programs, we have to avoid STL (or perhaps reduce our use of it) and write the traditional hand-optimized code.

Do we want to drink our poison from the large cup or the small cup?

