Wednesday, March 17, 2010

The magic twenty percent

The current project is maintenance on a library that reads and writes spreadsheet files (Lotus 1-2-3 and Microsoft Excel). The term 'maintenance' is a bit misleading, as the project is really a re-write of the existing code.

But here's the interesting thing: our new code is much smaller than the previous version. Much, much smaller. Not ten percent or twenty percent smaller, but eighty percent smaller. The original code was roughly 98,000 lines. The new code is about 20,000 lines. And the new code does everything that the old code does.

So the question is: how can I write code that is so small, so compact, so efficient? Am I that good of a programmer?

This is not the first time that I have replaced code with more efficient code. On a previous project I rewrote a system and reduced the code from almost 30,000 lines to about 3,000 lines. On that project, I replaced Java code with C#. Since Java and C# are at the same level (just about), the reduction could not have been due to the change in language. I think the benefit came from the change to the data format. The original system read XML files and processed them as scripts, using a lot of auxiliary Java classes and a limited scripting language. The replacement system (in C#/.NET) used plain text scripts and required no auxiliary classes. This was due to the capabilities of the new scripting language, not the capabilities of C# or limitations of Java. Changing the data (the scripts) to a simpler and more expressive form made the coding much easier.

The current assignment has a different twist. In this assignment, the data is not changing (it's all spreadsheet files for Lotus 1-2-3 or Excel) but the language is helping. The previous program used "straight" C++; the new program uses C++ with the STL. Having the STL's containers and algorithms available, in effect a richer language, made the coding much easier.
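To make the difference concrete, here is a minimal sketch of the kind of saving the STL offers. It is not code from the library itself; the `Sheet` class and its member names are hypothetical. In "straight" C++, a sparse grid of cells would typically need a hand-written container plus its own allocation and cleanup code. With the STL, a `std::map` keyed on (row, column) covers storage, lookup, and cleanup in a few lines:

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical sparse worksheet; the names are illustrative only.
class Sheet {
    typedef std::pair<int, int> Key;   // (row, column)
    std::map<Key, std::string> cells_; // only cells that hold a value

public:
    // Store (or overwrite) the text of one cell.
    void set(int row, int col, const std::string& value) {
        cells_[std::make_pair(row, col)] = value;
    }

    // Return the stored text, or an empty string for an unset cell.
    std::string get(int row, int col) const {
        std::map<Key, std::string>::const_iterator it =
            cells_.find(std::make_pair(row, col));
        return it == cells_.end() ? std::string() : it->second;
    }

    // Number of cells that currently hold a value.
    std::size_t size() const { return cells_.size(); }
};
```

There is no destructor, copy constructor, or assignment operator to write: the map manages its own memory, which is where much of the bulk in hand-rolled container code tends to live.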

There are some other factors that contributed. On both assignments, I was the sole programmer. As the only person coding, I was able to make decisions quickly and without debate.

But the real factor here was the automated testing. Both projects had automated tests in place. With them, I was able to make sweeping changes as I needed. Most programmers want to make such changes, but refrain because of the risk. Managers and code architects shy away from major changes, because major changes carry large risks.

Yet with automated tests, the risks were small. I could make large changes (I made some large changes today) and then run tests to ensure that the system was working as expected. My changes were not correct, initially, and I made corrections. With automated tests I can make grand, sweeping changes that improve the code... and know that the changes are correct.
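For a library that reads and writes files, one common shape for such a test is the round trip: write some data out, read it back, and check that nothing changed. The sketch below illustrates the idea under loud assumptions. The record format (one "row col text" line per cell) and the functions `write_cells`, `read_cells`, and `round_trips` are all hypothetical, and the format is far simpler than a real Lotus 1-2-3 or Excel file; it also assumes cell text contains no newlines.

```cpp
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical in-memory form: (row, column) -> cell text.
typedef std::map<std::pair<int, int>, std::string> Cells;

// Hypothetical writer: one "row col text" record per line.
void write_cells(const Cells& cells, std::ostream& out) {
    for (Cells::const_iterator it = cells.begin(); it != cells.end(); ++it) {
        out << it->first.first << ' ' << it->first.second << ' '
            << it->second << '\n';
    }
}

// Hypothetical reader for the format produced by write_cells.
Cells read_cells(std::istream& in) {
    Cells cells;
    int row, col;
    while (in >> row >> col) {
        in.get();                // skip the single separator space
        std::string text;
        std::getline(in, text);  // rest of the line is the cell text
        cells[std::make_pair(row, col)] = text;
    }
    return cells;
}

// The automated check: whatever is written must read back unchanged.
bool round_trips(const Cells& original) {
    std::stringstream buffer;
    write_cells(original, buffer);
    return read_cells(buffer) == original;
}
```

A check like `round_trips` says nothing about how the reader and writer are implemented, so the internals can be rewritten wholesale and the test still reports whether the behavior survived.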

So with two data points (not a lot, I admit) I am looking at large reductions in code due to automated testing. I couple that with the conjecture that reduced code size leads to reduced development costs (fewer developers, fewer defects) and better software.

