Tuesday, March 2, 2010

Modest success with XLS

Today, shortly after 14h (that's 2:00 PM in civilian notation) I had a modest success with spreadsheet utilities.

I've been working with spreadsheet files since October. With hard work, creative programming, documentation, and a few guesses, I have successfully put together programs to read Lotus 1-2-3 files (WK1, WK3, WK4, and 123 formats) and Microsoft Excel files (XLS and XLSX formats). The programs read the files directly, yank out the contained values, and write them to text.

I've also been working on programs to do the reverse: Start with text and create the spreadsheet file. This is a bit trickier, since spreadsheet files contain lots and lots of information, and when reading the file one can ignore most of it. The task of reading the file is a matter of finding the important bits, converting them to plain text, and skipping everything else.

Creating a spreadsheet file requires generating all of the information for the file. You can't leave out records because you don't understand them. The file must contain enough information to be acceptable to the target program. To create an acceptable file, you have to understand a lot more about the file that when converting it to plain text.

In the past few weeks, I've successfully written Lotus 1-2-3 WK1 and WK3 files. Now I am working on Excel XLS files. (I'm not writing Lotus WK4 or 123 files because we don't have enough information about the file formats.)

Today I had a minor success. This afternoon I ran a series of programs that started with a text file and produced an .XLS file. I ran MS Excel and opened the .XLS file, and it worked! MS-Excel displayed the file with the proper values.

I say "minor success" because the starting file (the text file) was very carefully crafted to result in a proper XLS file. In fact, our "source file" was derived from an XLS file created by MS-Excel. We took an existing file, converted it to text, and then converted it back to XLS. (The process was a little more complex than that, but the sentiment is correct.) One might say that this was cheating, or perhaps a simple matter of conversions between encoding.

Yet the work is not without value. We now have a process to take a text file and create an XLS file. With this process, we can experiment and learn more about the XLS format. The Microsoft documentation is detailed, but it is vague on several topics. With some experiments, we can get a reliable XLS file creator. This process lets us create files at a high level (higher than the bits-and-bytes of the Microsoft document) and quickly test our theories.

My success in this was due, as usual, to our test framework. I created programs, but before that, I created tests and expected results. Once I had those in place, it was easy to create a program that read the input and provided the output. (Okay, perhaps not "easy"... but at least I knew when changes got me closer to the goal.) Our framework is simple: a few scripts (batch files), some test data, and expected results. Simple, yet adequate.

So today was a pretty good day.



No comments:

Post a Comment