Wednesday, April 27, 2016

Data tools pay for themselves over time

Several years ago I worked on a project that required we pick apart Excel .XLS files at the byte level. We didn't use Microsoft's COM components to read the file. We didn't use a third-party library. We read the file (in C++ and in binary mode) and did exactly what Microsoft recommends we do not do: we parsed the file ourselves.

As part of this effort, I built a small utility to "dump" the contents of an Excel file. The common programs extract cell values, but I wanted something more detailed. I wanted a list of the various BIFF records in the file. (Microsoft Excel .XLS files consist of a series of binary records, each record describing some aspect of the contents. Each record contains a type field, a length field, and a set of fields that depends on the type of record, and some fields have a variable size. It's a compact and powerful format.)
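The record layout described above (a type field, a length field, then type-specific data) can be walked with a short loop. What follows is a minimal sketch in Python, not the original DumpXls code. It assumes the BIFF byte stream has already been extracted from the .XLS file's outer container, and the names `iter_biff_records`, `dump`, and `RECORD_NAMES` are made up for illustration. The sample bytes are synthetic, not taken from a real workbook.

```python
import struct
from typing import Iterator, List, Tuple

# Two well-known BIFF record types; every other type is shown as "?".
RECORD_NAMES = {
    0x0809: "BOF",
    0x000A: "EOF",
}


def iter_biff_records(stream: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (record_type, payload) pairs from a raw BIFF byte stream."""
    offset = 0
    # Stop when fewer than four bytes remain; there is no room for a header.
    while offset + 4 <= len(stream):
        # Header: 2-byte record type, then 2-byte payload length, little-endian.
        rec_type, length = struct.unpack_from("<HH", stream, offset)
        offset += 4
        payload = stream[offset:offset + length]
        if len(payload) < length:
            raise ValueError(
                f"record 0x{rec_type:04X} is truncated at offset {offset}"
            )
        yield rec_type, payload
        offset += length


def dump(stream: bytes) -> List[str]:
    """Return one summary line per record: type, name (if known), length."""
    lines = []
    for rec_type, payload in iter_biff_records(stream):
        name = RECORD_NAMES.get(rec_type, "?")
        lines.append(f"0x{rec_type:04X} {name:<4} len={len(payload)}")
    return lines


if __name__ == "__main__":
    # Synthetic stream: a BOF record with a 2-byte payload, then an empty EOF.
    sample = (
        struct.pack("<HH", 0x0809, 2) + b"\x00\x06"
        + struct.pack("<HH", 0x000A, 0)
    )
    for line in dump(sample):
        print(line)
```

The sketch only lists records. Decoding the fields inside each payload is where the real work (and the real understanding of the format) lies.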

The utility program (called "DumpXls", cleverly enough) was useful to identify the different records needed to construct a proper .XLS file. It took some time to create, time that turned out to be an investment.

It was a useful investment, because this past week I had a problem with the Python 'xlwt' library. (The 'xlwt' library lets Python create an .XLS file.) My old "DumpXls" utility helped me diagnose the problem and find a solution... and quickly. The time spent creating that program years ago was more than repaid by the time saved in finding this week's problem.

The "dump" program was simple, yet it required a good understanding of the file format. When working with a new file format, building such a "dump" program is a good way to learn the format. The program is useful, as is the knowledge of the format.

Sunday, March 13, 2016

Success with Ruby and BASIC

A while back, I started a project to learn Ruby. The project was to implement a version of BASIC, using that as a means to learn the Ruby language.

I am happy with my progress. I've learned a great deal about Ruby -- and learned more about programming. Ruby has a different "take" on programming than C++, Java, and C#, and those differences forced me to re-think a lot of ideas about programming. The design patterns for Ruby are different from the patterns of the "classical" object-oriented languages. I found that much of my code followed the classic patterns, and it worked poorly. Only after I changed my thinking about programming and my designs in Ruby did my code "work".

Rubocop helped too. It complained about many things, from spacing around operators (or lack thereof) to the number of lines and complexity of functions. At first I thought that some long functions would have to remain long -- in C++ or Java I would leave them as long functions. But with Rubocop's prodding, I redesigned them and made them smaller, many times breaking them into smaller functions.

I started using Rubocop after a significant amount of code had been written, and Rubocop reported somewhere north of 80 "violations". Over time, with thought and experimentation, I have reduced that number to one. The changes have, I believe, improved the code and made it more reliable and more readable.

My success has been due to the design of the Ruby language, the Rubocop tool, the Ruby documentation pages, and StackOverflow. These made it easy to develop in Ruby. Yet there were two other things: an open mind and time. I needed to change my thinking about programming, and accept the Ruby way of code. I also needed time. Time to think about the code, time to try things, and time to revise my initial code into something better.

Monday, January 18, 2016

Impressed with Python

I've been impressed with Python of late.

I'm using Python on a project for a client. The work involves computations with data in Microsoft Excel spreadsheets. Processing involves, at a basic level, the extraction of data from a spreadsheet, some calculations, and then the generation of a result spreadsheet (different from the first spreadsheet).

Python has been surprisingly useful here. The libraries 'xlrd' and 'xlwt' handle the reading and writing of data in spreadsheets, allowing me to focus on the computations.

Python has helped in other ways. It runs a lot of our tests, and summarizes the test results. It also drives the tests of an old MFC Windows application.

This last item is important. We had no way to test the GUI for this program -- other than manual testing. The 'PyWinAuto' package lets us use Python to "drive" the GUI and run tests.

I am impressed with PyWinAuto. Perhaps more impressed than others; some years ago I created a library (in C++, not Python) to drive MFC GUI programs. The task was not trivial and my attempt was clunky and inadequate. Yet it was that foray into Windows API calls and Windows controls that gives me the appreciation for the effort of PyWinAuto.

Over the years, I have worked with many languages: BASIC, Pascal, Fortran, C, C++, Java, Perl, C#, and even a little COBOL. My experiences with each language vary. Some were fun, others were frustrating. Python is in the former category, as I have been able to do just about everything I wanted, and with little effort.

Saturday, December 12, 2015

Backups - rsync and Google cloud storage

I've been using Google's cloud storage for backups, and I am impressed with what I have found.

My previous method was to use rsync and store backups on a local server. It works, although the local server is old -- and a bit small. Eventually, I would need a larger disc.

Google's cloud services remove that problem.

Synchronizing with the Google store is, interestingly, faster than synchronizing with the local server. The first time took longer, of course, as the Google gsutil package had to transfer every file to the cloud. But after that first transfer, successive synchronizations have been much faster.

Google cloud services aren't free. But for pennies a month (literally), I can have backups that are accessible anywhere I have a network connection.

Saturday, October 31, 2015

New old equipment: Apple PowerBook G4

A friend provided me with an old Apple PowerBook G4. It's been sitting on the shelf for a while, staring at me. I decided to do something about it.

Examination showed that the unit was in good shape, except the keyboard. No scratches or dents, or other signs of abuse. But the keyboard was not firmly attached to the underlying layer -- the top two rows of keys had rolled away and were floating in air. This position is not untenable, but it does mean that the key hinges would come loose. More troubling are the missing keys for F1, F2, and F12.

Super-glue helped get the keyboard back in line, although the result is not perfect. The edges are still up a bit, so the ESC and F11, F12, and EJECT keys are floating up. And F1, F2, and F12 are still missing. But it's workable.

The unit needed a power adapter, so that was the next order of business. It had to be -- I could not test anything else without it. So off to the good folks at! They provided me with a genuine Apple PowerBook adapter with the proper plug.

Testing showed that it would not boot from the hard drive. I could boot from a CD-ROM, and I used Ubuntu 12.10 for the PowerPC. Ubuntu came up and found the video card and the network adapter, but it could not install on the hard drive.

On the assumption that the drive had failed, I ordered a replacement. Finding drives for a PowerBook G4 is a little tricky, as it uses IDE (or PATA) drives, not the SATA drives in today's laptops. Persistence paid off, and I found a nice little 40 GB drive for $20.

Installation of the drive was challenging but not impossible. Opening the G4 requires removing quite a few small screws (about 20). All but two are Phillips head; the other two require a hex key. Carefully lifting the keyboard tray reveals the innards.

Removing the old drive was simple, once I recognized that a small bracket held the drive in place. Remove two screws, lift out the old drive, unscrew the four mounting screws, screw them into the new drive, attach the cable, and insert the new drive into position.

Then replace the keyboard tray and its 20-odd screws.

Now for an operating system. I could install Mac OS X 10.4 (Tiger). I have a copy, and it is the latest version of Mac OS X that supported the PowerBook series. But Apple ceased supporting 10.4 quite some time ago, and the old software and Safari provided a poor experience. Also, modern software for Mac OS X doesn't want to install on 10.4. So I needed something else, and I chose Ubuntu.

The PowerBook G4 uses a PowerPC CPU, not an Intel chip. Therefore, the standard (Intel) issue of Ubuntu won't do. I needed the PowerPC version, which is available. Ubuntu PowerPC 12.10 is the last version that has an install that fits on a CD, so that was the one I selected.

Ubuntu installed without difficulty, but needed some assistance. It found drivers for video, Ethernet, and touchpad, but not for wifi. Those had to be installed from the legacy Broadcom package.

There is a later version of Ubuntu (14.10 LTS) but it cannot talk to the video card. (I tried the upgrade and it failed, leaving Ubuntu in an inconsistent state. I re-installed 12.10 and left it there.)

The experience, so far, is tolerable but not excellent. The Apple PowerBook G4 has a nice 15" screen but is heavy, and it runs a bit warm. The screen is bright and clear but Ubuntu doesn't recognize the brightness controls (F9 and F10, if the keys are to be believed).

Ubuntu 12.10 for PowerPC has Firefox (version 39) but not Chrome.

Performance is sluggish. Web pages load, but I guess I've been spoiled by newer computers. Yet it's a nice machine, one I'd like to use more.

Saturday, October 17, 2015

Projects and technologies

Credit subsidy calculator: Develop a replacement for the existing C++ Windows desktop application. The new version will be written in C#/.NET and will run as a web service. There is a small command-line executable that will be distributed to agencies for them to use the web service.

We expect a release of the old (C++) version in September and again in November. We may, depending on progress, see a test release of the new (C#/.NET) version in November.

Interest model and Sensitivity model: Improve and modernize the code of these two interrelated applications, preparing for conversion from C++ to C#/.NET. These may remain Windows desktop applications; unlike the Credit subsidy calculator, the Interest model and Sensitivity model are used by a very limited number of people.

The models are used prior to the preparation of the budget and the preparation of the mid-session review. There are a number of changes for the Sensitivity model's calculations that must be made in September.

We have a number of technical issues with these projects (mostly with the new version of the Credit subsidy calculator). The issues appear to be resolvable with the right amount of time, research, and testing.

I have upgraded my home PC (which I use for remote access) to Windows 10. I have successfully connected to the office with Windows 10 and the new Edge browser. Performance is about the same as with Windows 8 and Internet Explorer.

Sunday, June 28, 2015

Building a BASIC interpreter in Ruby

I've been working on a side project: Build a BASIC interpreter in Ruby. The purpose is to help me learn the Ruby language, and to that end the project has worked well.

Learning a new programming language can be difficult. It's easy enough to write a simple "Hello, world!" program, but that simply confirms that the compiler (or interpreter) is installed correctly. What does one do next?

I picked the BASIC interpreter as a task with some complexity, but not too much. The BASIC language gives me a challenge but not one that is insurmountable. Also, I have an early text on programming in BASIC that provides example programs with their expected outputs.

As a bonus, programming a BASIC interpreter is a stroll down memory lane. BASIC was the first programming language that I learned. It is an old friend, one I have not seen in quite some time.

So as a project, the BASIC interpreter is challenging, supported, and fun.

I spell BASIC in all capitals because of the variant I am implementing. The BASIC language had a number of variants over time, from the early Dartmouth implementation in the 1960s to the DEC versions of the 1970s and 1980s, culminating in Visual Basic 6 from Microsoft. (Or perhaps VB.NET, but that seems less BASIC than any of the variants.)

My project is to implement an early version, one that is close to Dartmouth BASIC. It has a simplicity about it, yet it also has its intricacies. Dartmouth BASIC allows one to specify user-defined functions, but only on one line and only with a very limited set of names ('FNA' through 'FNZ'). It supports some elements of structured programming but still allows GOTO statements, and one can 'GOTO' from the inside of a loop to the outside, do some work, and 'GOTO' back into the loop. (One can also 'GOTO' out of a loop and not return into the loop.)
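The GOTO behavior described above falls out naturally when a program is stored as a set of numbered lines and loop state is kept apart from the position in the program. Here is a minimal sketch of that idea, written in Python rather than Ruby. It is not the interpreter from this project, and it handles only a tiny subset (LET, PRINT, GOTO, FOR/NEXT, END). The function names `run` and `evaluate` are made up, and Python's `eval` stands in for a real expression parser, which is an assumption of convenience and not a recommendation.

```python
import re


def evaluate(expr: str, variables: dict):
    """Evaluate an arithmetic expression (stand-in for a real parser)."""
    return eval(expr.strip(), {"__builtins__": {}}, dict(variables))


def run(source: str, max_steps: int = 10_000) -> list:
    """Run a tiny line-numbered BASIC subset; return the PRINT values."""
    program = {}
    for raw in source.strip().splitlines():
        number, statement = raw.strip().split(maxsplit=1)
        program[int(number)] = statement
    order = sorted(program)                      # execution order
    index_of = {n: i for i, n in enumerate(order)}

    variables, loops, output = {}, {}, []
    pc = 0                                       # index into `order`
    for _ in range(max_steps):                   # guard against endless GOTOs
        if pc >= len(order):
            break
        keyword, _, rest = program[order[pc]].partition(" ")
        pc += 1
        if keyword == "LET":
            name, _, expr = rest.partition("=")
            variables[name.strip()] = evaluate(expr, variables)
        elif keyword == "PRINT":
            output.append(evaluate(rest, variables))
        elif keyword == "GOTO":
            # A jump is just a new position; it neither knows nor cares
            # whether it crosses a loop boundary.
            pc = index_of[int(rest)]
        elif keyword == "FOR":
            m = re.fullmatch(r"(\w+)\s*=\s*(.+?)\s+TO\s+(.+)", rest.strip())
            name, start, limit = m.groups()
            variables[name] = evaluate(start, variables)
            # Remember the limit and where the loop body begins.
            loops[name] = (evaluate(limit, variables), pc)
        elif keyword == "NEXT":
            name = rest.strip()
            limit, body_start = loops[name]
            variables[name] += 1
            if variables[name] <= limit:
                pc = body_start
        elif keyword == "END":
            break
        else:
            raise ValueError(f"unsupported statement: {keyword}")
    return output


PROGRAM = """
10 FOR I = 1 TO 3
20 GOTO 100
30 NEXT I
40 END
100 PRINT I * 10
110 GOTO 30
"""

if __name__ == "__main__":
    print(run(PROGRAM))
```

The sample program jumps out of the loop at line 20, does its work at line 100, and jumps back in at line 30. Because the loop's limit and starting point live in a table keyed by the loop variable, the NEXT statement still finds them after the round trip.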

Early version or late, my experience has been a good one. I have been forced to learn the Ruby language. While I am not an expert, I am at least comfortable with the major constructs and classes of the language. And that was the point.