Wednesday, July 14, 2010

Plugging the leak and talking to strangers

I tracked down a memory overwrite issue today. These are still difficult problems to solve. For this one, I was lucky: the problem was repeatable, and the code had diagnostic routines that would trip the Microsoft run-time checks.

Solving a problem in software consists of the following steps:

1) Make the problem repeatable
2) Identify the point of failure
3) Understand the problem
4) Devise solutions
5) Pick a solution and implement it
6) Run tests to verify that you did not break something else

With step one a "freebie" and the diagnostic routines in place, identifying the point of failure was easy. I found it in less than an hour, pinpointing the exact line of code.
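For readers who have not met this kind of diagnostic routine, here is a minimal sketch of the general idea, simulated in Python with a bytearray. It is not the code from this project, and it is not how the Microsoft run-time checks are implemented; the names (make_block, write, guards_intact) and the guard value are made up for illustration. The payload sits between two runs of known "guard" bytes, and the check reports whether anything has scribbled on them.

```python
GUARD = 0xFD      # fill value chosen for this sketch (an assumption)
GUARD_LEN = 4     # number of guard bytes on each side of the payload


def make_block(size):
    """Return a buffer laid out as [guard | payload | guard]."""
    return bytearray([GUARD] * GUARD_LEN + [0] * size + [GUARD] * GUARD_LEN)


def write(block, offset, data):
    """Copy data into the payload with no bounds check, like a careless memcpy."""
    for i, byte in enumerate(data):
        block[GUARD_LEN + offset + i] = byte


def guards_intact(block):
    """Diagnostic routine: True only if both guard regions are untouched."""
    expected = bytes([GUARD] * GUARD_LEN)
    return (bytes(block[:GUARD_LEN]) == expected
            and bytes(block[-GUARD_LEN:]) == expected)


block = make_block(8)

write(block, 0, b"12345678")      # exactly 8 bytes: fits the payload
ok_before = guards_intact(block)  # guards still hold their fill value

write(block, 0, b"123456789")     # 9 bytes: the last one lands on a guard
ok_after = guards_intact(block)   # the check now reports the overwrite
```

Calling a check like guards_intact after each suspect operation narrows the search: the first call that fails brackets the statement that did the damage, which is roughly how a repeatable overwrite gets pinned to one line.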

Understanding the problem took more work and, in fact, required the experience of a colleague who has worked on the software for a while. Together we built an understanding, with me saying "this line of code does bad things" and him saying "that line of code should not be run, nor should its function be called (with this particular data)." With that information, I could work my way up the call stack to find the true problem.

It is a pretty good feeling, solving this kind of problem.

* * * *

On the homeward commute, I got out the "Starting FORTH" book, and the guy across the aisle on the train commented on the book. We talked for the entire trip home, discussing programming, technology, project management, and society in general.

I hadn't thought of Leo Brodie's book as a conversation-starter, but it was.
