Bill Roper (billroper) wrote,

The Digging Continues

So you remember the locking fixes that I've been working on for our server since August or thereabouts? The ones that were "finished"?

Yeah, not quite. My changes got through five days of successful Uptime testing (in that the server did not crash), but then the tester clicked on the tab in our admin app that shows the transaction results and the server GPF'd. Sadly, we don't know where: no one was logged in on the server machine, so the just-in-time debugging didn't kick in. The logical assumption is that something was wrong in the code that would have returned the transaction results.

I cleaned that up, then spent some time looking at other code in the class that manages the transactions. Finding at least one locking error, I decided to rewrite it and bulletproof it, as much as is possible. That's almost done.

In the meantime, we weren't entirely happy with the memory profile that the version with the locking fixes was showing, so our QE team went back to run the Uptime test against the baseline version.

And the server crashed within a day. More than once, as they restarted it and tried again.

I feel somewhat vindicated. :)
Tags: musings, work