Bill Roper (billroper) wrote,

All Locked Up and No Place To Go

There was a reason that I rewrote the locking on our server app. In fact, there were several reasons that I rewrote the locking on our server app, although they all boiled down to the following:

  • Never let someone have access to an object that they haven't locked.
  • Keep it simple, stupid.

    Now the locking scheme that I implemented wasn't completely bulletproof -- you could, for instance, obtain a pointer to an object along with the matching lock and then unlock the object while retaining the cached pointer. I could have written a lot of code to prevent this.

    Or I could just look at the programmer sadly and say, "If you did this, I cannot help you."
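    As a rough illustration of the first rule (not the actual server code), here is a minimal C++17 sketch in which the only way to reach an object is through a handle that owns its lock. The names Locked, ReadHandle, WriteHandle and Document are hypothetical, invented for this example. Note that it has the same loophole described above: nothing stops a caller from caching the raw pointer and using it after the handle is gone.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <utility>

// Hypothetical wrapper: the protected value is private, so callers can
// only reach it through a handle that holds the lock for its lifetime.
template <typename T>
class Locked {
public:
    explicit Locked(T value) : value_(std::move(value)) {}

    // Owns an exclusive (write) lock for as long as it exists.
    class WriteHandle {
    public:
        T& operator*() const { return *ptr_; }
        T* operator->() const { return ptr_; }
    private:
        friend class Locked;
        WriteHandle(std::shared_mutex& m, T& v) : lock_(m), ptr_(&v) {}
        std::unique_lock<std::shared_mutex> lock_;
        T* ptr_;
    };

    // Owns a shared (read) lock for as long as it exists.
    class ReadHandle {
    public:
        const T& operator*() const { return *ptr_; }
        const T* operator->() const { return ptr_; }
    private:
        friend class Locked;
        ReadHandle(std::shared_mutex& m, const T& v) : lock_(m), ptr_(&v) {}
        std::shared_lock<std::shared_mutex> lock_;
        const T* ptr_;
    };

    WriteHandle write() { return WriteHandle(mutex_, value_); }
    ReadHandle read() const { return ReadHandle(mutex_, value_); }

private:
    mutable std::shared_mutex mutex_;
    T value_;
};

// Hypothetical stand-in for whatever the server actually protects.
struct Document {
    std::string title;
    int revision = 0;
};

int demo_locked_access() {
    Locked<Document> doc(Document{"budget", 1});
    {
        auto w = doc.write();       // exclusive lock taken here
        assert(w->revision == 1);
        w->revision += 1;
        // Document* cached = &*w;  // the loophole: nothing prevents this
    }                               // exclusive lock released here
    auto r = doc.read();            // shared lock for reading
    return r->revision;
}
```

    The design choice is that the lock and the pointer travel together in one object, so forgetting to lock is a compile error rather than a code-review finding. Deliberately stashing the pointer is still possible, and still your problem.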

    My locking scheme works well, but -- really unfortunately -- it hasn't quite managed to make it to the tip of the code line. I'm hoping that it will shortly escape into the wild. (Actually, it's available in a patch to an older version. But it was a 700+ file merge to bring all of those changes forward to the current release and that requires an unfortunate amount of testing before it should be set free.)

    One of the problems with the old locking scheme was that it was not simple at all. In our continuing efforts to "fix" it, we piled multiple read/write locks onto the same poor, overburdened object. There was the read/write lock. There was the cache lock, which was intended to keep documents from being purged from memory. There was the loading lock, indicating that we were trying to load a document into memory. And there was the horrid (and originally my idea) intent-to-write lock, which was really intended to manage long-lived reservations of a resource for a running batch process such as consolidation.

    Eep. So when I rewrote locking, I threw out all of the locks except two. I kept a simple read/write lock (which, after reading the literature some more, I am considering reducing to an even simpler write lock to see how it works). And I kept the intent-to-write lock, which was converted from a mutex to a check-out type lock, since we already used check-out to reserve resources for long periods of time.

    And after I bashed all of the code to fit, things worked pretty well.

    Unfortunately, today I am debugging some of the old locking code which seems to be giving us a problem in our consolidation. And I find that I am attempting to lock a mutex that controls a resource that has just been deleted out from under me somehow between the time that I got the pointer to the mutex and the time that I actually tried to take the lock.
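    For anyone who hasn't hit this particular race, here is a small C++ sketch of one common way to close it. This is an assumption-laden illustration, not what our code does: the Resource and Registry types are hypothetical, and the idea is simply that the mutex lives inside a reference-counted object, so whoever looked it up keeps it alive until they are done with the lock.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical resource that carries its own mutex.
struct Resource {
    std::mutex mutex;
    bool removed = false;  // set under `mutex` once the registry drops it
    std::string data;
};

// Hypothetical registry that hands out shared ownership, never raw pointers.
class Registry {
public:
    void add(int id, std::string data) {
        auto r = std::make_shared<Resource>();
        r->data = std::move(data);
        std::lock_guard<std::mutex> g(registry_mutex_);
        resources_[id] = std::move(r);
    }

    // Returns an owning pointer, or nullptr if the id is unknown.
    std::shared_ptr<Resource> find(int id) {
        std::lock_guard<std::mutex> g(registry_mutex_);
        auto it = resources_.find(id);
        if (it == resources_.end()) return nullptr;
        return it->second;
    }

    void remove(int id) {
        std::shared_ptr<Resource> victim;
        {
            std::lock_guard<std::mutex> g(registry_mutex_);
            auto it = resources_.find(id);
            if (it == resources_.end()) return;
            victim = std::move(it->second);
            resources_.erase(it);
        }
        // Mark it dead under its own lock so late lockers can notice.
        std::lock_guard<std::mutex> g(victim->mutex);
        victim->removed = true;
    }

private:
    std::mutex registry_mutex_;
    std::map<int, std::shared_ptr<Resource>> resources_;
};

bool demo_late_lock() {
    Registry reg;
    reg.add(7, "consolidation input");
    std::shared_ptr<Resource> r = reg.find(7);  // got the pointer...
    assert(r != nullptr);
    reg.remove(7);                              // ...resource removed meanwhile
    std::lock_guard<std::mutex> g(r->mutex);    // mutex still exists: r owns it
    return r->removed;                          // caller must check and back off
}
```

    The late locker no longer touches freed memory. It locks a still-valid mutex, sees the removed flag, and can give up gracefully.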

    I could probably fix this.

    All I have to do is rewrite the locking scheme.

    *thud* *thud* *thud*
  • Tags: musings, work
