February 12th, 2013

Mother, May I?

So I have been condemned to support a testing web page that someone else wrote long ago and that we have done a lousy job of maintaining. One of the things that the testing web page does is to check out a file from our file server, write it to a local directory, then check the file back in, and finally delete it from the local directory.

It stopped working with this release. The file was checked out from the file server, but not written to the local directory, so it could never be checked back in.

Oh, good. (I originally typed "Oh, god", which was perhaps a pretty accurate Freudian typing slip.)

So first, I discovered that I hadn't upgraded the solution that built the web page from .NET 2.0 (which we had used for previous releases) to .NET 4.0 (which we use for the current release and which I'd already upgraded the other solutions to). I blame it on the fact that the solution was named "OldWebServices.sln", leading me to believe it was obsolete. And except for this one testing web page, I was right.

Ok. Upgrade all of the projects in the solution to use .NET 4.0. And it ran under the debugger for me successfully, checked out the file, and checked it back in. Check in the changes and have it built for the testers.

And it failed in exactly the same way it did before.

I'm running Windows Server 2003 on my box, so with some help from the testers, we got IIS configured on my box. I ran the "Publish Web Site" command, hooked the published web site up to IIS, and I successfully checked out the file and checked it back in. I checked in the changes and had it built for the testers.

And it failed in exactly the same way it did before.

The problem is that the build process doesn't run "Publish Web Site". But there are Website Deployment Projects. Except I can't read in the .wdproj files that specify them.

For this version of the software, we had upgraded to Visual Studio 2010. By default, VS2010 doesn't support Website Deployment Projects. You have to download and install a special plug in.

I did. And I added the old .wdproj file back into the solution and tried to build the deployment project for the testing web page.

It failed to build, because the old .wdproj file was for .NET 2.0 and we had upgraded to .NET 4.0.

As nearly as I can tell, there is no way to upgrade the .wdproj file to work with .NET 4.0. So I deleted it and recreated it. And I built the solution, the Website Deployment Project built, I hooked up the result to IIS, and it ran perfectly.

And I thought I checked everything in. Except ClearCase didn't tell me that the .wdproj file needed to be checked in, because there was a Discordance. So although I updated the build process files to use the results of building the Website Deployment Project I'd created, they didn't build it, because the changes hadn't been checked in due to the Discordance.

When I found the Discordance, I tried to fix it with ClearCase. ClearCase promptly ate all of my changes. And so I made all of the changes again, checked in the file, and asked to have the build updated with the results.

And it failed on the test machine in exactly the same way it did before.

I suggested that the problem had to be in either the build process or the configuration of the test machine, because there clearly wasn't a code problem since the bloody thing ran on my machine. And the head of our testing group informed me that he guaranteed that it wasn't an install/config problem on the test environment.

At this point, my boss handed me a different VM with source code for this mess on it. It was a 64-bit version of Windows Server 2008 R2, not my 32-bit version of Windows Server 2003, but, hey! Why not?

And I built the whole thing out there as a 64-bit app and it failed the same way that it failed on the test machine.

Ok, at least I could debug the mess now that I'd built out the app properly. (I'd tried debugging it earlier and it couldn't be done, because everything was mismatched between the source code and the version that had been installed from the build machine.)

Trace, trace, trace. Ah, there's the code that creates the file. And it returns an error code. Plug that into the error lookup routine.

"Access denied."

That's odd. Check the directory where I'm trying to write the file. Yes, I can create a file there. Yes, I can delete a file there. Hmm.

So I started doing some more reading about IIS, since I know next to nothing about it.

And it turns out that when you're running an ASP.NET web page, you run it as a user specified by IIS. Under IIS 6.0 -- which would be the version that is on my machine -- the user is the NETWORK SERVICE user. He is allowed to do many things, including creating and deleting files in most directories, including the one where I was trying to write the file.

In later versions of IIS -- which would be the version on the test machine where things have been failing consistently -- the user is a member of the IIS_IUSRS group. The IIS_IUSRS group is not allowed to run around reading and writing files in directories willy-nilly, which is indeed a likely improvement to security.
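This is also why my manual check of the directory proved nothing: I created and deleted files as my own login, not as the account the web page runs under. A check only means something if it runs inside the process whose identity is in question. Here is a minimal sketch of such a probe, in Python rather than the .NET code the page actually uses; the function name `probe_write_access` is made up for illustration.

```python
import getpass
import os
import tempfile


def probe_write_access(directory):
    """Try to create and delete a scratch file in `directory`.

    Returns (account, ok, detail), where `account` is the name of the
    account this process is running as, as far as Python can tell.
    Hypothetical helper, not part of the testing web page.
    """
    try:
        account = getpass.getuser()
    except Exception:
        # getuser() can fail when no user name can be determined.
        account = "<unknown>"

    try:
        # mkstemp creates the file for real, so it exercises the same
        # permission that the checkout needs.
        fd, path = tempfile.mkstemp(dir=directory, prefix="probe_")
        os.close(fd)
        os.remove(path)
    except OSError as exc:
        return account, False, f"{type(exc).__name__}: {exc}"

    return account, True, "create and delete succeeded"
```

Run from an interactive login, this reports on that login's rights. Only when it is called from inside the hosted web application does it report on the account that IIS assigned.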

So when we check out a file from our file server and try to write it to that local directory, we get back an "Access denied" result. Whoever originally wrote the code for our Gateway DLL managed to do it in such a way that the error never got reported back anywhere that I could find, so there was no message, just a silent and deadly failure.
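For contrast, here is roughly what I would have liked the write step to do: turn the failure into an error that names the path and the reason, so it cannot vanish on the way up. This is a Python sketch under my own assumptions, not the Gateway DLL's code; `LocalWriteError` and `write_local_copy` are hypothetical names.

```python
import os


class LocalWriteError(Exception):
    """Raised when the checked-out file cannot be written locally.

    Hypothetical exception type, used only for this illustration.
    """


def write_local_copy(directory, name, data):
    """Write `data` (bytes) to directory/name and return the full path.

    Any operating-system failure, including "access denied", is
    re-raised with the target path attached instead of being dropped.
    """
    path = os.path.join(directory, name)
    try:
        with open(path, "wb") as handle:
            handle.write(data)
    except OSError as exc:
        reason = exc.strerror or str(exc)
        raise LocalWriteError(f"could not write {path!r}: {reason}") from exc
    return path
```

With something like this in place, the caller either gets a path back or gets an exception that says which file and why, and a denied write can be logged or shown on the page.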

*grumble*

On my boss's VM, I gave the IIS_IUSRS group "Full Control" permission on the directory where we were trying to write the file.

And the file checked out and wrote to the directory successfully.

I believe that I have now proven that we have a configuration error on the test machine. There may still be other errors, but I can pretty much guarantee that there's a configuration error.

I think I'll go watch Castle with Gretchen now...