The Daily Parker

Politics, Weather, Photography, and the Dog

Putting a bow on it

We're just 45 minutes from releasing a software project to our client for user acceptance testing (UAT), and we're ready. (Of course, there are those 38 "known issues..." But that's what the UAT period is for!)

When I get back from the launch meeting, I'll want to check these out:

Off to the client. Then...bug fixes!

Performance improvement; or, how one line of code can change your life

I'm in the home stretch moving Weather Now to Azure. I've finished the data model, data retrieval code, integration with the existing UI, and the code that parses incoming weather data from NOAA, so now I'm working on inserting that data into the database.

To speed up development, improve the design, and generally make my life easier, I'm using Entity Framework 5.0 with database-first modeling. The problem that consumed me yesterday afternoon and on into this morning has been how to ramp up to realistic volumes of data.

The Worker Role that will go out to NOAA and put weather data where Weather Now can use it will receive somewhere around 60,000 weather reports every hour. Often, NOAA repeats reports; sometimes, NOAA sends truncated copies of reports; sometimes, NOAA sends garbled reports. The GetWeather application (soon to be Azure worker task) has to handle all of that and still function in bursts of up to 10,000 weather reports at once.
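The real parsing rules live in GetWeather, but the de-duplication requirement can be sketched like this (Report and its fields are hypothetical stand-ins for my actual types):

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical, simplified shape of a parsed METAR report.
class Report
{
	public string StationId;
	public DateTime ObservationTime;
	public string RawText;
}

static class ReportFilter
{
	// Keep one report per station: the most recent, preferring the longest
	// raw text when timestamps tie (truncated copies are shorter).
	public static List<Report> Deduplicate(IEnumerable<Report> reports)
	{
		return reports
			.GroupBy(r => r.StationId)
			.Select(g => g
				.OrderByDescending(r => r.ObservationTime)
				.ThenByDescending(r => r.RawText.Length)
				.First())
			.ToList();
	}
}
```

Garbled reports get rejected earlier, by the parser itself; this step only has to pick a winner among the survivors.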

The WeatherStore class takes parsed METARs and stores them in the CurrentObservations, PastObservations, and ClimateObservations tables, as appropriate. As I've developed the class, I've written unit tests for each kind of thing it has to do: "Store single report," "Store many reports" (which tests batching them up and inserting them in smaller chunks), "Store duplicate reports," etc. Then yesterday afternoon I wrote an integration test called "Store real-life NOAA file" that took the 600 KB, 25,000-line, 6,077-METAR update NOAA published at 2013-01-01 00:00 UTC, and stuffed it in the database.

Sucker took 900 seconds—15 minutes. In real life, that would mean a complete collapse of the application, because new files come in about every 4 minutes, each containing similar thousands of lines to parse.

This morning, I attached JetBrains dotTrace to the unit test (easy to do since JetBrains ReSharper was running the test), and discovered that 90% of the method's time was spent in—wait for it—DbContext.SaveChanges(). As I dug through the line-by-line tracing, it was obvious Entity Framework was the problem.

I'll save you the steps to figure it out, except to say Stack Overflow is the best thing to happen to software development since the keyboard.

Here's the solution:

using (var db = new AppDataContext())
{
	db.Configuration.AutoDetectChangesEnabled = false;

	// do interesting work
}


The result: The unit test duration went from 900 seconds to...15. And that is completely acceptable. Total time spent on this performance improvement: 1.25 hours.
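Incidentally, the chunking that the "Store many reports" test exercises is nothing exotic. Stripped of the Entity Framework plumbing, it's a helper along these lines (a sketch; Batch is my name for it here, not part of the real WeatherStore):

```csharp
using System.Collections.Generic;
using System.Linq;

static class BatchExtensions
{
	// Split a sequence into chunks of at most batchSize items, so each
	// chunk can go to the database in its own SaveChanges() call.
	public static IEnumerable<List<T>> Batch<T>(this IEnumerable<T> source, int batchSize)
	{
		var batch = new List<T>(batchSize);
		foreach (var item in source)
		{
			batch.Add(item);
			if (batch.Count == batchSize)
			{
				yield return batch;
				batch = new List<T>(batchSize);
			}
		}
		if (batch.Count > 0)
		{
			yield return batch;
		}
	}
}
```

With that, a 6,077-METAR file in chunks of 1,000 becomes seven modest inserts instead of one enormous one.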

Chaining LINQ predicates

I've spent a good bit of free time lately working on migrating Weather Now to Azure. Part of this includes rewriting its Gazetteer, or catalog of places that it uses to find weather stations for users. For this version I'm using Entity Framework 5.0, which in turn allows me to use LINQ extensively.

I always try to avoid duplicating code, and I always try to write sufficient unit tests to prevent (and fix) any coding errors I make. (I also use ReSharper and Visual Studio Code Analysis to keep me honest.)

There are two methods in the Gazetteer's PlaceFinder class that search for places by distance. The prototypes are:

public static IEnumerable<PlaceDistance> FindNearby(ILocatable center, Length radius)


public static IEnumerable<PlaceDistance> FindNearby(ILocatable center, Length radius, Expression<Func<Place, bool>> predicate)

But in order for the first method to work, it has to create a predicate of its own to draw a box around the center location. (The ILocatable interface requires Latitude and Longitude. Length is a class in the Inner Drive Extensible Architecture representing a measurable two-dimensional distance.) So in order for the second method to work, it has to chain predicates.
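Chaining two expression trees isn't as simple as writing && between them, because each lambda has its own parameter. The trick is to rebind one lambda's parameter to the other's. A minimal sketch of the idea (the Albaharis' PredicateBuilder does this more generally):

```csharp
using System;
using System.Linq.Expressions;

static class PredicateExtensions
{
	// Combine two predicates with a logical AND by rewriting the second
	// expression's parameter to reference the first's.
	public static Expression<Func<T, bool>> And<T>(
		this Expression<Func<T, bool>> first,
		Expression<Func<T, bool>> second)
	{
		var parameter = first.Parameters[0];
		var secondBody = new ParameterReplacer(second.Parameters[0], parameter)
			.Visit(second.Body);
		return Expression.Lambda<Func<T, bool>>(
			Expression.AndAlso(first.Body, secondBody), parameter);
	}

	private class ParameterReplacer : ExpressionVisitor
	{
		private readonly ParameterExpression _from;
		private readonly ParameterExpression _to;

		public ParameterReplacer(ParameterExpression from, ParameterExpression to)
		{
			_from = from;
			_to = to;
		}

		protected override Expression VisitParameter(ParameterExpression node)
		{
			return node == _from ? _to : base.VisitParameter(node);
		}
	}
}
```

Because the result is still an Expression<Func<T, bool>> rather than a compiled delegate, Entity Framework can translate it to SQL.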

Fortunately, I found Joe and Ben Albahari's library of LINQ extensions. Here's the second method:

public static IEnumerable<PlaceDistance> FindNearby(
	ILocatable center,
	Length radius,
	Expression<Func<Place, bool>> predicate)
{
	var searchPredicate =
		SearchDistancePredicate(center, radius)
		.And(predicate);

	var places = Find(searchPredicate);

	return SearchDistanceResults(places, center, radius);
}
This allows me to use a single Find method that takes a predicate, engages a retry policy, and returns exactly what I'm looking for. And it allows me to do this, which just blows my mind:

var results = PlaceFinder.FindNearby(TestNode, TestRadius, p => p.Feature.Name == "airport");

Compared with the way Weather Now works under the hood right now, and how much coding the existing code took to achieve the same results, I'm just stunned. And it will make migrating Weather Now a lot easier.

Upgrading to Azure Storage Client 2.0

Oh, Azure Storage team, why did you break everything?

I love upgrades. I really do. So when Microsoft released the new version of the Windows Azure SDK (October 2012, v1.8) along with a full upgrade of the Storage Client (to 2.0), I found a little side project to upgrade, and went straight to the NuGet Package Manager for my prize.

I should say that part of my interest came from wanting to use some of the .NET 4.5 features, including the asynchronous helper methods, HTML 5, and native support for SQL 2012 spatial types, in the new version of Weather Now that I hope to complete before year's end. The Azure SDK 1.8 supports .NET 4.5; previous versions didn't. And the Azure SDK 1.8 includes a new version of the Azure Emulator which supports 4.5 as well.

To support the new, Azure-based version (and to support a bunch of other projects that I migrated to Azure), I have a class library of façades supporting Azure. Fortunately, this architecture encapsulated all of my Azure Storage calls. Unfortunately, the upgrade broke every other line of code in the library.

0. Many of the namespaces have changed. But of course, you use ReSharper, which makes the problem go away.

1. The CloudStorageAccount.FromConfigurationSetting() method is gone. Instead, you have to use CloudStorageAccount.Parse(). Here's the delta from TortoiseHg:

- _cloudStorageAccount = CloudStorageAccount.FromConfigurationSetting(storageSettingName);
+ var setting = CloudConfigurationManager.GetSetting(storageSettingName);
+ _cloudStorageAccount = CloudStorageAccount.Parse(setting);

2. BlobContainer.GetBlobReference() is gone, too. Instead of getting a generic blob reference back, you have to specify whether you want a page blob or a block blob. In this app, I only use block blobs, so the delta looks like this:

- var blob = _blobContainer.GetBlobReference(blobName);
+ var blob = _blobContainer.GetBlockBlobReference(blobName);

Note that BlobContainer also has a GetPageBlobReference() method. It also has a GetBlobReferenceFromServer() method, which throws a 404 error if the blob doesn't exist—making it useless for creating new blobs.

3. Blob.DeleteIfExists() works somewhat differently, too:

- var blob = _blobContainer.GetBlobReference(blobName);
- blob.DeleteIfExists(new BlobRequestOptions 
-	{ DeleteSnapshotsOption = DeleteSnapshotsOption.IncludeSnapshots });
+ var blob = _blobContainer.GetBlockBlobReference(blobName);
+ blob.DeleteIfExists();

4. Remember downloading text directly from a blob using Blob.DownloadText()? Yeah, that’s gone too. Blobs are all about streams now:

- var blob = _blobContainer.GetBlobReference(blobName);
- return blob.DownloadText();
+ using (var stream = new MemoryStream())
+ {
+ 	var blob = _blobContainer.GetBlockBlobReference(blobName);
+ 	blob.DownloadToStream(stream);
+ 	using (var reader = new StreamReader(stream, true))
+ 	{
+ 		stream.Position = 0;
+ 		return reader.ReadToEnd();
+ 	}
+ }

5. Because blobs are all stream-based now, you can't simply upload byte arrays to them, either. Here's the correction for the disappearance of Blob.UploadByteArray():

- var blob = _blobContainer.GetBlobReference(blobName);
- blob.UploadByteArray(value);
+ var blob = _blobContainer.GetBlockBlobReference(blobName);
+ using (var stream = new MemoryStream(value))
+ {
+ 	blob.UploadFromStream(stream);
+ }

6. Microsoft even helpfully corrected a spelling error which, yes, broke my code:

- _blobContainer.CreateIfNotExist();
+ _blobContainer.CreateIfNotExists();

Yes, if not existS. Notice the big red S, which is something I’d like to give the Azure team after this upgrade.*

7. We're not done yet. They fixed a "problem" with tables, too:

  var cloudTableClient = _cloudStorageAccount.CreateCloudTableClient();
- cloudTableClient.CreateTableIfNotExist(TableName);
- var context = cloudTableClient.GetDataServiceContext();
+ var table = cloudTableClient.GetTableReference(TableName);
+ table.CreateIfNotExists();
+ var context = cloudTableClient.GetTableServiceContext();

8. Finally, if you have used the CloudStorageAccount.SetConfigurationSettingPublisher() method, that’s gone too, but you don’t need it. Instead, use the CloudConfigurationManager.GetSetting() method directly. Instead of doing this:

CloudStorageAccount.SetConfigurationSettingPublisher(
	(configName, configSetter) =>
	{
		if (RoleEnvironment.IsAvailable)
		{
			configSetter(RoleEnvironment.GetConfigurationSettingValue(configName));
		}
		else
		{
			configSetter(ConfigurationManager.AppSettings[configName]);
		}
	});

You can simply do this:

var someSetting = CloudConfigurationManager.GetSetting(settingKey);

The CloudConfigurationManager.GetSetting() method first tries to get the setting from Azure, then from the ConfigurationManager (i.e., local settings).
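In other words, the lookup behaves roughly like this sketch (SettingLookup is a hypothetical helper; the real fallback logic lives inside CloudConfigurationManager):

```csharp
using System;

static class SettingLookup
{
	// Try each configuration source in order; the first non-null value wins.
	public static string GetSetting(string key, params Func<string, string>[] sources)
	{
		foreach (var source in sources)
		{
			var value = source(key);
			if (value != null)
			{
				return value;
			}
		}
		return null;
	}
}
```

You would call it with the role-environment lookup first and the app-settings lookup second, which is exactly the order the old SetConfigurationSettingPublisher boilerplate expressed by hand.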

I hope I have just saved you three hours of silently cursing Microsoft’s Azure Storage team.

* Apologies to Bill Cosby.

Starting the oldest item on my to-do list

I mentioned a few weeks ago that I've had some difficulty moving the last remaining web application in the Inner Drive Technology Worldwide Data Center, Weather Now, into Microsoft Windows Azure. Actually, I have two principal difficulties: first, I need to re-write almost all of it, to end its dependency on a Database of Unusual Size; and second, I need the time to do this.

Right now, the databases hold about 2 GB of geographic information and another 20 GB of archival weather data. Since these databases run on my own hardware, I don't have to pay for them beyond the server's electricity costs. In Azure, that amount of database space costs more than $70 per month, well above the $25 or so my database server costs me.

I've finally figured out the architecture changes needed to get the geographic and weather information into cheaper (or free) repositories. Some of the strategy involves not storing the information at all, and some will use the orders-of-magnitude-less-expensive Azure table storage. (In Azure storage, 25 GB costs $3 per month.)

Unfortunately for me, the data layer is about 80% of the application, including the automated processes that go out and get weather data. So, to solve this problem, I need a ground-up re-write.

The other problem: time. Last month, I worked 224 hours, which doesn't include commuting (24 hours), traveling (34 hours), or even walking Parker (14 hours). About my only downtime was during that 34 hours of traveling and while sitting in pubs in London and Cardiff.

I have to start doing this, though, because I'm spending way too much money running two servers that do very little. And I've been looking forward to it—it's not a chore, it's fun.

Not to mention, it means I get to start working on the oldest item on my to-do list, Case 46 ("Create new Gazetteer database design"), opened 30 August 2006, two days before I adopted Parker.

And so it begins.

I wish stuff just worked

Despite my enthusiasm for Microsoft Windows Azure, in some ways it suffers from the same problem all Microsoft version 1 products have: incomplete debugging tools.

I've spent the last three hours trying to add an SSL certificate to an existing Azure Web application. In previous attempts with different applications, this has taken me about 30 minutes, start to finish.

Right now, however, the site won't launch at all in my Azure emulator, presenting a generic "Internal server error - 500" when I try to start the application. The emulator isn't hitting any of my code, however, nor is it logging anything to the Windows System or Application logs. So I have no idea why it's failing.

I've checked the code into source control and built it on another machine, where it had exactly the same problem. So I know it's something under source control. I just don't know what.

I hate very little in this world, but lazy developers who fail to provide debugging information bring me near to violence. A simple error stack would probably lead me to the answer in seconds.

Update: The problem was in the web.config file.

Earlier, I copied a connection string element from a transformation file into the master web.config file, but I forgot to remove the transformation attributes xdt:Transform="Replace" and xdt:Locator="Match(name)". This prevented the IIS emulator from parsing the configuration file, which caused the 500 error.
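The offending element looked something like this (the connection string name and value here are illustrative):

```xml
<connectionStrings>
  <!-- These xdt: attributes belong in a transformation file
       (e.g., Web.Release.config), not in the master web.config -->
  <add name="StorageConnectionString"
       connectionString="UseDevelopmentStorage=true"
       xdt:Transform="Replace" xdt:Locator="Match(name)" />
</connectionStrings>
```

Since the master web.config never declares the xdt namespace, the attributes render the file unparseable.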

I must reiterate, however, that some lazy developer neglected to provide this simple piece of debugging information, and my afternoon was wasted as a result.

It reminds me of a scene in Terry Pratchett and Neil Gaiman's Good Omens (one of the funniest books ever written). Three demons are comparing notes on how they have worked corruption on the souls of men. The first two have each spent years tempting a priest and corrupting a politician. Crowley's turn:

"I tied up every portable telephone system in Central London for forty-five minutes at lunchtime," he said.

"Yes?" said Hastur. "And then what?"

"Look, it wasn't easy," said Crowley.

"That's all?" said Ligur.

"Look, people—"

"And exactly what has that done to secure souls for our master?" said Hastur.

Crowley pulled himself together.

What could he tell them? That twenty thousand people got bloody furious? That you could hear the arteries clanging shut all around the city? And that then they went back and took it out on their secretaries or traffic wardens or whatever, and they took it out on other people? In all kinds of vindictive little ways which, and here was the good bit, they thought up themselves. The pass-along effects were incalculable. Thousands and thousands of souls all got a faint patina of tarnish, and you hardly have to lift a finger.

Somehow, debugging the Azure emulator made me think of Crowley, who no doubt helped Microsoft write the thing.

W-8 a second...

After installing Windows 8 yesterday, I discovered some interaction problems with my main tool, Visual Studio 2012. Debugging Azure has suddenly become difficult. So after installing the OS upgrade, I spent about five hours re-installing or repairing a whole bunch of other apps, and I'm not even sure I found the causes of the problems.

The next step is to install new WiFi drivers. But seriously, I'm only a few troubleshooting steps from rebuilding the computer from scratch back on Windows 7.

Cue the cursing...

W-8, W-8!

This morning I installed Microsoft Windows 8 on my laptop. As a professional geek, getting software after it's released to manufacturing but before the general public is a favorite part of my job.

It took almost no effort to set up, and I figured out the interface in just a few minutes. I like the new look, especially the active content on the Start screen. It definitely has a more mobile-computing look than previous Windows versions, with larger click targets (optimized for touch screens) and tons of integration with Windows Accounts. I haven't linked much to my LiveID yet, as I don't really want to share that much with Microsoft, but I'll need it to use SkyDrive and to rate and review the new features.

I also did laundry, vacuumed, cleaned out all my old programming books (anyone want a copy of Inside C# 2 from 2002?), and will now go shopping. And I promise never to share that level of picayune personal detail again on this blog.

Why you should always "sleep on it"

If one of the developers on one of my teams had done this, I would have (a) told him to get some sleep and (b) mocked him for at least a week afterwards.

Saturday night I spent four hours trying to figure out why something that worked perfectly in my local Azure emulator failed with a cryptic "One of the request inputs is out of range" message in the Cloud. I even posted to StackOverflow for help.

This morning, I spent about 90 minutes building a sample Cloud application from scratch, adding one component at a time until, eventually, I reproduced the same failure. Then I stepped through the code to figure out what was going on.

And I immediately saw why.

The problem turned out to be this: I have two settings:

    <?xml version="1.0" encoding="utf-8"?>
    <ServiceDefinition name="Cloud" ...>
      <WebRole name="WebRole" vmsize="Small">
        <ConfigurationSettings>
          <Setting name="MessagesConfigurationBlobName" />
          <Setting name="MessagesConfigurationBlobContainerName" />
        </ConfigurationSettings>
      </WebRole>
    </ServiceDefinition>

Here's the local (emulator) configuration file:

    <?xml version="1.0" encoding="utf-8"?>
    <ServiceConfiguration ...>
      <Role name="WebRole">
        <ConfigurationSettings>
          <Setting name="MessagesConfigurationBlobName" value="LocalMessageConfig.xml" />
          <Setting name="MessagesConfigurationBlobContainerName" value="containername" />
        </ConfigurationSettings>
      </Role>
    </ServiceConfiguration>

Here's the Cloud file:

    <?xml version="1.0" encoding="utf-8"?>
    <ServiceConfiguration ...>
      <Role name="WebRole">
        <ConfigurationSettings>
          <Setting name="MessagesConfigurationBlobName" value="containername" />
          <Setting name="MessagesConfigurationBlobContainerName" value="CloudMessageConfig.xml" />
        </ConfigurationSettings>
      </Role>
    </ServiceConfiguration>

I will now have a good cry and adjust my time tracking (at 3am Saturday) from "Emergency client work" to "Developer PEBCAK".

The moral of the story is, when identical code fails in one environment and succeeds in another, don't just compare the environments, compare *everything that could be different in your own code* between the environments.

Oh, and don't try to deploy software at 3am. Ever.

Deployments are fun!

In every developer's life, there comes a time when he has to take all the software he's written on his laptop and put it into a testing environment. Microsoft Azure Tools make this really, really easy—every time after the first.

Today I did one of those first-time deployments, sending a client's Version 2 up into the cloud for the first time. And I discovered, as predicted, a flurry of minor differences between my development environment (on my own computer) and the testing environment (in an Azure web site). I found five bugs, all of them minor, and almost all of them requiring me to wipe out the test database and start over.

It's kind of like when you go to your strict Aunt Bertha's house—you know, the super-religious aunt who has no sense of humor and who smacks your hands with a ruler every time you say something harsher than "oops."

End of complaint. Back to the Story of D'Oh.