The Daily Parker

Politics, Weather, Photography, and the Dog

Performance improvement; or, how one line of code can change your life

I'm in the home stretch moving Weather Now to Azure. I've finished the data model, data retrieval code, integration with the existing UI, and the code that parses incoming weather data from NOAA, so now I'm working on inserting that data into the database.

To speed up development, improve the design, and generally make my life easier, I'm using Entity Framework 5.0 with database-first modeling. The problem that consumed me yesterday afternoon and on into this morning has been how to ramp up to realistic volumes of data.

The Worker Role that will go out to NOAA and put weather data where Weather Now can use it will receive somewhere around 60,000 weather reports every hour. NOAA often repeats reports; sometimes it sends truncated copies; sometimes it sends garbled ones. The GetWeather application (soon to be an Azure worker task) has to handle all of that and still function in bursts of up to 10,000 weather reports at once.

The WeatherStore class takes parsed METARs and stores them in the CurrentObservations, PastObservations, and ClimateObservations tables, as appropriate. As I've developed the class, I've written unit tests for each kind of thing it has to do: "Store single report," "Store many reports" (which tests batching them up and inserting them in smaller chunks), "Store duplicate reports," etc. Then yesterday afternoon I wrote an integration test called "Store real-life NOAA file" that took the 600 KB, 25,000-line, 6,077-METAR update NOAA published at 2013-01-01 00:00 UTC, and stuffed it in the database.

Sucker took 900 seconds—15 minutes. In real life, that would mean a complete collapse of the application, because new files come in about every 4 minutes and each contains thousands of lines to parse.

This morning, I attached JetBrains dotTrace to the unit test (easy to do since JetBrains ReSharper was running the test), and discovered that 90% of the method's time was spent in—wait for it—DbContext.SaveChanges(). As I dug through the line-by-line tracing, it was obvious Entity Framework was the problem.

I'll save you the steps to figure it out, except to say Stack Overflow is the best thing to happen to software development since the keyboard.

Here's the solution:

using (var db = new AppDataContext())
{
    // Turning off automatic change detection keeps EF from rescanning every
    // tracked entity on each Add, which is what made the bulk insert quadratic.
    db.Configuration.AutoDetectChangesEnabled = false;

    // do interesting work

    db.SaveChanges();
}

The result: The unit test duration went from 900 seconds to...15. And that is completely acceptable. Total time spent on this performance improvement: 1.25 hours.
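Incidentally, if a single SaveChanges call ever becomes the bottleneck at even larger volumes, a common follow-up to this fix is to save in chunks and recreate the context periodically so the change tracker never grows too large. Here's a rough sketch of that pattern; the Observation entity and Observations set are placeholder names, not the actual Weather Now model:

// A rough sketch, not the actual Weather Now code. "Observation" and
// db.Observations stand in for whichever entity you're storing.
static void StoreInBatches(IEnumerable<Observation> observations)
{
    AppDataContext db = null;
    try
    {
        db = new AppDataContext();
        db.Configuration.AutoDetectChangesEnabled = false;

        var count = 0;
        foreach (var observation in observations)
        {
            db.Observations.Add(observation);

            // Flush every 1,000 entities and start over with a fresh context
            // so the change tracker stays small.
            if (++count % 1000 == 0)
            {
                db.SaveChanges();
                db.Dispose();
                db = new AppDataContext();
                db.Configuration.AutoDetectChangesEnabled = false;
            }
        }

        db.SaveChanges();
    }
    finally
    {
        if (db != null) db.Dispose();
    }
}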

Another reason to finish moving to Azure

As I've noted before, only one Web application still lives in my living room, in the Inner Drive Technology Worldwide Data Center: Weather Now. In the last few days, it has shown one more good reason why it needs to get to Windows Azure pronto.

Take a look at my Google Analytics view of incoming visitors:

What is going on? How do I go from 300 daily unique visitors to 1,800 in two days? Take a look at where they're coming from:

Yes, that's right. Close to 40% of Weather Now's traffic came from the Yukon Territory yesterday. And another 40% came from Alaska. And they're all going to this page for some reason. This might be why:

So how does Azure enter into it? Simply put: if you have a Web application running on your own server and you get a sixfold increase in traffic, your server may not be able to handle it. Or, worse in a way, you might have been running a server capable of handling the peak load all the time, at great expense in electricity and hardware.

With Azure, you can simply bring another instance online, increase the size of your running instance, or do any number of other things to adapt quickly to the increased load, without having to buy or move any hardware. Then, when the load returns to normal, you can spin down the idle capacity. Best of all, you only pay for the capacity you actually use.

I'm getting a lot closer to moving Weather Now, but a deadline looming at my paying job tomorrow has my attention at the moment. So more on this stuff later. Meanwhile, if you're in the Yukon or in central Alaska, stay warm, folks!

Debugging our first Azure 1.8 deployment

I've just spent three hours debugging something caused by a single missing line in a configuration file.

At 10th Magnitude, we've recently upgraded our framework and reference applications to the latest Windows Azure SDK. Since I'd already done it once, it didn't take too desperately long to create the new versions of our stuff.

However, the fact that something works in an emulator does not mean it will actually work in production. So, last night, our CTO attempted to deploy the first application we built with the new stuff out to Azure. It failed.

First, all we got was an HttpException, which is what ASP.NET MVC throws when something fails in a Razor view. The offending line was this:

@{ 
   ViewBag.Title = Html.Resource("PageTitle");
}

This extension method indirectly calls our custom resource provider, cleverly obfuscated as SqlResourceProvider, which then looks up the string resource in a SQL database.
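For context, here's a minimal sketch of what an Html.Resource() helper along those lines might look like. It's not our actual extension method (the real one goes through ASP.NET's resource-provider plumbing rather than constructing the provider directly), but it shows the shape of the call chain:

using System.Globalization;
using System.Web.Mvc;

public static class ResourceExtensions
{
    // Illustrative only: resolve the current view's virtual path and ask the
    // custom provider for the named string resource, falling back to the key.
    public static string Resource(this HtmlHelper html, string key)
    {
        var viewPath = ((RazorView)html.ViewContext.View).ViewPath;
        var provider = new SqlResourceProvider(viewPath);
        return provider.GetObject(key, CultureInfo.CurrentUICulture) as string ?? key;
    }
}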

My first problem was to get to the actual exception being thrown. That required me to RDP into the running Web role, open a view (I chose About.cshtml because it was essentially empty), and replace the code above with this:

@using System.Globalization
@{
  try
  {
    var provider = new SqlResourceProvider("/Views/Home/About.cshtml");
    var title = provider.GetObject("PageTitle", CultureInfo.CurrentUICulture);
    ViewBag.Title = title;
  }
  catch (Exception ex)
  {
    ViewBag.Error = ex + Environment.NewLine + "Base:" + Environment.NewLine + ex.GetBaseException();
  }
}
<pre>@ViewBag.Error</pre>

That got me the real error stack, whose relevant lines were right at the top:

System.IO.FileNotFoundException: Could not load file or assembly 'Microsoft.WindowsAzure.ServiceRuntime, Version=1.7.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35' or one of its dependencies. The system cannot find the file specified.
File name: 'Microsoft.WindowsAzure.ServiceRuntime, Version=1.7.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35'
at XM.UI.ResourceProviders.ResourceCache.LogDebug(String message)

Flash forward through an hour of reading and testing things; I'll spare you the details. The solution is to add a second binding redirect in web.config:

<dependentAssembly>
  <assemblyIdentity name="Microsoft.WindowsAzure.ServiceRuntime" 
    publicKeyToken="31bf3856ad364e35" culture="neutral" />
  <bindingRedirect oldVersion="0.0.0.0-1.0.0.0" newVersion="1.0.0.0" />
  <bindingRedirect oldVersion="1.1.0.0-1.8.0.0" newVersion="1.8.0.0" />
</dependentAssembly>

Notice the second bindingRedirect? It tells .NET to redirect all requests for the service runtime (versions 1.1 through 1.8) to the 1.8 assembly.

Also, in the Web application, you have to change the assembly references for Microsoft.WindowsAzure.Configuration and Microsoft.WindowsAzure.Storage so they don't demand specific versions. In Solution Explorer, under the web app's References folder, find the assemblies in question, open Properties, and set Specific Version to false.
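If you'd rather edit the project file directly, the same switch appears as a SpecificVersion element on each Reference. Something like this (the version and public key token shown are illustrative; use whatever your project actually references):

<Reference Include="Microsoft.WindowsAzure.Configuration, Version=1.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35, processorArchitecture=MSIL">
  <SpecificVersion>False</SpecificVersion>
</Reference>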

I hope I have saved you three hours of your life. I will now go back to my deployment, already in progress...

Update, an hour and a half later: It turns out, there's a difference in behavior between <compilation debug="true"> and <compilation> on Azure Guest OS 3 (Windows Server 2012) that did not exist in previous guest OS versions. When an application is in debug mode on Azure Guest OS 3, it ignores some errors. Specifically, it ignores the FileNotFoundException thrown when Bundle.JavaScript().Add() has the wrong version number for the script it's trying to add. In Release mode, it just barfs up a 500 response. That is maddening—especially when you're trying to debug something else. At least it let our app log the error, eventually.

Build, buy, or rent?

10th Magnitude's CTO, Steve Harshbarger, explains how the cloud makes economics better by giving us more options:

We know we could build every feature of a custom application from the ground up. We get ultimate control of the result, but often the cost or timeframe to do so is prohibitive. So as developers, we look to incorporate pre-built components to speed things along. Not only that, we strive for better functionality by incorporating specialized components that others have already invested far more resources in than we ever could for a single application. As a simple example, who would ever write a graphing engine from scratch with so many great ones out there? So, build is rarely the whole story.

What about buy? I think of “buy” not in a strict monetary sense, but as a moniker for code or components that get pulled into the physical boundary of your application. This includes both open source components and commercial products, in the form of source code you pull into your project or binaries you install and run with your application's infrastructure. We all do this all the time.

But the cloud brings a third option to the table: rent. I define this as a service you integrate with via some API, which runs outside your application’s physical boundary. This is where smart developers see an opportunity to shave more time and cost off of projects while maintaining—or even increasing—the quality of functionality.

He also lists our top-10 third-party "rental" services, including Postmark, Pingdom, and Arrow Payments. (I'm using a couple of them as well.)

All done with the code reorg

Well, that was fun. I've just spent the last three days organizing, upgrading, and repackaging 9,400 lines of code in umpteen objects into two separate assemblies. Plus I upgraded the assemblies to all the latest cool stuff, like Azure Storage Client 2.0 and...well, stuff.

It's getting dark on the afternoon before the U.S. Thanksgiving holiday, and I'm a little fried. Goodbye, 10th Magnitude Office, until Monday.