The Daily Parker

Politics, Weather, Photography, and the Dog

Unethical offer of the month

"Leading e-commerce development and acquisition group" KASA Capital sent me this email over the weekend:

I'd like to contribute an article to your site, thedailyparker.com - I can select a topic that matches the tone and theme of your site, or if you prefer, I can write about something of your choosing. The article will be unique and interesting to read. In return, I ask that I be able to subtly include a link to my site ____ within the article.

If you are able to put a permanent link to the article in a prominent place on your website, I may be able to make a one time Paypal donation as well.

Sure. Just a couple of things. First, the article you submit will have your byline. Second, the article will clearly state the financial relationship you have to the website you're "subtly" promoting. Third, the post containing the article will note that the article is "paid advertising." Finally, the article will end with a link to this post, to ensure that readers don't confuse your paid advertising content with anything I've ever written. If these conditions are acceptable, the fee for publishing your article will be $2,500.

Thanks for the offer, guys.

Chirp

I'm at a client site today and tomorrow, jamming on database optimization. Expect regular posts to resume Friday.

Correcting the record

Reader AT actually met Tom Shanks, the chief programmer behind the ACS Atlases, and corrects my understanding of how the ACS team put it together:

Contrary to what you assume in your post, Tom Shanks did not hack his atlas into an Apple II. ACS was rather professional in their IT. The worldwide city database with longitude and latitude they had licensed from one of the big atlas (map atlas) publishers, if I remember correctly Rand McNally. The timezone history data they had collected from numerous published sources..., and partly also from field research and grass-roots contributions by their own astrology service clients.

The reader also gave me the story about an ongoing effort to extend the tzinfo database, and the provenance of Astrolabe's alleged copyrights. (Note that this is the precise, legal meaning of "alleged:" in a civil complaint, just as in a criminal complaint, the parties "allege" each fact in their filings.) If the reader gives me permission, I'll post some of this information.

New documentation of an old feature

The Inner Drive Extensible Architecture™ has had support for the tzinfo database for several years now. Weather Now uses it; so do a few of my clients.

Like the lazy software developer I am, however, I never put up a decent demonstration of the code, which might, you know, make someone want to buy it.

Well, the documentation, she is here. Licensing, you will be shocked to learn, is available for a modest fee.

Analysis of Shanks' atlases against the tzinfo database

To better understand the facts behind Astrolabe’s stupid trolling quixotic lawsuit against the guys who coordinated the worldwide time-zone database (tzinfo), I bought copies of the Shanks American and International atlases that Astrolabe claims to own. (I went through the secondary market, so I didn’t actually give Astrolabe any money.)

First, an update. According to Thomas Eubanks of the IETF, the Electronic Frontier Foundation has taken over Arthur Olson’s legal defense. Mazel tov. I expect to see a response to the complaint against him in a few weeks that includes a motion to dismiss which, I think, may be granted. (I’m thinking about drafting a response myself, just to exercise my legal muscles properly. Watch this space.)

Now to the main post. The Shanks books, rather than containing maps, contain pages and pages of tabular data showing three things:

  1. Names of first- and second-order administrative districts (e.g., states and counties);
  2. Latitudes and longitudes of cities and other named places within countries or states;
  3. Lists of the dates and times of time-zone changes for those named places.

The atlases dump this data to paper using a monospace typeface at 6-point size in what can be nothing other than 1990s-era printouts to some kind of publishing compositing software. This offers a clue to the “original works” claim that Astrolabe makes about the products.

Shanks certainly put a lot of work into these books, especially considering he first published them in 1978 and 1985. He must have spent hundreds of hours looking up and entering data on the thousands of locations in the tables.

For example, the American Atlas contains rows upon rows of data like this:

Chicago 16      1 41N51'00 87W39'00 5:50:36
Chicago Heights 16
                1 41N30'22 87W38'08 5:50:33
Chicago Lawn 16 1 41N47    87W43    5:50:52

I imagine Shanks looked up this data in reference books, then entered it into a home-grown flat-field database through a Vax terminal or on his Apple ][+. I hope he at least let the computer calculate the last column (the location’s offset from GMT), since it’s derived directly from the location’s longitude (the next-to-last column).
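The arithmetic behind that last column is simple enough to sketch. The Earth turns 360 degrees in 24 hours, so each degree of longitude is worth four minutes of clock time. The function names below are made up for illustration, and rounding to the nearest second is an assumption that happens to reproduce the three rows above; nothing here claims to be how Shanks actually did it.

```python
from fractions import Fraction

def lmt_offset_seconds(degrees, minutes=0, seconds=0):
    """Clock-time distance from Greenwich, in seconds, for a longitude magnitude."""
    # Work in arc-seconds so the division stays exact until the final rounding.
    arc_seconds = degrees * 3600 + minutes * 60 + seconds
    # 360 degrees = 86,400 s of time, so 15 arc-seconds = 1 second of time.
    return round(Fraction(arc_seconds, 15))

def format_hms(total_seconds):
    """Render a count of seconds as H:MM:SS, the way the atlas prints it."""
    hours, remainder = divmod(total_seconds, 3600)
    mins, secs = divmod(remainder, 60)
    return f"{hours}:{mins:02d}:{secs:02d}"

# Longitudes taken from the three atlas rows quoted above.
for name, lon in [("Chicago", (87, 39, 0)),
                  ("Chicago Heights", (87, 38, 8)),
                  ("Chicago Lawn", (87, 43, 0))]:
    print(name, format_hms(lmt_offset_seconds(*lon)))
```

Run against the quoted rows, this gives 5:50:36, 5:50:33, and 5:50:52, the same values the atlas prints, which is consistent with the column being computed rather than looked up.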

I imagine this because, in the early 1990s, I did something similar to study climate data. (Do you know how long it takes to enter 30 years of daily climate data by hand? No? You’re lucky.)

Back to Astrolabe’s complaint. In Count 4, Astrolabe claims ownership of “certain copyright-protected computer software programs and information contained therein…known as the ‘ACS Atlas,’ consisting of both the ‘ACS International Atlas,’ and the ‘ACS American Atlas,’ in the form of computer software program(s) and/or data bases, and in the form of electronic output and future electronic media from said programs....” I infer from the complaint that the software reproduces the books in computer-searchable form, or perhaps contains the raw data that Shanks himself used to produce the books.

I’ll defer my main argument for a moment to speculate further on what parts of the tzinfo database could have copied the Shanks database.

In the tzinfo database, one of the files (zone.tab) contains latitudes and longitudes of locations in this form:

US	+415100-0873900	America/Chicago	Central Time
US	+375711-0864541	America/Indiana/Tell_City	Central Time - Indiana - Perry County
US	+411745-0863730	America/Indiana/Knox	Central Time - Indiana - Starke County

Nothing else in the tzinfo database comes as close to looking like data in the Shanks atlases. I don’t know where the tzinfo list came from, but I suspect it came from public sources like the Census Bureau.
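For comparison, here is a minimal sketch that reads one zone.tab line and prints its coordinates the way the atlas does. It assumes the tab-separated layout shown above, with signed degrees-minutes-seconds coordinates; the function names are hypothetical and not part of the tzinfo distribution.

```python
import re

# Signed latitude (DDMM or DDMMSS) followed by signed longitude (DDDMM or DDDMMSS).
COORD = re.compile(r"([+-])(\d{2})(\d{2})(\d{2})?([+-])(\d{3})(\d{2})(\d{2})?")

def parse_zone_tab_line(line):
    """Split one zone.tab row into country, zone name, latitude, and longitude."""
    fields = line.rstrip("\n").split("\t")
    country, coords, zone = fields[0], fields[1], fields[2]
    match = COORD.fullmatch(coords)
    if match is None:
        raise ValueError(f"unrecognized coordinates: {coords!r}")
    lat_sign, lat_d, lat_m, lat_s, lon_sign, lon_d, lon_m, lon_s = match.groups()
    # A missing seconds field is treated as zero.
    latitude = ("N" if lat_sign == "+" else "S", int(lat_d), int(lat_m), int(lat_s or 0))
    longitude = ("E" if lon_sign == "+" else "W", int(lon_d), int(lon_m), int(lon_s or 0))
    return country, zone, latitude, longitude

def atlas_style(coordinate):
    """Format a (hemisphere, degrees, minutes, seconds) tuple like the atlas rows."""
    hemisphere, degrees, minutes, seconds = coordinate
    return f"{degrees}{hemisphere}{minutes:02d}'{seconds:02d}"

row = "US\t+415100-0873900\tAmerica/Chicago\tCentral Time"
country, zone, lat, lon = parse_zone_tab_line(row)
print(zone, atlas_style(lat), atlas_style(lon))
```

For the Chicago row this prints 41N51'00 and 87W39'00, the same pair of facts the atlas lists for Chicago, just in a different notation.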

The other possible copying comes from the lists of dates Shanks put together that look like this:

IL # 1
Before 11/18/1883  LMT
11/18/1883  12:00  CST
 3/31/1918  02:00  CWT
10/27/1918  02:00  CST

Here’s how the tzinfo database shows the same information:

# Rule	NAME	FROM	TO	TYPE	IN	ON	AT	SAVE	LETTER
Rule	Chicago	1920	only	-	Jun	13	2:00	1:00	D
Rule	Chicago	1920	1921	-	Oct	lastSun	2:00	0	S
Rule	Chicago	1921	only	-	Mar	lastSun	2:00	1:00	D
Rule	Chicago	1922	1966	-	Apr	lastSun	2:00	1:00	D
Rule	Chicago	1922	1954	-	Sep	lastSun	2:00	0	S
Rule	Chicago	1955	1966	-	Oct	lastSun	2:00	0	S
# Zone	NAME		GMTOFF	RULES	FORMAT	[UNTIL]
Zone America/Chicago	-5:50:36 -	LMT	1883 Nov 18 12:09:24
			-6:00	US	C%sT	1920
			-6:00	Chicago	C%sT	1936 Mar  1 2:00
			-5:00	-	EST	1936 Nov 15 2:00
			-6:00	Chicago	C%sT	1942
			-6:00	US	C%sT	1946
			-6:00	Chicago	C%sT	1967
			-6:00	US	C%sT

That hot mess establishes the specific rules Chicago used to change its clocks in the 1920s and 1950s where the rules differed from the general U.S. rules, then it sets out the dates and times that Chicago’s wall-clock rule sets changed from the beginning of standard time in 1883 through the last change in 1967. (The current rule set for Chicago is the “US” rules, defined elsewhere in the database.)
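To make the Rule lines a little less opaque: "lastSun" in the ON column means the last Sunday of the month named in the IN column. Here is a small sketch of how one rule line expands into concrete dates, using only Python's standard library. The function name is hypothetical and this is not how the tzinfo compiler itself is written.

```python
import calendar
from datetime import date, timedelta

def last_sunday(year, month):
    """Return the date of the last Sunday in the given month."""
    days_in_month = calendar.monthrange(year, month)[1]
    last_day = date(year, month, days_in_month)
    # date.weekday() counts Monday as 0 and Sunday as 6.
    days_past_sunday = (last_day.weekday() - 6) % 7
    return last_day - timedelta(days=days_past_sunday)

# "Rule Chicago 1922 1966 - Apr lastSun 2:00 1:00 D": clocks go forward an hour
# at 2:00 on the last Sunday in April, every year from 1922 through 1966.
for year in (1922, 1923, 1924):
    print(year, last_sunday(year, 4).isoformat())
```

One rule line therefore stands in for 45 rows of a printed table, which is most of the difference between the two formats.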

Shanks has a list, and the tzinfo database has the rules to create the list. Shanks also has an error that the tzinfo database corrects: the tzinfo database establishes that Chicago switched from local mean time (LMT) to standard time at 12:09:24, because Chicago is 9 minutes and 24 seconds ahead of the standard meridian for the time zone. Shanks puts the time at 12 noon, because his list shows the target time, not the trigger time, for the rule change.
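The trigger-versus-target distinction can be checked with a few lines of arithmetic. This sketch uses only the two offsets from the Zone lines above (-5:50:36 for LMT and -6:00 for standard time); the variable names are illustrative.

```python
from datetime import datetime, timedelta

# Offsets from GMT, as given in the America/Chicago Zone lines.
LMT_OFFSET = -timedelta(hours=5, minutes=50, seconds=36)
CST_OFFSET = -timedelta(hours=6)

# The moment the old rule ends, read off a clock still keeping local mean time.
trigger_lmt = datetime(1883, 11, 18, 12, 9, 24)

# Convert to GMT, then to the new standard time.
instant_gmt = trigger_lmt - LMT_OFFSET
target_cst = instant_gmt + CST_OFFSET

print("clocks set back by", LMT_OFFSET - CST_OFFSET)
print("new wall-clock reading:", target_cst.time())
```

The two offsets differ by 9 minutes 24 seconds, so a clock reading 12:09:24 local mean time was set back to 12:00:00 standard time. Both numbers describe the same instant; the atlas records the reading after the change and the database records the reading before it.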

Did the tzinfo project use Shanks to determine the rules for time changes? Yes, explicitly, though for highly documented locations like Chicago the project participants cross-referenced Shanks with original sources, often correcting his errors. But "use" does not mean "copy;" I can use all the baseball statistics I want out of the newspaper without ever copying the newspaper. Data is not protected by copyright.

The tzinfo project didn’t infringe on anyone’s copyright because Shanks created very little to protect. As I’ve previously explained, facts and data do not enjoy copyright protection in the United States. Only the expression of facts does. So if the tzinfo project had photocopied Shanks’ atlases, or republished the ACS software wholesale, then perhaps there would be an infringement. But I think I’ve shown a bit of why the tzinfo project hasn’t done anything actionable.

The Shanks atlases are like meticulously hand-copied illuminated codices from the 16th century, years after Gutenberg made his Bible and made hand-copying obsolete. I’m glad Shanks did the work; I’m sure he felt like he’d accomplished something huge. I really admire the work that went into it, while at the same time shaking my head at the wasted effort. Because since the late 1990s, all that data—latitudes, longitudes, place names—has been available for free from the Census Bureau and the CIA.[1] Before around 1998, you couldn’t just download the data through FTP for free; you had to write a letter to the appropriate agency and pay for it. But being U.S. government data, it was in the public domain, so once you’d paid for it, you could republish it in an easier-to-use form and recoup royalties.

In an era before the Census Bureau started dumping terabytes of data to the Internet, Shanks’ atlases would have been incredibly convenient sources of geographic and time-zone data. Today, they’re curiosities, monuments to exactly the kind of mental effort obviated by fast, cheap computers and the Internet.

Poor Shanks, all those data, thousands of rows of it, standing nakedly, and often erroneously, on page after page of tables in two massive volumes, apparently not knowing that he could have gotten it from the U.S. government—you have to admire that work ethic.

Astrolabe, for its part, has degenerated into exactly the kind of mental deficiency reviled by those of us who actually create software for a living. I eagerly await their much-deserved legal defeat in the next few months.

[1] Yes, the CIA publishes tons of free data, from their World Factbook to entire databases of geospatial information.

Edited at 20:58 UTC: Clarified the difference between "use" and "copying."

More about Groupon's IPO

Yesterday the Tribune reported on Groupon scaling back their IPO, from which they had hoped to raise the equivalent of Norway's GDP. Today's Economist has more:

Groupon created a new market. This is a boon to consumers, but confers no lasting “first-mover” advantage on Groupon. Its business model is unpatentable and simple to replicate, so there are already more than 20 copycats.

Groupon aspires to be global, but the markets it serves are intensely local. Internet selling is best suited to “experience goods”. These are goods and services the quality of which you cannot judge until you experience them, such as haircuts and Thai meals, so there is no advantage in having a bricks-and-mortar shop for people to browse in. (In North America 83% of Groupon’s deals fall into this category.) The trouble with experience goods is that generally you cannot separate manufacture from delivery: you cannot cook a meal in Guangzhou and eat it in New York.

Groupon was, some may recall, the hottest company in Chicago, so of course I want the company to succeed. I've also had some experience with Internet start-ups, so watching Groupon the past couple of years has felt...familiar. In particular, I've seen what happens to companies that grow by an order of magnitude in only two years.

Another interesting tidbit, possibly related: Groupon CEO Andrew Mason said as recently as June that he wasn't getting married until after the IPO. But the Tribune's business blotter reported Monday that he and Jenny Gillespie have tied the knot.

How much would *you* pay?

The Tribune is reporting that Groupon, one of several thousand companies that strike deals with vendors, has scaled back its IPO:

The size of the sale, expected to be completed in the next two weeks, could be $500 million to $700 million under plans to be disclosed in advance of the company's roadshow beginning in the next few days, the people said. The size is meant to cut the amount of stock being sold at what may be a knock-down valuation, in hopes that more shares can be sold later at higher prices.

Although valuations of $20 billion to $30 billion were bandied about by outsiders at the time the company filed its plans to go public in May, the current goal of less than half that reflects the reality that the IPO window was closed for nearly two months between mid-August and mid-October because of overall stock-market weakness, and missteps by the company itself.

Amazon, Living Social, Google, and others of Groupon's competitors could not be reached for comment.

Simplified explanation of tzinfo mess

The AP has picked up the story about the tzinfo database moving to ICANN:

The organization in charge of the Internet's address system is taking over a database widely used by computers and websites to keep track of time zones around the world.

The transition to the Internet Corporation for Assigned Names and Numbers, or ICANN, comes a week after the database was abruptly removed from a U.S. government server because of a federal lawsuit claiming copyright infringement.

Without this database and others like it, computers would display Greenwich Mean Time, or the time in London when it isn't on summer time. People would have to manually calculate local time when they schedule meetings or book flights.

Ah, I do love the popular press, trying to explain things. AP writer Anick Jesdanun generally did all right explaining the problem and the move, except the story has no information about the tzinfo community's response to the mess. (I'm just sad they didn't mention The Daily Parker.)

Irrevocable configuration choices

The Daily Parker uses the mostly-open-source dasBlog engine. The software has always offered two choices for how it creates permanent links (permalinks): titles and GUIDs. As you can see, we use GUIDs, so permalinks look like this: http://www.thedailyparker.com/PermaLink,guid,05976d99-b3cb-4391-9052-509832cbf5cf.aspx instead of like this: http://www.thedailyparker.com/About-This-Blog.

I've been thinking that GUIDs, while always unique, are kind of ugly. This morning I tried changing the blog's configuration settings to use titles instead. Sadly, dasBlog generates permalinks on the fly, but doesn't change permalinks within entries.

Therefore, in order to switch to title-based permalinks, I'd need to root around in all of the individual entries and change them. I could write a script to do this, I suppose, but with 2,715 entries spanning almost six years, it's still an undertaking.
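For what it's worth, the core of such a script might look something like the sketch below. It assumes a GUID-to-title lookup table already exists, which is the part that would actually take the work. The function name, the mapping, and the pairing of the sample GUID with "About-This-Blog" are all hypothetical, and the sketch says nothing about how dasBlog stores its entries.

```python
import re

# Matches the GUID-style permalink path shown above and captures the GUID.
GUID_LINK = re.compile(
    r"PermaLink,guid,"
    r"([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})"
    r"\.aspx"
)

def rewrite_permalinks(text, slug_by_guid):
    """Swap GUID permalinks for title slugs, leaving unknown GUIDs untouched."""
    def replace(match):
        slug = slug_by_guid.get(match.group(1).lower())
        return slug if slug is not None else match.group(0)
    return GUID_LINK.sub(replace, text)

# Hypothetical lookup table: one GUID paired with one title slug.
slugs = {"05976d99-b3cb-4391-9052-509832cbf5cf": "About-This-Blog"}
entry = ("See http://www.thedailyparker.com/"
         "PermaLink,guid,05976d99-b3cb-4391-9052-509832cbf5cf.aspx for details.")
print(rewrite_permalinks(entry, slugs))
```

Links whose GUIDs are missing from the table pass through unchanged, so a partial mapping can't break anything that already works.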

So GUIDs will stay, as they have for the life of the blog. If I ever start another blog, or if I ever want to spend a day making the switch for this one, I'll use titles.