The Daily Parker

Politics, Weather, Photography, and the Dog

When software bugs kill

The Daily WTF today takes us back to one of the worst software bugs in history, in terms of human lives ruined or lost:

The ETCC incident was not the first, and sadly was not the last malfunction of the Therac-25 system. Between June 1985 and July 1987, there were six accidents involving the Therac-25, manufactured by Atomic Energy Canada Limited (AECL). Each was a severe radiation overdose, which resulted in serious injuries, maimings, and deaths.

As the first incidents started to appear, no one was entirely certain what was happening. Radiation poisoning is hard to diagnose, especially if you don't expect it. As with the ETCC incident, the machine reported an underdose despite overdosing the patient. Hospital physicists even contacted AECL when they suspected an overdose, only to be told such a thing was impossible.

With AECL's continued failure to explain how to test their device, it should be clear that the problem was a systemic one. It doesn't matter how good your software developer is; software quality doesn't appear because you have good developers. It's the end result of a process, and that process informs both your software development practices, but also your testing. Your management. Even your sales and servicing.

While the incidents at the ETCC finally drove changes, they weren't the first incidents. Hospital physicists had already reported problems to AECL. At least one patient had already initiated a lawsuit. But that information didn't propagate through the organization; no one put those pieces together to recognize that the device was faulty.

On this site, we joke a lot at the expense of the Paula Beans and Roys of this world. But no matter how incompetent, no matter how reckless, no matter how ignorant the antagonist of a TDWTF article may be, they're part of a system, and that system put them in that position.

TDWTF's write-up includes a link to a far more thorough report. It's horrifying.

How lazy usability can make your day harder

This morning I posted about some frustrations in getting our CRM system to import donations from our fundraising events so that we can then match donations with addresses to send out end-of-year tax letters. The frustrations have grown to the point where naming names seems appropriate, if only because Neon One, the CRM company, has a web-based ticketing system that doesn't really handle the level of detail their developers will need to (a) understand the problem, (b) understand the frustration, and (c) understand the features needed to solve (a) and (b).

As you read this, keep in the back of your head that I'm a software developer with 25 years of professional experience and another 15 of hobby experience before that. In other words, I've been writing software longer than almost everyone at the CRM developer has been alive.

Neon will probably consider this a feature request, though on any of the product teams I've run in the past 15 years, this would be a usability bug report.

tl;dr: Neon's import feature is software-centric, not user-centric. Instead of the feature helping the user, it expects the user to help the software. This causes grief for any user who is not a piece of software.

The simple problem

Because Neon doesn't have adequate (or, it seems, any) support for silent auctions and other day-of-event realities, we use a different system for our fundraising events. The other system spits out perfectly reasonable Excel documents with all the information we need to track donations along with the fair-market valuations of silent auction winnings. We want to import that information into Neon so that we (a) can print out end-of-year tax letters and (b) accurately track giving in the long term.

The obvious path, which doesn't work

People have studied usability for almost as long as I've worked in software professionally. Jakob Nielsen has written about it since 1998. The first principle of usability has never changed: make the obvious path work in an obvious way.

Neon has an import function that appears, at first glance, to import exactly the kind of data our event system produces. It seems like one should be able to import a flat file containing the donor name, address, email, and phone number; the date and amount of the donation; maybe a note or other optional information. You would think you could map the columns on the file to fields in Neon, and the software would read the data file and import the data. Maybe you'll get one or two spurious donor records when the information in the file doesn't exactly match an existing record's data, so maybe you'll have a few minutes of clean-up that you can do right from the import report after it's done.

Anyway, that's how I'd design it. That's how Jakob Nielsen would design it. That's how the 22-year-old newbie with a still-wet bachelor's degree in design would do it.

That's not how Neon designed it.

OK, so there may be an extra step

The first thing the Neon import feature asks is: does your data have Neon account ID numbers? If not, it warns you it won't be able to match your data with existing donor records.

Wait, what? The CRM already has a decent "find duplicate records" feature, so why can't this run automatically on new or imported entries? (Seriously, Neon: why do we spend time after every concert de-duping our data because your software doesn't think to match existing records with new ticket orders even when all of the data are the same in both records? More on this in a moment.)

All right. I'll spend 15 minutes adding Neon IDs to all the donation records in the exported file that correspond to existing donor records. It's an extra step that the software should be able to accomplish on its own, but whatever, no one likes writing record-matching code.

Now what I expect is that Neon will add the new donations to the existing records, and add new account records if I don't supply an ID.

Nope. My first pass through this process looks only at the exported records whose IDs I've provided and updates those donor records, while completely ignoring the new records. And it then takes me to a screen that looks suspiciously like it will make a total hash of the donation records in the same export file if I click "next." So I abort.

Maybe a few extra steps?

I think, perhaps I should give Neon a little extra help now. Let me scrub the data going into the import so that we have the best chance of good data getting into the CRM. So I separate the export into two files, one containing Neon IDs and one with all the new donors who came to the event. Then I go through both to make sure mailing addresses conform to USPS standards, phone numbers are uniformly 10 digits without separators, and email addresses are validly formed.
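
For the curious, the phone and email part of that scrubbing pass fits in a few lines. This is a minimal sketch, not what I actually ran: the function names are made up for illustration, it assumes US phone numbers, and the email check only looks at the shape of the address. USPS address standardization is a much bigger job and isn't attempted here.

```python
import re

# Deliberately loose: one "@", no whitespace, and a dot somewhere in the domain.
EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def normalize_phone(raw):
    """Return a 10-digit string with no separators, or None if it can't be one."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the US country code
    return digits if len(digits) == 10 else None


def looks_like_email(raw):
    """Check only that the address is plausibly formed, not that it exists."""
    return bool(EMAIL_SHAPE.match(raw.strip()))
```

Anything that comes back as None or False still needs a human to look at it, which is the point: the scrub narrows down the rows worth checking by hand.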

Next, I import the file that does not have Neon IDs, hoping it will create new records for me. It does! And it even exports a report containing the new IDs it created, albeit with only the donor's full name and not the donor's first and last names, which creates a bit of extra work as now I have to manually map them to my donor export.

So after I add the new IDs where needed in my donation export file, I'm ready to import the donations. I start the donation wizard, it accepts that my records have valid Neon IDs, and I map the donation date and amount columns to Neon fields. And then it tells me that I don't have a "donation type" mapped.

I'm not going to go through the steps required to figure out what a valid "donation type" is. Suffice to say, I add a column to the export called "Donation Type" and fill every row with the value "Donation". (Why can't we just specify constant values for required fields at import?)
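
Adding that column by hand in Excel works, but it is also trivial to script. Here is a rough sketch using Python's standard csv module; the function name is hypothetical, and the only values taken from my actual import are the "Donation Type" header and the "Donation" value.

```python
import csv
import io


def add_constant_column(csv_text, column, value):
    """Return csv_text with one extra column holding the same value in every row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or []) + [column]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = value  # same constant on every row
        writer.writerow(row)
    return out.getvalue()
```

Calling add_constant_column(text, "Donation Type", "Donation") does in one step what the import wizard could have offered as a drop-down.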

I go through the import wizard again, map everything required, and hit import. Oof! Error! Apparently, dates are hard. The export file has donation dates in the format "Jul 17, 2020 7:27:19 PM", but Neon says "Time field must be a date;now it supports 'MM/DD/YYYY','MM-DD-YYYY' and 'MMDDYYYY' format."

Now, without going into the rabbit hole too deeply, a few things immediately occur to me as someone who has successfully parsed dates for a quarter century in a half-dozen programming languages. First, why the fuck does Neon only accept those three formats when the format presented is unambiguous and can be parsed by nearly every programming language out there? Second, none of the formats presented or accepted in this case conform to ISO-8601, the international standard for date and time representation, so everyone is on shaky ground here. Third, I got all the way to this point and now it tells me I have to go all the way back and change the date format by hand? Because it turns out, Excel can't parse the dates either. Good work, 3rd-party event package. Nicely done.
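
To back up the "nearly every programming language" claim, here is a small sketch in Python that parses the export's format and emits one of the formats Neon accepts. The function names are mine, and it assumes an English locale, since the month abbreviation and the AM/PM marker are locale-dependent.

```python
from datetime import datetime

# Matches the event package's export, e.g. "Jul 17, 2020 7:27:19 PM".
EXPORT_FORMAT = "%b %d, %Y %I:%M:%S %p"


def parse_export_date(text):
    """Parse the export's timestamp into a datetime."""
    return datetime.strptime(text.strip(), EXPORT_FORMAT)


def to_neon_date(text):
    """Convert an export timestamp to MM/DD/YYYY, dropping the time of day."""
    return parse_export_date(text).strftime("%m/%d/%Y")
```

With these, to_neon_date("Jul 17, 2020 7:27:19 PM") gives "07/17/2020", and calling isoformat() on the parsed value gives the ISO 8601 form, "2020-07-17T19:27:19".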

Once more unto the breach, dear friends, once more;
Or jam the import with our ISO dates.

After entering all the dates by hand (made easier by 90% of them being the date of the event), I re-exported the file to .csv and tried the import again. (Oh, yes, Neon can't read XLSX files, so I have to export to CSV every. Single. Time.)

Success! Finally!

And now I get to do it again with the silent-auction winners list. All of it. Again. This time I just entered the 9 new accounts by hand, and discovered 7 duplicate accounts that had to be merged, and thanked the universe we only had 60 silent-auction items.

So I go to import and...Goddammit I forgot the "donation type" field.

So I go to import again. And it mostly worked. Except even though I specified which field contained the fair-market value to map against the donation, Neon ignored that. It appears nowhere in the individual donation records. Why even present the field as an option if...I mean...what the hell, Neon? Yet more shit I have to map by hand later on.

But wait, it still doesn't work

After all that effort, I did some spot-checks on various accounts and found that even though none of the donation records shows fair-market value for auction items, at least all of the donations that I expect to see in each record appear in each record. So now it's time for the 2020 donor report, and...

Um...you're fucking kidding me.

None of the imported donations shows up in the donor report's "2020 Donation Amount" field. After checking a few donor records, I promote my hypothesis (that Neon groups by the record-creation date for the annual sum instead of the actual donation date) to a theory. Neon support case #00316491 is born.
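
To be clear about what I'm guessing at, here is a toy illustration. I have no visibility into Neon's code, so the field names, dates, and amounts below are all invented; the sketch only shows how the same two donations land in different years depending on which date the report groups by.

```python
from collections import defaultdict
from datetime import date


def total_by_year(donations, date_field):
    """Sum donation amounts per calendar year, keyed on the chosen date field."""
    totals = defaultdict(float)
    for donation in donations:
        totals[donation[date_field].year] += donation["amount"]
    return dict(totals)


# Invented sample: gifts made in 2020 but imported in January 2021.
donations = [
    {"amount": 100.0, "donated_on": date(2020, 7, 17), "created_on": date(2021, 1, 25)},
    {"amount": 50.0, "donated_on": date(2020, 12, 1), "created_on": date(2021, 1, 25)},
]

by_gift_date = total_by_year(donations, "donated_on")    # everything counts toward 2020
by_import_date = total_by_year(donations, "created_on")  # everything counts toward 2021
```

If the report behaves like the second call, a 2020 column would come up empty for every imported gift, which is exactly what I'm seeing.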

How Neon should have coded this

Many, many developers have solved this problem before. Importing donations should require only one pass through the Import feature, and follow this heuristic:

  1. As soon as the user selects "import donations" from whatever UI control offers the choice, Neon presents a page of simple documentation explaining the process, listing the required fields, and offering some insight into how it will work.
  2. The user selects the import file, which can be CSV, Excel, or any other common delimited format.
  3. The user maps all of the relevant columns from the import file to Neon fields. Neon does not present the user with fields that it will ignore for no stated reason later on.
  4. The user provides constant values (from drop-downs if necessary) for required fields that do not appear in the import file.
  5. The user clicks "upload."
  6. For each record:
    1. If the imported donor account has an ID that matches an existing account by ID, use that donor account.
    2. If the imported donor account matches the name and email of an existing account (or some other criteria), use that donor account but flag the row for review.
    3. If the imported donor account does not match an existing account on either criterion, create a new donor account.
  7. Stage the donation record with the specified, found, or new account ID, including all of the fields that the user mapped in step 3.
  8. Present the list of "for review" items to the user before committing the import, allowing the user to make edits to the imported data, or move back a step in the process.
  9. Once the user is satisfied, the user clicks "commit" to write the data to Neon.
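
The matching and staging part of that list (steps 6 and 7) can be sketched in a few dozen lines. This is an illustration under my own assumptions, not Neon's data model: accounts and rows are plain dictionaries with "id", "name", and "email" keys, the "some other criteria" match is simplified to a case-insensitive name and email comparison, and nothing is written anywhere until a separate commit step that isn't shown.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class StagedImport:
    donations: list = field(default_factory=list)     # rows waiting for "commit"
    new_accounts: list = field(default_factory=list)  # accounts the import would create
    for_review: list = field(default_factory=list)    # row numbers matched by name/email


def stage_import(rows, accounts, next_id):
    """Match each row to an account and stage it; the accounts list is not modified."""
    by_id = {a["id"]: a for a in accounts}
    by_name_email = {(a["name"].lower(), a["email"].lower()): a for a in accounts}
    staged = StagedImport()
    for number, row in enumerate(rows, start=1):
        key = (row["name"].lower(), row["email"].lower())
        if row.get("id") in by_id:
            account = by_id[row["id"]]           # 6.1: exact ID match
        elif key in by_name_email:
            account = by_name_email[key]         # 6.2: fuzzy match, flag for review
            staged.for_review.append(number)
        else:                                    # 6.3: no match, stage a new account
            account = {"id": next(next_id), "name": row["name"], "email": row["email"]}
            by_id[account["id"]] = account
            by_name_email[key] = account
            staged.new_accounts.append(account)
        staged.donations.append({**row, "id": account["id"]})  # step 7
    return staged


# Made-up sample data: one ID match, one name/email match, one new donor.
accounts = [
    {"id": 1, "name": "Pat Doe", "email": "pat@example.org"},
    {"id": 2, "name": "Sam Roe", "email": "sam@example.org"},
]
rows = [
    {"id": 1, "name": "Pat Doe", "email": "pat@example.org", "amount": 50},
    {"id": None, "name": "sam roe", "email": "SAM@example.org", "amount": 25},
    {"id": None, "name": "Lee Poe", "email": "lee@example.org", "amount": 10},
]
staged = stage_import(rows, accounts, count(100))
```

The for_review list is what step 8 would show the user, and because everything is only staged, backing out costs nothing.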

I get it: Neon's dev group have process problems

I mean, guys, this really isn't that hard. You want hard? Write an IBM360 to MS-DOS import, complete with different endian values, and where mapping has to be hard-coded because configuration files haven't been invented yet.

I know how software development works. I expect any Neon folks reading this may think this is unfair, that the devs told management the features weren't finished, that management told leadership the devs weren't 100% finished but the stuff works well enough, and that leadership looked at the growing list of must-have features for the brochure and forgot to finish the must-have features for the users, that "it's not my fault." I'd also bet you a dollar that any dev reading this will think "I told you so" (unless they think "it works on my machine, you DFU," in which case you have other problems.)

In other words, you guys have a process problem. Somewhere the definition of "done" that passed QA for the import features didn't match the definition of "done" that users need.

For this we're paying $3600 a year. NB: we're willing to pay a lot more for software that works the way we need it to.

Meanwhile, though, I'll have to hand-correct what all this automation should have given me already, and get the damn tax letters out.

And Neon CRM support incident #00316497 is born.

So much not fun about this

I'm president of the Apollo Chorus of Chicago. One of my jobs is to send out letters to all of our donors acknowledging their donations for the previous calendar year. These letters should have gone out by January 31st, but...well...OK, I'm a little delinquent. And for no other reason than I really, really did not want to merge all the data by hand.

You see, we use a smallish CRM system for all of our institutional data, which works pretty well, especially with our membership and tech-savvy donors. Many people have set up recurring donations through the CRM portal (please donate!), which happily bills their credit cards each month and creates new records for us to merge later. (It's really, really dumb about preventing duplicate records, but its merge feature works well enough.) Other donors make ad hoc contributions through the CRM as well. The CRM then sends a report to our treasurer which he imports into QuickBooks, and all is good.

You know there's a "but." See, we use a different system for managing our annual fundraiser, because our CRM sucks at event management. This system records all of the donations received for each event, plus silent auction winnings, and produces its own reports that we import into QuickBooks.

The upshot is that the CRM is not the single point of truth for donations, though it muddles through as the single point of truth for membership and music purchases. The Development Committee therefore doesn't want to use the CRM, even though it's why we have the CRM in the first place. The treasurer doesn't want to enter or reconcile all the event donations by hand either. Nor does our IT Director want to merge all the duplicate records that would result from importing the event data.

I hope to have a solution for this by next year. This year, however, I'm banging my head onto my desk as I try to reconcile QuickBooks' list against the event software's list against what should be the single point of truth for all of it.

Three-pointer

Today is the last day of Sprint 28 at my day job, and I've just closed my third one-point story of the day. When we estimate the difficulty of a story (i.e., a single unit of code that can be deployed when complete), we estimate by points on a Fibonacci scale: 1, 2, 3, 5, 8, 13, 21. A 2-point story is about twice as hard as a 1-point story; a 5-point story is about 5 times harder than a 1-point story; etc. If we estimate 8 or more points on my current team, we re-examine the story in order to break it into smaller chunks. Similarly, a 1-point story could turn out to have so little complexity that it takes almost no time, like today's story #304 that required adding one line of code here and removing 37 lines of code from there. That one took about 15 minutes. The other two took a couple of hours each, as "knowing where to put the bolt" takes longer than actually attaching the bolt.

While all that happened on the west side of my desk, the monitors on the south side lit up a few stories for me to read when I get back from the walk I'm about to take:

  • Jennifer Rubin lists 50 things that have improved in the US in the past 5 days, starting with "you can ignore Twitter."
  • Though Rubin mentioned replacing Andrew Jackson's portrait in the Oval Office, she didn't mention that the Biden Administration has taken steps to complete replacing his racist mug on the $20 note with a portrait of Harriet Tubman. (The outgoing administration, for obvious reasons, mothballed this plan upon taking office.)
  • Charles Blow argues that the Democratic Party should keep advocating and stop "subconsciously modulating responses" in the face of Republican criticism.
  • National Geographic describes the Roman road network that spanned over 320,000 km and still remains largely intact today.
  • Philippa Snow suggests the French series Call My Agent if you're looking for serious entertainment. For my part I'm about to start Series 2 of Peaky Blinders.
  • Loyola University Chicago professor Devon Price has a new book out: Laziness Does Not Exist. I may have to buy a copy. Eventually.

And I will now try to get in a 45-minute fast walk as our first real winter storm bears down on us from Iowa.

Stuff that seems cool but...

The Consumer Electronics Show went virtual this year, but it still had some interesting toys, like these:

Air Safety Virus Monitors

It's well-known that things like ventilation and humidity affect how well coronavirus spreads indoors. But how do you know how much ventilation is enough? Airthings sensors pair with a smartphone to monitor indoor air quality for temperature, humidity and number of people in the room (it makes a guess based on the amount of carbon dioxide present). If quality dips and virus risk rises, Airthings will suggest opening windows or making other changes. This could be helpful for businesses, such as restaurants, to know if their capacity is too high. Airthings also monitors for more traditional air quality risks like radon and mold.

Or how about:

Balcony Bee-Keeping Box

The pandemic has driven an upswing in gardening and home-canning: why not beekeeping? Italian company Beeing’s B-Box is a small hive that works with a sensor to monitor the bees’ health and environment. It also has a special design that separates the extra honeycomb from the bees, so you can harvest the honey without suiting up like an astronaut. Plus, it’s small enough to keep on even a modest urban balcony.

I don't know how my neighbors would feel about that one, but it seems perfect for the building.

Sure Happy It's Thursday, March 319th...

Lunchtime roundup:

Finally, the authors of The Impostor's Guide, a free ebook aimed at self-taught programmers, have a new series of videos about general computer-science topics that people like me, who learned programming for fun while getting our history degrees, never learned.

The Economist's Bartleby column examines how Covid-19 lockdowns have "caused both good and bad changes of routine."

Everyone who understands security predicted this

Security is hard. Everyone who works in IT knows (or should know) this. We have well-documented security practices covering every part of software applications, from the user interface down to the hardware. Add in actual regulations like Europe's GDPR and California's privacy laws, and you have a good blueprint for protecting user data.

Of course, if you actively resist expertise and hate being told what to do by beanie-wearing nerds, you might find yourself reading on Gizmodo how a lone hacker exfiltrated 99% of your data and handed it to the FBI:

In the wake of the violent insurrection at the U.S. Capitol by scores of President Trump’s supporters, a lone researcher began an effort to catalogue the posts of social media users across Parler, a platform founded to provide conservative users a safe haven for uninhibited “free speech” — but which ultimately devolved into a hotbed of far-right conspiracy theories, unchecked racism, and death threats aimed at prominent politicians.

The researcher, who asked to be referred to by their Twitter handle, @donk_enby, began with the goal of archiving every post from January 6, the day of the Capitol riot; what she called a bevy of “very incriminating” evidence.

Operating on little sleep, @donk_enby began the work of archiving all of Parler’s posts, ultimately capturing around 99.9 percent of its content. In a tweet early Sunday, @donk_enby said she was crawling some 1.1 million Parler video URLs. “These are the original, unprocessed, raw files as uploaded to Parler with all associated metadata,” she said. Included in this tranche of data, now more than 56 terabytes in size, @donk_enby confirmed the raw video includes GPS coordinates, which point to the locations of users when the videos were filmed.

Meanwhile, dozens of companies that have donated to the STBXPOTUS and other Republican causes over the past five years have suddenly started singing a different tune.

Ephemeral GPS failure

Sony-made GPS chipsets failed all over the world this weekend when a GPS cheat-sheet of sorts expired:

In general, the pattern of your route is correct, but it may be displaced to one side or the other. However, in many cases by the completion of the workout, it sorts itself out. In other words, it’s mostly a one-time issue.

The issue has to do with the ephemeris data file, also called the EPO file (Extended Prediction Orbit) or Connected Predictive Ephemeris (CPE). Or simply the satellite pre-cache file. That’s the file that’s delivered to your device on a frequent basis (usually every few days). This file is what makes your watch near-instantly find GPS satellites when you go outside. It’s basically a cheat-sheet of where the satellites are for the next few days, or up to a week or so.

I experienced this failure as well. I recorded two walks on my Garmin Venu, one Friday and one yesterday. In both cases, the recorded GPS tracks appeared about 400 m to the west of where I actually walked.

Because the issue started between 22:30 UTC on December 31st and 15:00 UTC on January 1st, I (and others) suspect this may have been bad date handling. Last year not only had 366 days, but also 53 weeks, depending on how the engineers configured the calendar. So what probably happened is that an automatic CPE update failed or appeared to expire because the calendar handling was off.
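
I have no idea what the chipset firmware actually does, so treat this only as an illustration of why the year boundary is a plausible suspect. The helper names are mine; the calendar facts come straight from Python's standard library.

```python
from datetime import date


def day_of_year(d):
    """Ordinal day within the calendar year (1 through 365 or 366)."""
    return d.timetuple().tm_yday


def iso_week(d):
    """Return (ISO year, ISO week number) for a date."""
    iso_year, week, _weekday = d.isocalendar()
    return (iso_year, week)


new_years_eve = date(2020, 12, 31)
new_years_day = date(2021, 1, 1)

print(day_of_year(new_years_eve))  # a leap year, so this is day 366
print(iso_week(new_years_eve))     # week 53 of ISO year 2020
print(iso_week(new_years_day))     # still week 53 of ISO year 2020
```

In ISO terms, January 1st, 2021, still belongs to week 53 of 2020, and week 1 of 2021 doesn't start until Monday, January 4th. Code that assumes 52 weeks, 365 days, or that the week-numbering year always equals the calendar year has several ways to trip over that weekend.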

Dates are hard.

Portable Document Format: still crappy after all these years

Earlier this year, the Nielsen Norman Group repeated a study they first did in 1996 on the usability of PDF documents. As they've now found three times, making PDFs instead of actual web pages yields a horrible experience for users:

Jakob Nielsen first wrote about how PDF files should never be read online in 1996 — only three years after PDFs were invented. Over 20 years later, our research continues to prove that PDFs are just as problematic for users. Despite the evidence, they’re still used far too often to present content online.

PDFs are typically large masses of text and images. The format is intended and optimized for print. It’s inherently inaccessible, unpleasant to read, and cumbersome to navigate online. Neither time nor changes in user behavior have softened our evidence-based stance on this subject. Even 20 years later, PDFs are still unfit for human consumption in the digital space. Do not use PDFs to present digital content that could and should otherwise be a web page.

PDF files are typically converted from documents that were planned for print or created in print-focused software platforms. When creating PDFs in these tools, it’s unlikely that authors will follow proper guidelines for web writing or accessibility. If they knew these, they’d probably just create a web page in the first-place, not a PDF. As a result, users get stuck with a long, noninclusive mass of text and images that takes up many screens, is unusable for finding a quick answer, and boring to read. There’s more work involved in creating a well-written, accessible PDF than simply exporting it straight from a word processing or presentation platform. Factors such as the use of color, contrast, document structure, tags, and much more must be intentionally addressed.

Yah, so, don't use them.

Today is slightly longer than yesterday

The December solstice happened about 8 hours ago, which means we'll have slightly more daylight today than we had yesterday. Today is also the 50th anniversary of Elvis Presley's meeting with Richard Nixon in the White House.

More odd things of note:

Finally, it's very likely you've made out with a drowning victim from the 19th century.