Ten years since Jamestown

Ten years ago, I was sitting in my apartment on a rainy Tuesday afternoon. My last class of the day ended around 1:30 and I was settled in to get some work done on the forecast game that I ran for the University. WFO Lincoln had issued a few tornado warnings and there were reports of cold air funnels, so my friend Mike Kruze and I decided to spend the afternoon driving around getting rained on. Instead, we saw one of the most photogenic tornadoes ever recorded in Indiana. And then another one. And then a third (though this one is unofficial, since we could not see the ground from our position and no damage was observed in the empty field).

This was only two days after Mike and I had returned from a marathon drive to northern Iowa, where the most we saw was vivid lightning and large hail after sunset. By this point, I had been chasing for a year and a half. Chase attempts evolved from some undergrad doofuses piling in the car and driving around to a fairly mature venture with thoughtful forecasts, data stops, and real efforts to be in position. Of course, as luck would have it, April 20 ended up being somewhat of a doofus day. Mike had a data plan on his Sprint PCS phone, and it was just enough for us to pull up the occasional radar image. Without that, we’d never have found ourselves standing in rural Boone County with a tornado directly to our west.

At the time of the Jamestown tornado, FunnelFiasco.com wasn’t even a gleam in my eye. I was planning on making a career in the National Weather Service. I figured chasing would be a thing I did with regularity. Ten years on, I’ve earned my meteorology degree, but I’ve never worked as a professional in the field. I have one chase day in the last five years, and I’m less than a year from a decade-long tornado drought. I’ve still only chased west of the Mississippi twice. With a toddler at home and another baby due early summer, I’m not likely to get out this year. But I still feel the gentle tug of the storm, pulling me to go out and seek it. I know I will at some point. I just don’t know when.

The right way to do release notes

Forever ago (in Internet time), the developer(s?) of Pocket Casts released an update with some really humorous release notes:

Release notes for Pocket Casts 3.6.

As I do, I got thinking about how I felt about it. While my initial reaction was to be amused, I quickly turned to finding it unhelpful. In fact, most apps have awful release notes. My least favorite phrase, which seems to appear in the release notes of every updated app on my phone, is “and bug fixes.”

Despite the title of this post, there’s no one right way to write release notes. The “right” way depends on what you’re releasing, for one. In a Linux distribution like Fedora, release notes could be composed of the release notes for every component package. However, that would be monumentally unwieldy. Even the Fedora Technical Notes (which report only the changed packages, not the notes for those packages) are not likely to be read by too many people. The Release Notes are a condensed view that highlights prominent features. The Release Announcement is even further condensed, and is useful for media and public announcements. This hierarchy is a good example of the importance of the audience.

I’ve seen arguments that release notes are unnecessary if the source code repository is accessible. Who needs release notes when you can just look at the commit log? This is a pretty lousy argument. A single change may be composed of many commits and a single commit may represent multiple changes (though it shouldn’t). Not to mention that commit messages are often poorly written. I’ve made far too many of those myself. Even if the commit log is a beautiful representation of what happened, it’s a lot to ask a consumer of your software to scour every commit since the last release.

My preference for release notes includes, in no particular order, a list of new features, bugs fixed, and known issues. The HTCondor team does a particularly good job in that regard. One thing I’d add to their release notes is an explicit listing of the subsystem(s) affected for each point. The exact format doesn’t particularly matter. All I’m looking for is an explanation as to why I should or should not care about a particular release. And “fixed some bugs” doesn’t tell me that.
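As a rough illustration of that preference, here is a minimal Python sketch of release notes with the three sections and a subsystem tag on each point. The `Entry` and `ReleaseNotes` names, the output layout, and the sample entries are all hypothetical; this is not how HTCondor or any other project actually produces its notes.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One point in the release notes, tagged with the subsystems it affects."""
    summary: str
    subsystems: list[str] = field(default_factory=list)


@dataclass
class ReleaseNotes:
    version: str
    new_features: list[Entry] = field(default_factory=list)
    bugs_fixed: list[Entry] = field(default_factory=list)
    known_issues: list[Entry] = field(default_factory=list)

    def render(self) -> str:
        """Render the notes as plain text, one section per category."""
        lines = [f"Version {self.version}"]
        sections = [
            ("New features", self.new_features),
            ("Bugs fixed", self.bugs_fixed),
            ("Known issues", self.known_issues),
        ]
        for title, entries in sections:
            if not entries:
                continue  # skip empty sections rather than print a bare heading
            lines.append("")
            lines.append(f"{title}:")
            for entry in entries:
                tag = f" [{', '.join(entry.subsystems)}]" if entry.subsystems else ""
                lines.append(f"- {entry.summary}{tag}")
        return "\n".join(lines)


# Hypothetical entries, made up purely for illustration.
example = ReleaseNotes(
    version="2.1",
    new_features=[Entry("Added offline playback", ["player", "storage"])],
    bugs_fixed=[Entry("Fixed crash when the download queue is empty", ["downloads"])],
    known_issues=[Entry("Sync can be slow on first launch", ["sync"])],
)
print(example.render())
```

Each point says what changed and where, which is enough for a reader to decide whether the release matters to them.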

Bug trackers and service desks

I have recently been evaluating options for our customer support work. For years, my company has used a bug tracker to handle both bugs and support requests. It has worked, mostly well enough, but there are some definite shortcomings. I can’t say that I’m an expert on all the offerings in the two spaces, but I’ve used quite a few over the years. My only conclusion is that there is no single product that does both well.

Much of the basic functionality is the same, but it’s the differences that are key. A truly excellent service desk system is aware of customer SLAs. Support tickets shouldn’t languish untouched for months or even years. But it’s perfectly normal for minor bugs to live indefinitely, especially if the “bug” is actually a planned enhancement. Service desks should present customers with a self-service portal, if only to see the current status of their tickets. Unfortunately, most bug trackers present too much information for a non-technical user (can you imagine having your CEO using Bugzilla to manage tickets?). While this interface is great for managing bugs, it’s pretty lousy otherwise.

Of course, because they’re similar in many respects, the ideal solution has your service desk and your bug tracker interacting smoothly. Sometimes support requests are the result of a bug. Having a way to tie them together is very beneficial. How will your service desk agents know to follow up with the customer unless the bug tracker updates affected cases when a bug is resolved? How will your developers get the information they need if the service desk can’t update the bug tracker?
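One way to picture that link is the following minimal Python sketch, in which resolving a bug flags every support ticket tied to it for follow-up. The `Ticket` and `Bug` classes, their fields, and the sample IDs are hypothetical and do not correspond to any real product's API; an actual integration would go through whatever hooks the two tools expose.

```python
from dataclasses import dataclass, field


@dataclass
class Ticket:
    """A customer support ticket in the (hypothetical) service desk."""
    ticket_id: str
    customer: str
    needs_follow_up: bool = False


@dataclass
class Bug:
    """A bug in the (hypothetical) bug tracker, linked to affected tickets."""
    bug_id: str
    title: str
    resolved: bool = False
    linked_tickets: list[Ticket] = field(default_factory=list)

    def link(self, ticket: Ticket) -> None:
        """Tie a support ticket to this bug."""
        self.linked_tickets.append(ticket)

    def resolve(self) -> list[Ticket]:
        """Mark the bug resolved and flag every linked ticket for follow-up."""
        self.resolved = True
        for ticket in self.linked_tickets:
            ticket.needs_follow_up = True
        return list(self.linked_tickets)


# Made-up data for illustration only.
bug = Bug("BUG-1", "Jobs fail when the input file is empty")
bug.link(Ticket("T-100", "Customer A"))
bug.link(Ticket("T-101", "Customer B"))

for ticket in bug.resolve():
    print(f"Follow up with {ticket.customer} on {ticket.ticket_id}")
```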

Many organizations, especially small businesses and non-profits, will probably use one or the other. Development-oriented organizations will lean toward bug trackers and others will favor service desk tools. In either case, they’ll make do with the limitations for the use they didn’t favor. Still, it behooves IT leadership to consider separate-but-interconnected solutions in order to achieve the maximum benefit.

The life of a project

A picture on Twitter caught my eye earlier tonight. “The life of a project” captures a person’s mental state through the life of the project. It was particularly meaningful to me because it basically describes what I’ve gone through this past week at work. I’ve been working on a very high stakes proof of concept for a customer with a lot to lose, and it’s been pretty taxing. You can ask my family; I haven’t been much fun to be around the past few days. Fortunately, we’re on the upward slope now: lessons have been learned for phase 2 and the prototype cluster is basically working at this point.

It occurs to me that many projects I’ve been involved with or aware of follow this general pattern. I stopped to think about why that is. For this project, the business requirements weren’t incomplete, although that is often an issue for projects. The technical requirements were missing, though. As it turns out, a few extra DLLs (yes, this is a Windows execution environment) were needed. The software used for this project is pretty awful, and so the error messages weren’t particularly helpful. It took the better part of a day just to leap that particular hurdle.

Another contributing factor was not knowing where we were starting from. (As the old saying goes, “I don’t know where we are, but we’re making good time!”) I thought we were basically cloning a similar project we had done for another customer, and started my work based on that assumption. As it turns out, the environment we were starting from was not nearly as robust and automated as I had been led to believe. The result was two days of effort getting to where I thought we would be after a couple of hours. This wasn’t entirely bad. It gave me an opportunity to work with parts of our product stack that I haven’t worked with much, and it pointed out a lot of areas where future such projects could be done much better. But it was also a great way to add to the already high levels of stress.

Even in a small company, nobody is familiar with everything. Time spent re-explaining what I was trying to do when I had someone new helping me didn’t move the project forward. Likewise, having to hop from person to person in order to get the right expertise was a barrier to progress. I was fortunate that all of my coworkers were extremely willing to help, and I certainly can’t blame them for not knowing everything. Nonetheless, it was frustrating at times.

Certainly the long downward slide that characterizes most of the life of a project can be mitigated. Starting with a more complete understanding of requirements, knowing what you’re starting from, and having the right expertise available can reduce the surprises, frustration, and sadness of a project. Still, I think it’s human nature to experience this to some degree. We always start out confident in our abilities and blissfully ignorant of the traps that are waiting for us.

Amazon VPC: A great gotcha

If you’re not familiar with the Amazon Web Services offerings, one feature is the Virtual Private Cloud (VPC). VPC is effectively a way of walling yourself off from all or part of the world. If you’re running a public-facing web server, it might not be so important. If you’re running a compute cluster, it’s a no-brainer. Just be careful about that “no-brainer” part.

While working on a new cluster for a customer today, I was trying to figure out why the HTCondor scheduler wasn’t showing up to the collector. The daemons were all running. HTCondor security policies weren’t getting in the way. I could use condor_config_val from each host to query the other host. I brought in a colleague to double-check me. He couldn’t figure it out either.

After beating our heads against the wall for a while, and finding what seemed like absolutely nothing helpful in the logs, I finally noticed one tiny detail. The schedd kept saying it was updating the collector, but the collector never seemed to notice. The schedd kept saying it was updating the collector via UDP. How many times had I watched that line go by?

The last time, though, it clicked. And it clicked hard. I had set up a security group to allow all traffic within the VPC. Except I had set it for all TCP traffic, so the UDP packets were being silently dropped. As UDP packets are wont to do. When I changed the security group rule from TCP to all protocols, the scheduler magically appeared in the pool.
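The mistake is easy to see in a toy model. The Python sketch below is not AWS code and makes no API calls; it imitates how an ingress rule is matched against a packet, using the `IpProtocol`, `FromPort`, and `ToPort` field names from the EC2 API, where a protocol of `"-1"` means all protocols. The `rule_allows` and `is_allowed` helpers are hypothetical, source addresses are ignored (everything is assumed to come from inside the VPC), and port 9618 assumes the collector is on HTCondor's default port.

```python
def rule_allows(rule: dict, protocol: str, port: int) -> bool:
    """Return True if a single ingress rule matches the packet."""
    rule_protocol = rule["IpProtocol"]
    if rule_protocol == "-1":
        return True  # "-1" means all protocols, so ports are not checked
    if rule_protocol != protocol:
        return False  # a TCP rule never matches a UDP packet
    return rule["FromPort"] <= port <= rule["ToPort"]


def is_allowed(rules: list[dict], protocol: str, port: int) -> bool:
    """Security groups are allow-lists: traffic passes if any rule matches."""
    return any(rule_allows(rule, protocol, port) for rule in rules)


# The rule I thought meant "all traffic" versus the one that actually does.
tcp_only = [{"IpProtocol": "tcp", "FromPort": 0, "ToPort": 65535}]
all_protocols = [{"IpProtocol": "-1"}]

COLLECTOR_PORT = 9618  # assumption: HTCondor's default collector port

print("TCP-only rule, UDP update allowed:", is_allowed(tcp_only, "udp", COLLECTOR_PORT))
print("All-protocols rule, UDP update allowed:", is_allowed(all_protocols, "udp", COLLECTOR_PORT))
```

With the TCP-only rule, TCP connections to the collector match while the schedd's UDP updates match nothing and are dropped without any error reaching the sender.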

Once again, the moral of the story is: don’t be stupid.