systemd and SysV init scripts

Chris Siebenmann wrote earlier this week about how systemd’s support of System V init scripts results in unexpected and undesired behavior. Some init scripts include dependency information, which is an LSB standard that SysV init ignores. The end result is that scripts which have incomplete dependencies specified end up getting started too soon by systemd.

I commented that it’s unreasonable to hold systemd responsible for what is effectively a bug in the init script. If you’re going to provide information, provide complete information or don’t expect things to work right. Chris reasonably replied that many of the scripts that include such information are the result of what he called programming through “superstition and mythology.” Others may use the term “cargo cult programming.” Code reuse has both positive and negative aspects, and the slow spread of bad practices via copy/paste is clearly a negative in this case.

I understand that, and Chris makes a valid point. It’s neither realistic nor reasonable to expect everyone to study the specifications for everything they come across. Init scripts, due to their deceptive simplicity, are excellent candidates for “I’ll just copy what worked for me (or someone else), without checking to see if I’m doing something wrong.” But in my opinion, that doesn’t absolve the person who wrote the script from their responsibility if it breaks down the road.

To the user, of course, who is responsible is immaterial. I wholeheartedly agree that breaking things is bad, but avoiding the breakage needs to be the responsibility of the right people. It’s not reasonable for the systemd developers to test every init script out there in every possible combination in order to hit the condition Chris described.

As I see it, there were three options the systemd developers could have taken:

  1. No support for SysV init scripts
  2. Ignore dependency information in SysV init scripts
  3. Use the dependency information in SysV init scripts

Option 1 is clearly a non-starter. systemd adoption would probably never have occurred outside of a few niche cases (e.g. single-purpose appliances) without supporting init scripts. The more vehement systemd detractors would prefer this option, but it would be self-defeating for the developers to choose it.

Option 2 is what Chris would have preferred. He correctly notes that it would have kept the init scripts with incomplete dependencies from starting too soon.

Option 3 is what the developers chose to implement. While it does, in effect, change the behavior of some init scripts in some conditions, it also allows systemd to properly order SysV init scripts with services defined by .service files.
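To make the dependency information concrete, here is a rough Python sketch that pulls the LSB comment block out of an init script. The script text, the `exampled` service name, and the `parse_lsb_header` helper are all made up for illustration; this is not how systemd itself reads the header, just a picture of what an init system honoring it has to work with.

```python
import re

# A made-up init script header, for illustration only.
EXAMPLE_SCRIPT = """\
#!/bin/sh
### BEGIN INIT INFO
# Provides:          exampled
# Required-Start:    $network $remote_fs
# Required-Stop:     $network $remote_fs
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Hypothetical example daemon
### END INIT INFO

echo "starting exampled"
"""


def parse_lsb_header(script_text):
    """Return the LSB comment block as a dict of keyword -> list of words."""
    info = {}
    in_block = False
    for line in script_text.splitlines():
        if line.startswith("### BEGIN INIT INFO"):
            in_block = True
            continue
        if line.startswith("### END INIT INFO"):
            break
        if not in_block:
            continue
        # Header lines look like "# Keyword:   value value ..."
        match = re.match(r"#\s*([A-Za-z-]+):\s*(.*)$", line)
        if match:
            info[match.group(1)] = match.group(2).split()
    return info


header = parse_lsb_header(EXAMPLE_SCRIPT)
print(header.get("Required-Start", []))
```

If a script like this actually needs some other service but leaves it out of `Required-Start`, an init system that trusts the header has no reason to hold the script back, which is the failure mode Chris described.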

The developers clearly considered the effects of respecting dependency information in init scripts and decided that the ability to order services and build a better dependency graph was more important than preventing certain init scripts from starting too soon under certain conditions. Was that the right decision?

Chris and others think it is not. The developers think it is. I’m a little on the fence to the extent that I don’t know which I’d choose were the decision up to me. If we had a real sense of how many SysV init scripts end up hitting this condition, that would help inform the decision. However, the developers chose option 3, and it’s hard for me to argue against that. Yes, it’s a change in behavior, and perhaps it’s “robot behavior”, but I have a hard time getting too mad at a computer for doing what I told it to do.

Unlimited vacation policies, burnout, etc.

Recently, my company switched from a traditional vacation model to a minimum vacation model. If you’re unfamiliar with the term, it’s essentially the unlimited vacation model practiced by Netflix and others, with the additional requirement of taking a defined minimum of time off each year. It’s been a bit of an adjustment for me, since I’m used to the traditional model. Vacation was something to be carefully rationed (although at my previous employer, I tended to over-ration). Now it’s simply a matter of making sure my work is getting done and that there’s someone to cover for me when I’m out.

I’m writing this at 41,000 feet on my way to present at a conference [ed note: it is being published the day after it was written]. I’m secretly glad that the WiFi apparently does not work over the open ocean (I presume due to political/regulatory reasons). Now, don’t get me wrong, one of my favorite things to do when I fly is to watch myself on FlightAware, but in this case it’s a blessing to be disconnected. If a WiFi connection were available, it would be much harder to avoid checking my work email.

It took me a year and a half at my job before I convinced myself to turn off email sync after hours. Even though I rarely worked on emails that came in after hours, I felt like it was important that I know what was going on. After several weekends of work due to various projects, I’d had enough. The mental strain became too much. At first, I’d still manually check my mail a time or two, but now I don’t even do that much.

This is due in part to the fact that the main project that was keeping me busy has had most of the kinks worked out and is working pretty well. It also helps that there’s another vendor managing the operations, so I only get brought in when there’s an issue with software we support. Still, there are several customers where I’m the main point of contact, and the idea of being away for a week fills me with a sense of “oh god, what will I come back to on Monday?”

I’ve written before about burnout, but I thought it might be time to revisit the topic. When I wrote previously, I was outgrowing my first professional role. In the years since, burnout has taken a new form for me. Since I wrote the last post, two kids have come into my life. In addition, I’ve gone from a slow-paced academic environment to a small private sector company which claims several Fortune 100 companies as clients. Life is different now, and my perception of burnout has changed.

I don’t necessarily mind working long hours on interesting problems. There are still days when it’s hard to put my pencil down and go home (metaphorically, since I work from a spare bedroom in our house). But now that I have kids, I’ve come to realize that when I used to feel burnt out, I was really feeling bored. Burnout is more represented by the impact on my family life.

I know I need to take time off, even if it’s just to sit around the house with my family. It’s just hard to do knowing that I’m the first — and sometimes last — line of support. But I’m adjusting (slowly), and I’m part of a great team, so that helps. Maybe one of these days, I’ll be able to check my email at the beginning of the work day without bracing myself.

Lessons and perspective from fast food

I started working in fast food when I turned 16 and needed to fund my habit of aimlessly driving around the backroads. I ended up spending five and a half years in the industry (the last few of which were only when I was home from college), advancing from a meat patty thermodynamic technician to crew trainer to floor supervisor. In the end, I was effectively a shift manager. I supervised the store’s training and safety programs.

It should be no surprise that I learned a lot about myself and about leadership during that time. Even a decade on, stories from my fast food days come up in job interviews and similar conversations. As a result of my time in fast food, I’ve learned to be very patient and forgiving with those who work in the service sector. I’ve also learned some lessons that are more generally applicable.

Problems are often not the fault of the person you’re talking to. This is a lesson for service customers more than service providers, but most providers are customers too. In fast food, when your order is wrong it’s sometimes the fault of the counter staff putting the wrong sandwich in the bag, but sometimes it’s the fault of the grill staff for making it wrong. In any case, it’s rarely the result of malice or even incompetence. People, especially overworked and under-engaged people, will make mistakes. This holds true for customer service staff at most companies, particularly large ones where the tier 1 staff are temps or outsourced. (That doesn’t imply that actual malice or incompetence should be acceptable.)

Sometimes the best way to deal with a poor performer is to give them responsibility. One story I’m fond of telling involves a crew member who was, in many ways, the stereotypical teenage fast food worker. He was smart, but lazy, and didn’t much care for authority. The fact that I was only a year older than him made it particularly hard for me to give him orders. He’d been working for a few months and was good at it when he applied himself, so the trick was to get him to do that. After a little bit of fumbling around, I found the trick. I started spending more time away from the grill area and more time up front, and I made it clear that he was in charge of the grill. I gave him some autonomy, but I also held him accountable. Lo and behold, his behavior improved. He also started taking opportunities to teach new employees the tricks of the trade. He could have let the authority go to his head, but instead he acted like an adult because he was being treated like an adult.

Don’t nitpick behavior, but don’t put up with crap. The standing philosophy on my shifts was “don’t make life hard for me and I won’t make life hard for you.” I didn’t like bossing people around (in part because they were either old enough to be my grandmother or because they were 16 and all the bossing in the world wasn’t going to have any effect). If people were doing their jobs well, I put up with some good-natured shenanigans (the exceptions being food safety, physical safety, and customer satisfaction). One time a fellow supervisor had written up a crew member for a really trivial infraction. I went to her file later and tore it up. This person was a good worker and harping on her for minor issues was only going to drive her away. By the same token, I came in from taking the garbage out one night to find two normally good workers having a fight with the sink sprayer. They were both sent home right away (granted one of them was off in 10 minutes anyway).

Spread the unpleasant work around, and be willing to do some yourself. Some of the managers and supervisors liked to make the same people do the unpleasant tasks all the time. I hated that approach because it just seemed to reinforce bad behavior. The people who made my life unpleasant certainly came up in the rotation more often, but everyone had to clean restrooms, empty grease traps, etc. Including me. I didn’t try to make a show of it, but I wanted the people who I worked with to see that I wasn’t using my position to avoid doing things I didn’t want to do. And if I’m being completely honest, there was something I enjoyed about emptying the grease traps (except when I’d lose my grip and pour them into my shoe).

Don’t be a jerk. I won’t delude myself into thinking I was universally loved. I know I wasn’t, and I’m okay with that. But for the most part, I think I was pretty well-liked among the crew and management because I wasn’t a jerk. I took my job seriously (perhaps too seriously at times), but I was flexible enough to try to work with people the way they needed to be worked with. I tried to make work fun and cooperative, because working in fast food sucks and anything that can make it less sucky benefits workers and customers alike.

 

Coach of the Year?

On Monday night, the Big Ten conference announced the annual postseason honors for men’s basketball. It may come as no surprise that the Coach of the Year winners finished first (Bo Ryan in the coaches’ voting) and second (Mark Turgeon in the media voting) in the conference. Winning basketball games is a good way to get recognized for your coaching prowess. Some Purdue fans were upset that Matt Painter did not get recognized in light of the turnaround that his Boilermakers showed from the end of non-conference play.

After all, this is a team that finished last in the conference last year and was projected to finish toward the bottom again this year. Instead, after a string of embarrassing losses in December made the idea of an NCAA tournament bid nearly a pipe dream, the team got it together and finished in a tie for third place. Surely that is a sign of an excellent coaching job, right?

It is, but there’s a catch. Sure, Painter’s team exceeded expectations, but the expectations became low on his watch. It’s not like Painter inherited a depleted roster this year, as he did when he took over for Gene Keady a decade ago. Since walking into a 9-win season his first year, he had several teams that competed for a regular season title, one Big Ten Tournament winner, and two Sweet 16 appearances. The momentum was there, and for whatever reason (I’m inclined to say recruiting the wrong players for his system, among other reasons), the team slipped. Recruiting is part of the job, so why not reward a coach for having a roster talented enough to win the conference?

I understand that reasoning, but I’m not sure I agree with it. After all, the award is Coach of the Year, not Coach of the Years. The fact that Matt Painter wasn’t doing his job well enough for a few years shouldn’t handicap him now. But they don’t give me a ballot, and I can’t help but think those who reflexively vote for the top-performing teams regardless of expectations make a reasonable argument. After all, there are some coaches who can make a lot of the talent they have (Tim Miles last year), there are some coaches who can bring in talent but underperform (Tom Crean), but the successful coaches in the Big Ten are the ones who can get talented players and make the most of them (Ryan, Tom Izzo, Thad Matta). Matt Painter has shown the ability to do both, but not necessarily at the same time. If he can get both parts of the coaching duties in line, he’ll have plenty of awards to put on his trophy shelf.

Hillary Clinton’s email

Ed. note: I generally avoid politics on Blog Fiasco, and this post is about the technical angle more than any political implications. The comments section will be moderated accordingly.

It seems that when Hillary Clinton was secretary of state, she used a personal email address instead of a .gov. While this was apparently legal at the time she held that office, it’s not good. For one, using the same account for both personal and business email is a bad idea. (Full disclosure: I used to do that. More on that later.) If email records for one need to be obtained (e.g. as evidence in a court action), messages for the other will at least be examined and may find themselves caught up in the same net. It also makes it harder to stop being at work when you’re not at work. (Of course, high-ranking government officials rarely get such opportunity.)

For public office holders, there’s an additional dimension. Public records are an important part of our history and our democratic and legal processes. When records such as emails are outside the purview of the appropriate custodians, it’s much easier for them to become intentionally or accidentally lost.

Again returning to the general case, every such incident is an indictment of the services the business offers. My first foray into using an external email service was in my first job. I had a lot of automated and non-automated email and the University’s mail system didn’t offer me much room. I created a dedicated GMail account to serve as a storage and search facility for old messages.

In a later role, I had roughly 50x as many machines, plus commit messages from our Subversion server, Nagios alerts, and the like. So, like many of my colleagues, I forwarded my work email to my personal GMail account in order to take advantage of the powerful filtering.

I didn’t have to deal with sensitive information as part of my daily work, but others on campus did. There were frequent discussions in various circles expressing concern about allowing users to forward email to arbitrary, potentially poorly-maintained mail servers (this concern never went so far as to actually disable the ability to set a forwarding address).

In most cases, people who forwarded their email off campus were doing so to overcome a perceived pain of using the campus systems. I don’t know why Hillary Clinton used a private email address, but I’m sure it was due to some perceived shortcoming (technical, usability, or political) in what the State Department made available. When we discover shadow IT systems, the best response is to figure out what drove the user away from our service in the first place.

Being a good troubleshooter

Not too long ago, my friend Andy said “I think that’s how i convince so many people I’m decent at being a sysadmin, I just look at logs.” It was a statement that really struck a chord with me. On the one hand, I feel like I’ve had a pretty successful career so far, and if I haven’t been excellent, I’ve at least been sufficiently competent. On the other hand, I don’t feel like I have a lot of the experience that “everyone” has. I’ve never run email servers, I’ve only done trivial DNS and DHCP setups, and so on.

In my current job, I work with some really smart people, few of whom have much if any sysadmin experience. Andy and I, because of our sysadmin background, have become the go-to guys for sysadmin questions. It’s a role I enjoy, particularly when I’m able to solve a problem. Sometimes it’s because I have direct experience with the issue, but more often than not, it’s because I know how to poke around until I find the answer.

There are many skills required for successful systems administration. Troubleshooting is high on the list. The ability to troubleshoot new and unusual problems is particularly helpful. I’ve found my own troubleshooting abilities to depend on the ability to build off of previous (if unrelated) experience and being able to piece together disjointed clues. It’s sort of like being a detective and also a human “big data” system.

Fishing for clues in log files is great, too, if you can find the needle in the haystack. But what seems to separate the good troubleshooters from the bad that I’ve known is intellectual curiosity. Not just asking what happened, but finding out why. Being willing to ask questions and learn about unfamiliar areas instead of immediately deferring to those more knowledgeable builds a strong skill base.

Of course, without Google, we’d all be out of luck.

Oh no, I can’t scan!

Printing is evil. Less evil but related is scanning. I don’t have to scan documents very often, but I needed to a few weeks ago. I fired up XSane and it immediately quit because it couldn’t find any scanners. I have an all-in-one, and the printing worked, so I felt like it should probably be able to find the scanner. Running `lsusb` showed the device.

“Maybe it’s a problem with XSane”, I said to myself. I tried the `scanimage` command. Still no luck. Then I ran scanimage as root. Suddenly, I could scan. My device was on /dev/bus/usb/001/008, so I took a look at the permissions. Apparently the device was owned by root:lp, and the permissions were 664. I wasn’t in the lp group, so I couldn’t use the scanner. So I added myself to the group.
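For the curious, the permission check that bit me can be sketched in a few lines of Python. This is a simplified model, not what the kernel or SANE actually runs: it assumes scanning needs both read and write access to the device node, it ignores ACLs and capabilities, and the numeric user and group IDs are invented for the example. The `can_read_write` helper is likewise hypothetical.

```python
import stat


def can_read_write(mode, owner_uid, owner_gid, uid, gids):
    """Apply the classic owner/group/other check to one set of permission bits."""
    if uid == 0:
        return True  # root bypasses the permission bits
    if uid == owner_uid:
        needed = stat.S_IRUSR | stat.S_IWUSR
    elif owner_gid in gids:
        needed = stat.S_IRGRP | stat.S_IWGRP
    else:
        needed = stat.S_IROTH | stat.S_IWOTH
    return (mode & needed) == needed


# Mode 664 owned by root:lp, as in the post. The numeric IDs are made up.
ROOT_UID, LP_GID, MY_UID, MY_GID = 0, 7, 1000, 1000

# Before: my user is only in its own group, so the "other" bits (r--) apply.
print(can_read_write(0o664, ROOT_UID, LP_GID, MY_UID, {MY_GID}))

# After: my user is also in lp, so the group bits (rw-) apply.
print(can_read_write(0o664, ROOT_UID, LP_GID, MY_UID, {MY_GID, LP_GID}))
```

One practical note: a new group membership generally only applies to sessions started after the change, so you may need to log out and back in before the scanner shows up.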

I’m assuming something changed when I upgraded to Fedora 21, since it had worked previously. Perhaps a udev rule is different? In any case, if you happen upon the same problem, enjoy this shortcut answer so that you don’t spend an hour trying to debug it.

Simplifying winter weather products

Decades after the National Weather Service began issuing watches and warnings, many members of the public don’t know what the difference is. When you throw in different products, the confusion only mounts. Too often, the products are based on meteorological distinctions that don’t necessarily mean much to the public. Take, for example, a nor’easter that struck New England in December. Or the confusion around the landfall of Sandy, which became extratropical shortly before landfall.

Some products you might see in a winter event include blizzard, winter storm, high wind, wind chill, ice storm, lake effect snow, and freezing rain. Plus flood products and special weather statements. How should the public try to understand these differences?

In general, I’m a proponent of getting the important information to the consumer as quickly as possible with minimal effort required. This case is an exception. Trying to cram the important information into the headline leads to public confusion and forces forecasters to spend time trying to decide which of a handful of products is correct instead of focusing on communicating impact.

I’m in favor of a smaller set of products, with specific impacts delineated in the text. A “winter storm” and “blizzard” product with watch, warning, and advisory (maybe) levels would go a long way toward making the products more clear to the public. Everyone could spend less time thinking about the differences between the products and more time focusing on the impacts and preparedness.

If you’re interested in the official specification for the current suite of winter products, see http://www.nws.noaa.gov/directives/sym/pd01005013curr.pdf

Self-promotion is hard

I’ve never really written this blog with the intent of gaining a massive audience. Mostly, I don’t even notice if people read it or not. As much as anything, I write it as a way to document things for my own purpose or to express opinions that could never fit in a tweet. This is made plainly obvious by the fact that I almost never promote my posts. It’s fairly rare anymore that I even share them on my own social media feeds. I certainly don’t submit them to sites like Reddit.

This is partly because it feels like blog spam, and partly because I generally don’t think very highly of the posts I write. It’s way more rewarding when someone else sees a post and says “hey, that’s worth sharing.” Like my post about the PR blunder made by the elementary team. My friend Andy thought it was well-written and submitted it to /r/Linux. A little while later, it was at the top of that subreddit. Holy crap, I’m a real thought leader now!

[Figure: screen capture of a Blog Fiasco post at the top of /r/Linux]

As of this writing, Andy has gained 622 link karma from the post. Karma that could have been mine, but I couldn’t bring myself to post my own blog. (Of course, I pretty rarely submit links anyway. My second-most karmatic submit is a Blog Fiasco post from three years ago. That’s the last time I self-submitted.)

Sometimes I think this pseudo-humility holds me back. I could probably gain more recognition in the field and achieve my dream of being a preeminent thought leader if I would just be more willing to force my writing on people. But that seems like a jerk move. So for the time being, I’ll just sit here and write in relative obscurity. If I get noticed and made into a star, that’s great. If not, well at least I’ll always have post 1652.

What’s in a (version) number?

Last week, Linus Torvalds posted a poll on Google+ asking people if the Linux kernel should continue with 3.x release numbers or if it was time for 4.0. I seem to recall Linus saying something to the effect of “it’s just a number” when Linux finally went from 2.6 to 3.0. Of course, it’s not just a number. Versions convey some kind of meaning, although the meaning isn’t always clear.

In general, I’m a fan of following the Semantic Versioning specification. In the projects I work on, both personally and professionally, I use something close to Semantic Versioning. I actually commented to coworkers the other day that I thought we had really screwed up on some product version numbers for not incrementing the major version when we made breaking changes.
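As a quick illustration of the rule we broke, here is a minimal Python sketch of how a Semantic Versioning bump is supposed to work. The `bump` function is a hypothetical helper of my own, and it only handles plain `MAJOR.MINOR.PATCH` strings; the full specification also covers pre-release and build metadata, which this ignores.

```python
def bump(version, change):
    """Return the next version for a 'major', 'minor', or 'patch' change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":   # incompatible (breaking) change
        return f"{major + 1}.0.0"
    if change == "minor":   # backward-compatible new functionality
        return f"{major}.{minor + 1}.0"
    if change == "patch":   # backward-compatible bug fix
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change}")


print(bump("1.4.2", "patch"))  # 1.4.3
print(bump("1.4.2", "minor"))  # 1.5.0
print(bump("1.4.2", "major"))  # 2.0.0
```

The point is that a breaking change always resets the lower numbers and moves the first one, so a reader can tell from the version alone whether an upgrade is likely to hurt.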

But that’s the limitation of Semantic Versioning, too. Some projects do an excellent job of maintaining compatibility (for example, you can use some pretty old HTCondor versions with the new hotness and things generally work). Do they stick with the same major version forever and end up with very large minor versions? At some point, then, the major version becomes pretty useless.

A colleague remarked that in some cases, the patch number has become almost irrelevant. Back when even the smallest of releases meant mailing floppy disks or waiting an eternity for an FTP download, even patch releases were a big deal. In the modern era of broadband and web-based applications, version numbers themselves start to mean a lot less. If your browser auto-updates, do you even care what the version number is? Does anyone (including Facebook developers) know what version of Facebook you’re using?

Of course, not everything is auto-updating and even fewer things are web or mobile applications. Version numbers will be important in certain areas for the foreseeable future. It’s clear that no versioning scheme will be universally applicable. What’s important is that developers have a versioning scheme, that it’s made clear to users and downstream developers, and that they stick to it. That way, the version numbers mean something.