To automate or not to automate?

According to HackTux.com, the eighth sysadmin commandment is:

VIII. Thou shalt automate

Do more, faster by automating tasks. You are the puppet master, do not waste your time with menial work. Work smarter, not harder.

It makes sense to automate tasks. Automation reduces the opportunities for human error and makes processes repeatable. These are good things, but as with any ideological absolute, there are exceptions. One issue I’m currently wrestling with is whether the builds of internal documentation at work should be automated. There are compelling arguments for both choices, and it’s not clear to me which one wins out.

Since I’m the one leading the documentation charge, I got to pick the technology. My work in the Fedora Documentation group has made me familiar with DocBook XML and Publican, so that’s what I chose. The combination of the two allows for a single source (that can be edited in any editor and neatly managed with your version control system of choice) that can be built into HTML, PDF, or EPUB. While I wouldn’t call Publican “fragile”, it can be a bit verbose in its output. Some of that output is helpful and some can be safely ignored, and I’ve not yet come up with an easy way to programmatically distinguish the two. More importantly, the output can give a clear indication that the syntax is invalid (although it’s not always easy to figure out what the precise problem is).

The arguments in favor of automated builds are championed by one of the group’s managers. His concern is that even a well-documented build process may be a deterrent to some group members. Automating the builds removes the opportunity to make mistakes and makes the process simpler. On the other hand, it presents additional moving parts which may break in new and interesting ways.

There are two ways to handle the automated builds. First, every time a document is updated in the version control tree, a post-commit hook builds the new version. This can provide nearly instant feedback on whether the document is syntactically valid and has the correct layout and content. On the other hand, it may now require multiple commits to finally get the document correct. The writer could, of course, run Publican locally before committing the changes, but if they’re going to do that, why have an automated process repeat the work? This method also contradicts the “commit early, commit often” philosophy that I embrace.
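As a rough sketch of the commit-triggered approach, a post-commit hook could call something like the script below. The Publican arguments shown are an assumption about a typical invocation, and the function names are made up for illustration. How the hook gets wired into your version control system is left out.

```python
import subprocess
import sys

# Assumed Publican invocation; adjust the formats and language for your book.
PUBLICAN_CMD = ["publican", "build", "--formats=html", "--langs=en-US"]


def build_document(doc_dir, command=PUBLICAN_CMD):
    """Run the build command in doc_dir; return (succeeded, combined output)."""
    try:
        result = subprocess.run(
            command,
            cwd=doc_dir,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep warnings and errors in one stream
            text=True,
        )
    except FileNotFoundError as exc:
        # The build tool (or doc_dir) does not exist: treat it as a failed build.
        return False, str(exc)
    return result.returncode == 0, result.stdout


def post_commit(doc_dir, command=PUBLICAN_CMD):
    """Hypothetical hook entry point: 0 if the build passed, 1 otherwise."""
    ok, output = build_document(doc_dir, command)
    if ok:
        return 0
    print("Build failed after commit:\n" + output, file=sys.stderr)
    return 1
```

A hook script would end with something like `sys.exit(post_commit("."))`. Note that the commit has already landed by the time this runs, so a failure only reports the problem; it does not prevent it.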

Second, instead of triggering builds on commits, the build can run on a schedule. This prevents multiple concurrent build attempts due to rapid commits, at the expense of losing quick feedback. In addition, it can muddy the “who broke this doc?” waters. If there are multiple commits between builds, it is potentially more difficult to hunt down the offending error (or even worse, errors). It also absolves writers of responsibility: since the entire group must be notified that the build failed, everyone will assume someone else will fix it. As a result, the document doesn’t get fixed.
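To make the “who broke this doc?” problem concrete, here is a minimal sketch of how a scheduled job might narrow down the culprit after a failed build. It assumes a hypothetical `builds_ok` callable that checks out a given commit and reports whether it builds. That part depends on your version control system and is not shown.

```python
def breaking_commits(commits, builds_ok):
    """Return the commits at which the build went from passing to failing.

    commits   -- commit identifiers since the last good build, oldest first
    builds_ok -- hypothetical callable: commit id -> True if that commit builds
    """
    culprits = []
    previous_ok = True  # the last scheduled build is assumed to have passed
    for commit in commits:
        ok = builds_ok(commit)
        if previous_ok and not ok:
            culprits.append(commit)
        previous_ok = ok
    return culprits
```

This costs one build per commit, and it still misses a second error introduced while the build was already broken, which is exactly the situation that makes scheduled builds harder to debug.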

I’m sympathetic to the arguments for automated builds. On the other hand, the Fedora Docs are manually built and published (granted, there are other considerations like release cycles and translation). Having to run and correct builds by hand helps train writers to write valid syntax the first time through, and it allows documents to be updated when they’re ready while still giving writers the ability to commit works in progress. Useful documents are almost never write-once, read-many. Even the U.S. Constitution has had 27 patches accepted. Still, for the kinds of documentation intended (e.g., design documents), once a document is initially written, it shouldn’t change often. This suggests that designing an automated build system is over-engineering, especially considering the current staff shortage.

My current inclination is to stick with a manual build process, even if it is heresy. Fortunately, this may be a solved problem in the not-so-distant future. Red Hat’s Joshua Wulf (and others) have been developing a live DocBook editor that runs entirely in the web browser. The idea is to lower the barrier to entry for Fedora contributors. This is still in the proof-of-concept stage, but the initial results are promising. In the meantime, I’ll work on getting the content seeded. If someone else in the group wants to automate it, I won’t stand in the way.

4 thoughts on “To automate or not to automate?”

  1. Good topic, and definitely not an easy answer. I’m interested in learning more about these kinds of things. I looked into DocBook a while back, but never moved forward with it. What resource would you suggest for learning it?

    Also, about the problem at hand, is there any way you could build a pre-commit hook that would try to compile the document, then fail the commit if it doesn’t complete successfully?

  2. Matt, the easiest way would be to contribute to Fedora Docs. 🙂 But seriously, the Fedora Documentation Guide and DocBook 5: The Definitive Guide are great resources.

    As for a pre-commit hook, that’s possible. One issue is that it takes some time to build a guide depending on the size and options. The other issue is that pre-commit hooks can be a bit ugly since you have to figure out what non-edited sources need to be copied into the temporary build directory. Also, pre-commit hooks don’t allow for the writer to see the results of a syntactically correct build to make sure it looks right. It is one possible solution, though.

  3. Everybody needs an editor.
    Even — or especially — documentation assembled automatically.
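
The pre-commit idea from the comments above could be sketched roughly as follows. This is only an illustration: the function name is hypothetical, the build command is passed in by the caller, and the whole source directory is copied so that unedited files are present in the scratch build.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def build_in_scratch_copy(source_dir, command):
    """Copy source_dir to a temporary directory, build there, return True on success."""
    with tempfile.TemporaryDirectory() as scratch:
        workdir = Path(scratch) / "book"
        # Copy everything, edited or not, so the build sees a complete tree.
        shutil.copytree(source_dir, workdir)
        result = subprocess.run(command, cwd=workdir)
        return result.returncode == 0
```

A hook would reject the commit when this returns False. It does nothing about the two other concerns raised above: the build still takes time, and the scratch output is discarded, so the writer never sees how the document looks.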
