Blog Fiasco

March 30, 2014

Parsing SGE’s qacct dates

Filed under: HPC/HTC, mac - bcotton @ 9:19 pm

Recently I was trying to reconstruct a customer’s SGE job queue to understand why our cluster autoscaling wasn’t working quite right. The best way I found was to dump the output of qacct and grep for {qsub,start,end}_time. Two things made this unpleasant. First, the output is not de-duplicated on job ID: jobs that span multiple hosts get listed multiple times. Second, the dates are in a nearly-but-not-quite “normal” format, for example “Tue Mar 18 13:00:08 2014”.

What can you do with that? Not a whole lot. It’s not a format that spreadsheets will readily treat as a date, so if you want to do spreadsheety things, you’re forced to either manually enter them or write a shell function to do it for you:

function qacct2excel { echo "=$(date -j -f '%a %b %d %T %Y' "$1" +%s)/(60*60*24)+\"1/1/1970\""; }

The above works on OS X, which ships a BSD (non-GNU) date command. GNU date on Linux doesn’t accept the same -j and -f arguments, so you’ll need a different invocation, which I haven’t bothered to figure out. It’s still not awesome, but it’s slightly less tedious this way. At some point, I might write a parser that does what I want qacct to do, instead of what it does.
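A rough sketch of what such a parser could look like, along with a Linux-friendly version of the date conversion. Everything here is an assumption rather than something from the post: the helper names (qacct2epoch, qacct_dedup) are made up, GNU date is detected by checking whether `date --version` succeeds, and the qacct output is assumed to be one "name value" field per line with records separated by lines of "=" characters.

```shell
# Convert a qacct-style date to epoch seconds.
# Assumption: `date --version` succeeds only for GNU date.
qacct2epoch() {
    if date --version >/dev/null 2>&1; then
        date -d "$1" +%s                        # GNU date (Linux)
    else
        date -j -f '%a %b %d %T %Y' "$1" +%s    # BSD date (OS X)
    fi
}

# Same spreadsheet formula as the original function.
qacct2excel() {
    printf '=%s/(60*60*24)+"1/1/1970"\n' "$(qacct2epoch "$1")"
}

# Print one tab-separated line per job number: id, qsub, start, end.
# Assumption: records are separated by lines of "=" and each field is
# "name<whitespace>value". Array-job tasks share a job number, so only
# the first record seen for each job number is kept.
qacct_dedup() {
    awk '
        function emit() {
            if (job != "" && !(job in seen)) {
                seen[job] = 1
                printf "%s\t%s\t%s\t%s\n", job, t["qsub_time"], t["start_time"], t["end_time"]
            }
            job = ""
            split("", t)    # portable way to empty the array
        }
        /^=+$/ { emit(); next }
        $1 == "jobnumber" { job = $2; next }
        $1 ~ /^(qsub|start|end)_time$/ {
            name = $1
            sub(/^[^ \t]+[ \t]+/, "")   # drop the field name
            sub(/[ \t]+$/, "")          # drop trailing padding
            t[name] = $0
        }
        END { emit() }
    '
}
```

Something like `qacct -j | qacct_dedup` would then give one row per job, and each date column can be passed through qacct2excel. Keying on the job number alone is the simplest choice; if array jobs matter, the key would need to include the task ID as well.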

It’s entirely possible that there’s a better way to do this. The man page didn’t seem to have any helpful suggestions, though. I hate to say “SGE sucks” because I know very little about it. What I do know is that it’s hard to find good material for learning about SGE. At least HTCondor has thorough documentation and tutorials from HTCondor Week posted online. Perhaps one of these days I’ll learn more about SGE so I can determine whether it sucks or not.

January 15, 2013

Deploying Fedora 18 documentation: learning git the hard way

Filed under: Linux, The Internet - bcotton @ 5:49 pm

If you haven’t heard, the Fedora team released Fedora 18 today. It’s the culmination of many months of effort, and some very frustrating schedule delays. I’m sure everyone was relieved to push it out the door, even as some contributors worked to make sure the mirrors were stable and to update translations. I remembered that I had forgotten to push the Fedora 18 versions of the Live Images Guide and the Burning ISOs Guide, so I quickly did that. Then I noticed that several of the documents that were on the site earlier weren’t anymore. Crap.

Here’s how the Fedora Documentation site works: contributors write guides in DocBook XML, build them with a tool called publican, and then check the built documents into a git repository. Once an hour, the web server clones the git repo to update the content on the site. Looking through the commits, it seemed like a few hours prior, someone had published a document without updating their local copy of the web repo first, which blew away previously-published Fedora 18 docs.

The fix seemed simple enough: I’d just revert to a few commits prior and then we could re-publish the most recent updates. So I did a `git reset --hard` and then tried to push. It was suggested that a --force might help, so I did that too. That’s when I learned that this basically sends the local git repo to the remote as if the remote were empty (someone who understands git better would undoubtedly correct this explanation), which makes sense. For many repos, this probably isn’t too big a deal. For the Docs web repo, which contains many images, PDFs, epubs, etc. and is roughly 8 GB on disk, this can be a slow process. On a residential cable internet connection which throttles uploads to about 250 KiB/s after the first minute, it’s a very slow process.

I sent a note to the docs mailing list letting people know I was cleaning up the repo and that they shouldn’t push any docs to the web. After an hour or so, the push finally finished. It was…a failure? Someone hadn’t seen my email and pushed a new guide shortly after I had started the push-of-doom. Fortunately I discovered the git revert command in the meantime. Instead of pretending the past never happened, revert adds new commits that undo the earlier commit(s), so no forced push is needed. After reverting four commits and pushing, we were back to where we were when life was happy. It was simple to re-publish the docs after that, and a reminder was sent to the group to ensure the repo is up-to-date before pushing.
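To make the difference concrete, here is a small demonstration of the revert approach in a throwaway repository. The file name, commit messages, and identity settings are invented for illustration; the only point is that a range of bad commits can be backed out with new commits, leaving history intact so an ordinary push works.

```shell
# Build a scratch repository (hypothetical contents throughout).
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo "good content" > guide.html
git add guide.html
git commit -q -m "Publish guide"

# Two commits that clobber the published guide.
echo "clobbered" > guide.html
git commit -q -am "Bad publish 1"
echo "clobbered again" > guide.html
git commit -q -am "Bad publish 2"

# Back out both bad commits, newest first, one revert commit each.
# Nothing is rewritten, so a plain `git push` would be enough afterwards.
git revert --no-edit HEAD~2..HEAD

cat guide.html
git log --oneline
```

By contrast, `git reset --hard` moves the branch backwards and discards the later commits locally, which is why the remote refuses the push unless it is forced.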

The final result is that some documents were unavailable for a few hours. The good news is that I learned a little bit more about git today. The better news is that this should serve as additional motivation to move to Publican 3, which will allow us to publish guides via RPMs instead of an unwieldy git repo.

October 2, 2012

Proposed change to Fedora docs schedule

Filed under: Linux - bcotton @ 7:59 pm

Even in the most ideal situations, project management can be a challenge. In a fast-moving, community-driven, highly-visible project like Fedora, it can be near-madness. Although the Docs group tries to keep to the schedule when producing guides, the unfortunate reality is that guides rarely meet the scheduled milestones. I’m a firm believer in the idea that if the schedule is never met, it’s the schedule that’s wrong, not the contributors.

With that in mind, it was proposed in this week’s Docs meeting that the branching of guides be 4 weeks prior to the final release date, instead of shortly after the Alpha release. This buys guide writers more time to incorporate the changes a new release brings. Unfortunately, it puts an extra crunch on translators since they then only have four weeks to update translations if guides are to ship in concert with the OS.

I expect that it doesn’t change much in practice. Since the guides don’t come in on time anyway, it does little harm to have the schedule reflect that fact. I hope. I posted a message to the Docs and Translation teams earlier this week requesting comment. Modified schedule proposals are available at http://bcotton.fedorapeople.org/docs_schedule/.

May 23, 2012

To automate or not to automate?

Filed under: Linux, Musings - bcotton @ 6:25 am

According to HackTux.com, the eighth sysadmin commandment is

VIII. Thou shalt automate

Do more, faster by automating tasks. You are the puppet master, do not waste your time with menial work. Work smarter, not harder.

It makes sense to automate tasks. Automation reduces the number of opportunities for human error and it makes processes repeatable. These are good things, but like any ideological absolute there are exceptions. One issue I’m currently wrestling with is whether or not the builds of internal documentation at work should be automated. There are compelling arguments for both choices, and it’s not clear to me which one wins out.

Since I’m the one who is leading the documentation charge, I got to pick the technology. My work in the Fedora Documentation group has made me familiar with DocBook XML and Publican, so that’s what I chose. The combination of the two allows for a single source (that can be edited in any editor and neatly managed with your version control system of choice) that can be built into HTML, PDF, or epub. While I wouldn’t call publican “fragile”, it can be a bit verbose in its output. Sometimes the output can be helpful, and sometimes it can be easily ignored, and I’ve not yet come up with an easy way to programmatically distinguish the two. More importantly, the output can give a clear indication that the syntax is invalid (although it’s not always easy to figure out what the precise problem is).

The arguments in favor of automated builds are championed by one of the group’s managers. His concern is that even a well-documented build process may be a deterrent to some group members. Automating the builds removes the opportunity to make mistakes and makes the process simpler. On the other hand, it presents additional moving parts which may break in new and interesting ways.

There are two ways to handle the automated builds. First, every time a document is updated in the version control tree, a post-commit hook builds the new version. This has the ability to provide nearly-instant feedback on whether or not the document is semantically valid and has the correct layout and content. On the other hand, it may now require multiple commits to finally get the document correct. The writer could, of course, run publican locally before committing the changes, but if they’re going to do that, why have an automated process repeat the work? This method also contradicts the “commit early, commit often” philosophy that I embrace.

Instead of triggering builds based on commits, the build can be done on a schedule. This prevents multiple concurrent build attempts due to rapid commits, at the expense of losing quick feedback. In addition, it can muddy the “who broke this doc?” waters. If there are multiple commits between builds, it is potentially more difficult to hunt down the offending error (or even worse, errors). It also absolves writers of responsibility — since the entire group must be notified that the build failed, everyone will assume someone else will fix it. As a result, the document doesn’t get fixed.
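Either trigger could call the same small wrapper. The sketch below is only an illustration under assumptions: run_doc_build is a made-up name, the paths in the comments are placeholders, and the publican options shown should be checked against `publican build --help` for the installed version. It captures the build's verbose output in a log and reduces the result to a single pass/fail line.

```shell
# Run any build command, keep its chatty output in a log, and report
# pass/fail in one line. Usage: run_doc_build <command> [args...]
run_doc_build() {
    log=$(mktemp)
    if "$@" >"$log" 2>&1; then
        echo "build ok"
        rm -f "$log"
    else
        status=$?    # exit status of the failed build command
        echo "build FAILED (exit $status), output kept in $log"
        return "$status"
    fi
}

# Per-commit feedback, from .git/hooks/post-commit (assumed invocation):
#   run_doc_build publican build --formats html --langs en-US
#
# Scheduled builds, from an hourly cron entry (placeholder paths):
#   0 * * * * cd /path/to/doc && /path/to/build-wrapper.sh
```

For the scheduled case, adding the output of `git log -1 --format=%an` to the failure message is one way to keep the "who broke this doc?" question answerable, though with several commits between builds it only names the most recent author.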

I’m sympathetic to the arguments for automated builds. On the other hand, the Fedora Docs are manually built and published (granted there are other considerations like release cycles and translation). Having to run and correct builds by hand helps train writers to write valid syntax the first time through and it allows documents to be updated when they’re ready, while still giving writers the ability to commit works in progress. Useful documents are almost never write-once, read-many. Even the U.S. Constitution has had 27 patches accepted. Still, for the kinds of documentation intended (e.g. design documents), once a document is initially written, it shouldn’t change often. This suggests that designing an automated build system is over-engineering, especially considering the current staff shortage.

It would seem that my current inclination is to stick with a manual build process, even if it is heresy. Fortunately, this may be a solved problem in the not-so-distant future. Red Hat’s Joshua Wulf (and others) have been developing a live DocBook editor that runs entirely in the web browser. The idea is to lower the barrier to entry for Fedora contributors. This is still in the proof-of-concept stage, but the initial results are promising. In the meantime, I’ll work on getting the content seeded. If someone else in the group wants to automate it, I won’t stand in the way.

December 24, 2011

Fedora Docs QA

Filed under: Linux - bcotton @ 11:43 pm

The Fedora Documentation team has been kicking around the idea of adding a QA process for a while. The idea is to catch errors in formatting, spelling, and grammar before a guide ships. The challenge is that all of our manpower is volunteer. They have jobs and lives, and don’t always have the time to do extra steps, even when those are needed. This problem exists for all community-driven projects, but it’s my first time experiencing it in a leadership position.

Fortunately, there are some great contributors. I think the QA process we developed will work really well. The main challenge will be doing QA-by-committee, which will require me to be a nagbot when tickets go stale. I will rely heavily on the bugzilla command line tool to nag me to nag the QA group. After Fedora 17 is released, we’ll revisit our process and see what needs to be fixed.

November 1, 2011

How should we deliver documentation to users?

Filed under: Linux - bcotton @ 6:44 am

Dear Fedora Diary,

I’ve noticed that many people on Fedora Planet are posting daily-ish activity logs. I don’t contribute on a daily basis, so uninterested readers won’t have to put up with too much, but it seems like a reasonable thing to do. In the future, I’ll call it “Fedora Diary”, but today there are actual things to discuss.

It all started at 9 AM yesterday when I showed up in IRC for the Docs meeting. None of the usual leaders-of-meeting were around, so I took initiative and ran the meeting. I’d done this once before, so I didn’t screw up too badly. I’d say I need more practice, but I don’t want anyone to get any ideas.

At the end of the meeting, Pete made a comment about how cluttered the wiki is. His point is valid. With end user documentation mixed in with  internal minutes, agendas, etc., it can be challenging to find the right page. Even with a good search term, it’s not always evident which page is the right page, and that can frustrate users who just want to find the relevant nugget to fix their immediate problem.

Petr suggested that all user-intended documentation should be in the guides produced by the Docs team. I understand the sentiment, but I’m not sure it works. Certainly there’s a place for user documentation on a wiki, even if it gets reproduced in a guide. The real issue is that guides and wikis are different tools, and have different purposes. I’m not sure that there’s an advantage to exclusivity in medium.

This brings me to what I consider the more interesting question. How do users get the most benefit from documentation? I don’t know what sort of research (if any) exists on the subject, but it seems like something that should get figured out. Perhaps that’s a reasonable direction for my M.S. thesis?

August 17, 2010

How helpful is helpful enough?

Filed under: Linux, Musings - bcotton @ 10:18 am

I have a confession: I am a compulsive favor-doer. When someone asks for my help, I have a hard time saying “no”. Since there’s only so much Ben to go around, this gives me a tendency to over-commit.  I recognize this as a problem, but I can’t help myself.  It’s in my nature to be helpful.

So how helpful should I be?  My wife works at the county library, and last week a gentleman was checking out a book about Linux.  In the course of small talk it came up that he’s trying to dual-boot Ubuntu and Windows 7.  Angie mentioned that I run Linux at home and he wrote down his phone number and e-mail address for her to give to me.  I’m not mad at her for it, she hasn’t committed me to anything, but it got me wondering.

My initial reaction was to e-mail the guy and introduce myself.  After all, I’m a nice guy and that’s what I do.  Then I realized that this would probably make me his go-to support.  I don’t mind helping people, but an open-ended commitment isn’t exactly what I’m in the market for.  I already do a fair bit of free work for strangers.  I answer questions in the #fedora IRC room, on LinuxForums.com and on Serverfault.  Additionally, I write this blog, and I write documentation for the Fedora Project.  Maybe I don’t do as much as others, but I’m definitely contributing back to the community.

So maybe, I thought, I should set a rate and charge him for help.  That seems like too much effort, though. I’m not really interested in doing enough consulting/contract work to make it worth the trouble of filing the appropriate paperwork.  Besides, I have such a hard time asking for a reward for being nice.

Where does that leave me then?  I have no idea.  In the meantime, I’ve gone with the head-in-sand approach.  I’ll just pretend like this never happened.  Perhaps someday I’ll be able to solve this quandary.
