Other writing in January 2017

Where have I been writing when I haven’t been writing here?

The Next Platform

I’m freelancing for The Next Platform as a contributing author. Here are the articles I wrote last month:

Opensource.com

Over on Opensource.com, we had our fourth consecutive month with a milion-plus page views and set a record with 1,122,064. I wrote the articles below.

Also, the 2016 Open Source Yearbook is now available. You can get a free PDF download now or wait for the print version to become available. Or you can do both!

Cycle Computing

Meanwhile, I wrote or edited a few things for work, too:

  • Use AWS EBS Snapshots to speed instance setup — Staging reference data can be a time-expensive operation. This post describes one way we cut tens of minutes off of time for a cancer research workload.
  • Various ghost-written pieces. I’ll never tell which ones!

Hints for using HTCondor’s credd and condor_store_cred

HTCondor has the ability to run jobs as either an unprivileged “nobody” user or as the submitting user. On Linux, enabling this is fairly easy: the administrator just sets the UID_DOMAIN configuration to the same value and away you go. On Windows, you need to run the credential daemon (condor_credd) and the user must send store credentials using condor_store_cred.

The manual does a pretty good job of describing the basic setup of the credd, though there are some important pieces missing. With help from HTCondor technical lead Todd Tannenbaum, I’ve submitted some improvements to the docs, but in the meantime…

The main thing to consider when configuring your pool to use the credd is that it wants things to be secure. That makes sense, considering its entire job is to securely store and transfer user credentials. The credd will not hand out the password unless the client is authenticated and using a secure connection. The method of authentication is not important (if you really, really trust your network, you can use the CLAIMTOBE method), so long as authentication occurs somehow.

So where do the condor_store_cred hints come in? Often, the credd runs on the same machine as the schedd, and users log in to there to submit jobs. In that case, everything’s probably fine. But if you’re submitting jobs from a machine outside the pool (for example, a user’s workstation), it can get a little hairier.

Before running condor_store_cred, HTCondor needs to be told where to look for the credd, and the client settings mentioned above need to meet the credd’s requirements. (I’m using CLAIMTOBE here for simplicity). If the machine the user submits from is not in the pool, condor_store_cred will need to know where to find the collector, too.

CREDD_HOST = scheduler.example.com
COLLECTOR_HOST = centralmanager.example.com
SEC_CLIENT_AUTHENTICATION_METHODS = CLAIMTOBE
SEC_CLIENT_AUTHENTICATION = PREFERRED
SEC_CLIENT_ENCRYPTION = PREFERRED

As of this writing, condor_store_cred gives an unhelpful error message if something goes wrong. It will always say “Make sure your ALLOW_WRITE setting includes this host.”, so if your ALLOW_WRITE setting already includes the host in question, you might get stuck. Use the -debug option to get better output. For example:

02/16/16 12:23:51 STORE_CRED: In mode 'query'
02/16/16 12:23:51 Warning: Collector information was not found in the configuration file. ClassAds will not be sent to the collector and this daemon will not join a larger Condor pool.
02/16/16 12:23:51 STORE_CRED: Failed to start command.
02/16/16 12:23:51 STORE_CRED: Unable to contact the REMOTE schedd.

This tells you that you forgot to set the COLLECTOR_HOST in your configuration.

Another hint is that if your scheduler name is different than the machine name (e.g. if you run multiple condor_schedd processes on a single machine and have Q1@hostname, Q2@hostname, etc), you might need to include “-name Q1@hostname” in the arguments. Unlike most other HTCondor client commands, you cannot specify a “sinful string” as a target using the “-addr” option.

Hopefully this helps you save a little bit of time getting run_as_owner working on your Windows pool, until such time as I sit down to write that “Administering HTCondor” book that I’ve been meaning to work on for the last 5 years.

Upgrading Windows in VirtualBox

For work, I have the occasional need to use Windows. I started out with a Windows XP virtual machine, and when that went end-of-life, I upgraded it to Windows 7. The “upgrade” process was really more of a reinstall-but-with-your-old data preserved. I had to reinstall many of the applications (including Python, though I didn’t realize it at the time).

I don’t pay much attention to the Windows ecosystem, but I had heard that a lot of unseen improvements took place in 7 and beyond which should make the upgrade process easier, so when it was time to upgrade because Windows 7 was (past, oops!) end-of-life, I wasn’t too worried.

Windows has a nice nag every time you log in saying “upgrade to Windows 10!” so I did. But it didn’t like the VirtualBox video driver. I tried reinstalling the Guest Additions, but that didn’t work. I tried uninstalling the driver, but Windows helpfully reinstalled it after a reboot. A forum post suggested I should enable 3D acceleration. Still no luck.

What eventually worked was to update to VirtualBox 5.0.12 (I had been on .10) and then install the latest Guest Additions. Instead of using the “upgrade me!” pestering, I had to download the Windows 10 media creation tool and use that to perform the upgrade.

Once I got those steps figured out, the upgrade process was pretty painless (if a little slow). Everything seemed to work fairly well, though I haven’t given it an in-depth trial yet. I do like the return to a more traditional UI, the tiles of Windows 8 work great on a phone, but I don’t like them on a desktop (and even worse on a server).

Sometimes, Windows wins

It should be clear by now that I am an advocate of free software.  I’m not reflexively against closed software though, sometimes it’s the right tool for the job.  Use of Windows is not a reason for mockery.  In fact, I’ve found one situation where I like the way Windows works better.

As part of our efforts to use Condor for power saving, I thought it would be a great idea if we could calculate the power savings based on the actual power usage of the machines.  The plan was to have Cycle Server aggregate the time in hibernate state for each model and then multiply that by the power draw for the model.  Since Condor doesn’t note the hardware model, I needed to write a STARTD_CRON module to determine this.  The only limitations I had were that I couldn’t depend on root/administrator privileges or on particular software packages being installed. (The execute nodes are in departments across campus and mostly not under my control.)

Despite the lack of useful tools like grep, sed, and awk (there are equivalents for some of the taken-for-granted GNU tools, but they frankly aren’t very good), the plugin for Windows was very easy.  The systeminfo command gives all kinds of useful, parseable information about the system’s hardware and OS.  The only difficult part was chopping the blank spaces off the end of the output. I wanted to do this in Perl, but that’s not guaranteed to be installed on Windows machines, and I had some difficulty getting a standalone-compiled version working consistently.

On Linux, parsing the output is easy.  The hard part was getting the information at all.  dmidecode seems to be ubiquitous, but it requires root privileges to get any information.  I tried lshw, lshal, and the entire /proc tree.  /proc didn’t have the information I need, and the two commands were not necessarily a part of the “base” install.  The solution seemed to be to require the addition of a package (or bundling a binary for lshw in our Condor distribution).

Eventually, we decided that it was more effort than it was worth to come up with a reliable module.  While both platforms had problems, Linux was definitely the more difficult.  It’s a somewhat rare condition, but there are times when Windows wins.