Objects in the shell: why PowerShell’s design makes sense to me

A while back, a friend said “PowerShell is what happens when you ask a bunch of drunk lizards to make Bash shitty.” Another friend replied that, as he understands it, PowerShell is driven by a desire to “modernize” the shell by piping objects instead of strings. Now, knowing that I got my start as a Unix and Linux sysadmin, you might expect me to take the “it’s Bash, except awful” side. But you’d be wrong.

Full disclosure: I have not used PowerShell in any meaningful sense. But from what I know about it, it represents a real improvement over the traditional Unix shell (of whatever flavor) for certain use cases. Some sysadmins lionize the shell script as the pinnacle of sysadminry, partly because it’s what we know and partly because it’s hard. Oh sure, writing trivial scripts is easy, but writing good, robust scripts? That can be a challenge.

Shell scripts are a glue language, not a programming language (yes, you can write some really complicated stuff in shell scripts, but what you’re really doing is gluing together other commands). PowerShell, in my view, is closer to a programming language that you can script with. This fits well with the evolution of systems administration. Sysadmins in most environments are expected to do at least some light programming. We’re moving to a world where the API is king, and provisioning and configuring infrastructure is now a very code-heavy exercise.

The object focus of PowerShell is a truly compelling feature. I think about all the times I’ve had to use awk, sed, cut, and friends to slice up the output of one command in order to feed selected parts into the next, or to re-order the output. A machine-parseable medium like JSON or XML makes that kind of programmatic gluing much easier.
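
For a rough sense of the difference, compare pulling fields out of plain text by position with asking for them by name from JSON. The inventory command below is hypothetical; jq does the JSON parsing:

# Text output: pick fields by position and patch them up with sed
inventory --list | awk '{print $1 "," $4}' | sed 's/[()]//g' | sort -n

# Structured output: ask for fields by name, no guessing about whitespace or parentheses
inventory --list --format json | jq -r '.hosts[] | "\(.name),\(.ip)"' | sort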

When you’re working interactively in the shell, strings are much easier for a human to deal with. In those cases, convert the objects to strings. But crippling machine output for the sake of humans doesn’t seem productive, at least not when both can get what they need.
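
Sticking with the hypothetical inventory command above, the same structured data can be flattened into readable text at the end of the pipeline when a human is the consumer:

inventory --list --format json | jq -r '.hosts[] | [.name, .ip] | @tsv' | column -t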

If the Unix shell were being designed today, I think it would have some very PowerShell-like features. PowerShell has the advantage of being a latecomer. As such, it can learn from mistakes of the past without being constrained by legacy limitations. Microsoft is serious about making Windows a DevOps-ready platform, as Jeffrey Snover said at LISA 16. To do this requires a break from historical norms, and that’s not always bad.

Ugly shell commands

Log files can be incredibly helpful, but they can also be really ugly.  Pulling information out programmatically can be a real hassle.  When a program exists to extract useful information (see: logwatch), it’s cause for celebration.  The following is what can happen when a program doesn’t exist (and yes, this code actually worked).

The scenario: a user complained that his Condor jobs were failing at a higher-than-normal rate.  Our suspicion, based on a quick look at his log files, was that a few nodes were eating most of his jobs.  But how to tell?  I wanted a spreadsheet with the job ID, the date, the time, and the last execute host for each failed job.  I could either task a student with manually pulling this information out of the log files, or I could pull it out with some shell magic.

The first step was to get the job ID, the date, and the time from the user’s log files:

grep -B 1 Abnormal ~user/condor/t?/log100 | grep "Job terminated" | awk '{print $2 "," $3 "," $4 }' | sed "s/[\(|\)]//g" | sort -n > failedjobs.csv

What this does is search the multiple log files for the word “Abnormal”, printing one line before each match because that’s where the information we want lives.  To pull that line out, we search for “Job terminated” and then grab the second, third, and fourth fields, stripping the parentheses off the job ID, sorting numerically, and writing the result to failedjobs.csv.

The next step was to get the last execute node for each failed job from the system logs:

for x in `cat failedjobs.csv | awk -F, '{print $1}'`; do
host=`grep "$x.*Job executing" /var/condor/log/EventLog* | tail -n 1 | sed -r "s/.*<(.*):.*/\1/g"`
echo "`host $host | awk '{print $5}'`" >> failedjobs-2.csv;
done

Wow.  This loop pulls the first field out of the CSV we made in the first step.  The IP address for each failed job is pulled from the EventLog files by searching for the “Job executing” string.  Since a job may execute on several different hosts in its lifetime, we only want to look at the last one (hence the tail command), and we pull out the contents of the angle brackets to the left of the colon.  That’s the IP address of the execute host.

With that information, we use the host command to look up the hostname that corresponds to each IP address and write it to a second file.  Now all that remains is to combine the two files and try to find something useful in the data.  And maybe write a script to do all of this, so it will be a little easier the next time around.
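
For example, assuming both files kept the same line order, something as simple as paste can stitch them together:

# Combine the job info and hostnames side by side, joined with commas
paste -d, failedjobs.csv failedjobs-2.csv > failedjobs-combined.csv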