Blog Fiasco

August 17, 2010

How helpful is helpful enough?

Filed under: Linux, Musings | bcotton @ 10:18 am

I have a confession: I am a compulsive favor-doer. When someone asks for my help, I have a hard time saying “no”. Since there’s only so much Ben to go around, this gives me a tendency to over-commit.  I recognize this as a problem, but I can’t help myself.  It’s in my nature to be helpful.

So how helpful should I be?  My wife works at the county library, and last week a gentleman was checking out a book about Linux.  In the course of small talk it came up that he’s trying to dual-boot Ubuntu and Windows 7.  Angie mentioned that I run Linux at home and he wrote down his phone number and e-mail address for her to give to me.  I’m not mad at her for it, she hasn’t committed me to anything, but it got me wondering.

My initial reaction was to e-mail the guy and introduce myself.  After all, I’m a nice guy and that’s what I do.  Then I realized that this would probably make me his go-to support.  I don’t mind helping people, but an open-ended commitment isn’t exactly what I’m in the market for.  I already do a fair bit of free work for strangers.  I answer questions in the #fedora IRC room, on LinuxForums.com and on Serverfault.  Additionally, I write this blog, and I write documentation for the Fedora Project.  Maybe I don’t do as much as others, but I’m definitely contributing back to the community.

So maybe, I thought, I should set a rate and charge him for help.  That seems like too much effort, though. I’m not really interested in doing enough consulting/contract work to make it worth the trouble of filing the appropriate paperwork.  Besides, I have such a hard time asking for a reward for being nice.

Where does that leave me then?  I have no idea.  In the meantime, I’ve gone with the head-in-sand approach.  I’ll just pretend like this never happened.  Perhaps someday I’ll be able to solve this quandary.

August 10, 2010

Sometimes, Windows wins

Filed under: HPC/HTC, Linux | bcotton @ 6:59 am

It should be clear by now that I am an advocate of free software.  I’m not reflexively against closed software, though; sometimes it’s the right tool for the job.  Use of Windows is not a reason for mockery.  In fact, I’ve found one situation where I like the way Windows works better.

As part of our efforts to use Condor for power saving, I thought it would be a great idea if we could calculate the power savings based on the actual power usage of the machines.  The plan was to have Cycle Server aggregate the time in hibernate state for each model and then multiply that by the power draw for the model.  Since Condor doesn’t note the hardware model, I needed to write a STARTD_CRON module to determine this.  The only limitations I had were that I couldn’t depend on root/administrator privileges or on particular software packages being installed. (The execute nodes are in departments across campus and mostly not under my control.)
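The calculation itself is simple arithmetic: hours spent in hibernate multiplied by the power draw for that model.  Here is a minimal sketch of it; the comma-separated input format and the numbers in the example are made up for illustration and are not what Cycle Server actually produces.

```shell
# Estimate energy saved per hardware model.
# Input (stdin): one "model,hibernate_hours,watts" record per line.
# The field layout is hypothetical; the post gives no real figures.
power_savings() {
    awk -F, '{
        # kWh saved = hours in hibernate * power draw in watts / 1000
        printf "%s,%.1f\n", $1, $2 * $3 / 1000
    }'
}

# Example with made-up numbers:
#   printf 'ModelA,100,80\n' | power_savings
```

The watts column is whatever per-model draw you have measured or looked up; the sketch does not try to account for the small amount of power a hibernating machine still uses.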

Despite the lack of useful tools like grep, sed, and awk (there are equivalents for some of the taken-for-granted GNU tools, but they frankly aren’t very good), the plugin for Windows was very easy.  The systeminfo command gives all kinds of useful, parseable information about the system’s hardware and OS.  The only difficult part was chopping the blank spaces off the end of the output. I wanted to do this in Perl, but that’s not guaranteed to be installed on Windows machines, and I had some difficulty getting a standalone-compiled version working consistently.

On Linux, parsing the output is easy.  The hard part was getting the information at all.  dmidecode seems to be ubiquitous, but it requires root privileges to get any information.  I tried lshw, lshal, and the entire /proc tree.  /proc didn’t have the information I needed, and the two commands were not necessarily a part of the “base” install.  The solution seemed to be to require the addition of a package (or bundling a binary for lshw in our Condor distribution).
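One avenue the post does not mention is sysfs.  On kernels that expose DMI data under /sys/class/dmi/id, the product_name file is normally readable without root.  A minimal sketch along those lines follows; the default path and the MachineModel attribute name are assumptions, not something the original module used, and whether the file exists at all depends on the kernel.

```shell
# Print a ClassAd-style "Attribute = value" line with the hardware model.
# $1: file holding the model string. The default sysfs path is an
#     assumption (the post only tried /proc, lshw, lshal and dmidecode).
# "MachineModel" is a hypothetical attribute name.
machine_model() {
    src="${1:-/sys/class/dmi/id/product_name}"
    [ -r "$src" ] || return 1
    # Strip trailing whitespace, the same cleanup the Windows version needed.
    model=$(sed 's/[[:space:]]*$//' "$src" | sed -n 1p)
    printf 'MachineModel = "%s"\n' "$model"
}
```

The function returns a non-zero status when the file is missing or unreadable, so a cron module built on it could simply publish nothing on machines that lack the sysfs entry.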

Eventually, we decided that it was more effort than it was worth to come up with a reliable module.  While both platforms had problems, Linux was definitely the more difficult.  It’s a somewhat rare condition, but there are times when Windows wins.

July 16, 2010

My TTYtter configuration

Filed under: Linux, The Internet, mac | bcotton @ 8:07 am

It’s been many months since I found out about TTYtter, a command line Twitter client written in Perl.  Though some users might bemoan the lack of a snazzy graphical interface, it is that very lack which appeals to me.  TTYtter places only a very tiny load on system resources, which means my Twitter addiction won’t get in the way of running VMs to test various configurations and procedures.  Because it is command-line based, I can run it in a screen session, which means that I can resume my Twittering from wherever I happen to be and not have to re-configure my client.

I don’t claim to be a TTYtter expert, but I thought I’d share my own configuration for other newbs.  TTYtter looks in $HOME/.ttytterrc by default, and here’s my default configuration:

#Check to see if I'm running the current version
vcheck=1
# What hash tags do I care about?
track='#Purdue #OSMacTalk #MarioMarathon'
# Colors, etc are good!
ansi=1
# I'm dumb. Prompt me before a tweet posts
verify=1
# Use some readline magic
readline=1
# Check for mentions from people I don't follow
mentions=1

Of course, there are certain times that the default configuration isn’t what I want.  When I was reading tweets in rapid-fire succession during the Mario Marathon, I didn’t want non-Mario tweets to get in the way, so I used a separate configuration file:

# Don't log in and burn up my rate limit
anonymous=1
# Find tweets related to the marathon
track=#MarioMarathon "Mario Marathon"
# Don't show my normal timeline
notimeline=1
# Colors, etc are awesome!
ansi=1
# Only update when I say so. This keeps the tweet I'm in the middle of reading
#      from being scrolled right off my screen
synch

There are a lot of other ways that TTYtter can be used, and I’m sure @doctorlinguist will tell me all of the ways I’m doing things wrong, but if you’re in the market for a new, multi-platform Twitter client, you should give this one a try.

July 9, 2010

The joys of doing it right

Filed under: Linux, Musings | bcotton @ 10:18 am

A while back, I wrote a post about why it’s not always possible to DoItRight™, and that sometimes you just have to accept it.  Today I’m here to talk about a time that I did something right and how good it felt.  Now, that’s not to say that I’m eternally screwing up (although a good quarter of my Subversion commits are fixes of a commit I previously made), but there’s a difference between making something work and making it work well.

I decided that since we have a Nagios server, I might as well have it check on the health of our Condor services.  From what I could tell, no such checks currently exist, so I decided to write my own.  Nagios checks can be very simple: run a command or two, and then return a number that means something to Nagios.  Many checks are written in bash or another shell script because they are so simple.  For my checks, I wanted to do some parsing of the command outputs to determine the state of job queues, etc.  Since that kind of work is a little heavy for a shell script, I opted to write it in Perl.  Yay Perl!
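To make the “return a number that means something to Nagios” part concrete, here is a minimal sketch of the convention: exit status 0 means OK, 1 means WARNING, 2 means CRITICAL, and 3 means UNKNOWN.  This is a shell illustration rather than the Perl checks described in the post, and the function name, the thresholds, and the “idle jobs” wording are all hypothetical.

```shell
# Map a numeric value onto Nagios plugin exit codes:
#   0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN
# $1: measured value (for example, a count of idle jobs)
# $2: warning threshold, $3: critical threshold
check_threshold() {
    value=$1 warn=$2 crit=$3
    case "$value" in
        # Anything that is not a plain non-negative integer is UNKNOWN.
        ''|*[!0-9]*) echo "UNKNOWN: could not parse value"; return 3 ;;
    esac
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL: $value idle jobs"; return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING: $value idle jobs"; return 1
    fi
    echo "OK: $value idle jobs"; return 0
}
```

How the value is obtained (parsing the output of the Condor tools, for instance) is the part that gets heavy for a shell script, which is why the real checks were written in Perl.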

Since there aren’t any checks available, I thought my work might be useful to others in the community.  As a result, I wanted to make sure my code was respectable.  This meant I spent some time designing, coding, and testing options that we don’t want but others might find useful.  It meant putting extra documentation into the code (and eventually writing some pod before I share the code publicly).  It meant mostly following the coding style of the Linux kernel (I chose that because “why not?”).

Some readers will (correctly) note that the Linux kernel coding style does not guarantee good code.  I don’t mean to suggest that it does, but I’ve found that it forced me to think about my code more deeply than I otherwise would.  Not being a programmer, most of the code I write is to fit a small need of mine and the quality is defined as “does it do what I want it to?”  Writing something with the intent of sharing it publicly and forcing yourself to not cut corners can make the work more difficult, but the end result is a beauty to behold.

June 14, 2010

Ugly shell commands

Filed under: HPC/HTC, Linux | bcotton @ 6:25 am

Log files can be incredibly helpful, but they can also be really ugly.  Pulling information out programmatically can be a real hassle.  When a program exists to extract useful information (see: logwatch), it’s cause for celebration.  The following is what can happen when a program doesn’t exist (and yes, this code actually worked).

The scenario here is that a user complained that Condor jobs were failing at a higher-than-normal rate.  Our suspicion, based on a quick look at his log files, is that a few nodes are eating most of his jobs.  But how to tell?  I’ll want to create a spreadsheet that has the job ID, the date, the time, and the last execute host for all of the failed jobs.  I could either task a student to manually pull this information out of the log files, or pull it out myself with some shell magic.

The first step was to get the job ID, the date, and the time from the user’s log files:

grep -B 1 Abnormal ~user/condor/t?/log100 | grep "Job terminated" | awk '{print $2 "," $3 "," $4 }' | sed 's/[()]//g' | sort -n > failedjobs.csv

What this does is to search the multiple log files for the word “Abnormal”, with one line printed before each match because that’s where the information we want is.  To pull that line out, we search for “Job terminated” and then pull out the second, third, and fourth fields, stripping the parentheses off of the job ID, sorting, and then writing to the file failedjobs.csv.

The next step is to get the last execute node of the failed jobs from the system logs:

for x in `cat failedjobs.csv | awk -F, '{print $1}'`; do
host=`grep "$x.*Job executing" /var/condor/log/EventLog* | tail -n 1 | sed -r "s/.*<(.*):.*/\1/g"`
echo "`host $host | awk '{print $5}'`" >> failedjobs-2.csv;
done

Wow.  This loop pulls the first field out of the CSV we made in the first step.  The IP address for each failed job is pulled from the Event Logs by searching for the “Job executing” string.  Since a job may execute on several different hosts in its lifetime, we want to only look at the last one (hence the tail command), and we pull out the contents of the angle brackets left of the colon.  This is the IP address of the execute host.

With that information, we use the host command to look up the hostname that corresponds to that IP address and write it to a file.  Now all that remains is to combine the two files and try to find something useful in the data.  And maybe to write a script to do this, so that it will be a little easier the next time around.
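The combining step can be sketched in the same spirit.  Because the loop writes exactly one line per job (even when the lookup comes back empty), the two files line up row for row and can be pasted together.  The function names here are made up for the example.

```shell
# Join the job list and the host list line by line.
# $1: CSV from the first step (job ID, date, time)
# $2: file with one hostname per line, in the same order
combine_failed_jobs() {
    paste -d, "$1" "$2"
}

# Count failures per host, reading the combined CSV on stdin.
failures_per_host() {
    # The last CSV field is the hostname; most frequent host comes first.
    awk -F, '{print $NF}' | sort | uniq -c | sort -rn
}

# Example:
#   combine_failed_jobs failedjobs.csv failedjobs-2.csv > combined.csv
#   failures_per_host < combined.csv | head
```

If a few nodes really are eating most of the jobs, they should float to the top of the second command’s output.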

May 7, 2010

Perl’s CGI.pm popup_menu cares how you give it data

Filed under: Linux, Web design | bcotton @ 11:53 am

Last weekend when I was working on the script that mirrors and presents radar data for mobile use, I decided the less work I had to do, the better.  To that end, I tried to make heavy use of the CGI.pm Perl module.  In addition to handling the CGI input, CGI.pm also prints regular HTML tags, so you can avoid having to throw a bunch of HTML markup in your print statements.  This makes for much cleaner code and reduces the chances you’ll make a silly formatting mistake.

Everything was going well until I added the popup menu to select the radar product.  Initially I followed the example in the documentation and it worked.  As I went on, I decided instead of having two hashes for the product information, it made sense to make my hash include not only the product description, but the URL pattern I’d be using when it came time to mirror the image.  Unfortunately, when I tried to make that change, my popup form no longer had the labels I wanted.

I kept poking at it for a while and finally got frustrated to the point where I decided I’d just write a foreach and have that part print the HTML markup instead of using CGI.pm functions.  Fortunately, I first talked to my friend Mike about it.  I sent him the code and after a little bit of working, he realized what my problem was.  CGI.pm’s popup_menu function expects a pointer to a hash for labels, not an array (I’m not really sure why, maybe someone can explain it?).  Once that was settled, the script worked as expected and the remainder was finished in short order.

Sometimes, it really helps to pay attention to the data type that a function expects.

April 14, 2010

Playing Sopwith on the N900

Filed under: Linux | bcotton @ 9:04 am

Despite my involvement with Mario Marathon, I’m not much of a gamer.  I have more toes than I do games for my Wii, and only a few more (purchased) titles for my computers.  However, I’ve found that my phone gets the most gaming time simply because it is portable and I can play while I’m on the bus, waiting in line, etc.  The game that has been receiving the bulk of my attention these days is Sopwith.

Sopwith is a simple 2-D game where you must fly your biplane and destroy enemy buildings.  It has been ported to the Maemo platform by Mikko Vartiainen and can be installed from the Maemo Extras repository.  I had never played Sopwith before discovering this version, but my understanding is that it is very true to the original (it helps that the code was re-licensed under the GPL a few years ago) with the exception of a lack of sound.  The presence of missiles seems to be a relatively new and anachronistic feature that I can’t help but use. I never claimed to be good at the game.

Gameplay itself is quite addictive, and fortunately very simple — there are a total of 10 keys you might need to use, and I find myself only using six with any regularity.  The one disadvantage is that all of the keys are on the bottom row of the N900 keyboard, and I’ve found myself hitting the wrong key in the heat of battle. That usually ends up with a dead me.

Since Sopwith was originally written as a showcase for the “imaginet” network system, it makes sense that the Maemo version of Sopwith also has support for network games.  Unfortunately, since I’m the only person I know with an N900, I can’t test that aspect of it.  I imagine it would be fun, especially if you’re in the same room and can trade sharply-pointed barbs.

For those worried about it growing stale, there are several levels.  In the novice mode, there are at least three levels that get progressively more difficult.  I’ve nearly made it past the third level, but not quite.  In expert mode there’s at least one level, and you don’t get unlimited ammunition or automatic throttling.  There’s also an option for playing against a computer opponent, which appears to play in expert mode as well.

Sopwith clearly isn’t a sufficient reason to buy a Maemo device (if it is, then please send me some of your vast amounts of disposable income), but it is a great game to have installed for times when you need to kill a few minutes (and enemies).

April 7, 2010

Building GEMPAK on Fedora

Filed under: Linux, Weather | bcotton @ 8:21 am

People who have training or experience in following severe weather rarely are content to rely on the media for severe weather information.  This isn’t to say anything against the TV and radio outlets, but we would just rather see the data for ourselves and make our own decisions.  A number of software packages exist for analyzing weather data, of varying price and quality.  For Linux and OS X users (and Windows users via Cygwin), perhaps the most powerful is GEMPAK, developed by the National Weather Service and made available by Unidata.

GEMPAK is basically what national centers, including the Storm Prediction Center and the National Hurricane Center, currently use for data analysis and visualization.  It is also a pretty old suite, written in pretty old Fortran, and kind of a bear to get installed sometimes.  Unidata formerly provided pre-built binaries, but only source code is available for the 5.11.4 release.  Since there seems to be an absence of step-by-step instructions, I thought I’d post my own.  Of course, as soon as I post this, a public release of AWIPS II will be announced.

The first step is to prepare your environment.  This basically consists of creating the appropriate user and/or directory structure.  For simplicity and security reasons, I prefer not to create a separate GEMPAK user.  I prefer to just build it in a directory that I have access to and then move it to /opt when I’m done.  If your account will be the only one using GEMPAK, you can just leave it in your home directory.  For the sake of this demonstration, let’s assume we’re doing it my way.  Then there’s no pre-work to do.

Next you’ll need to download the GEMPAK source code from Unidata’s website.  Unidata requires registration to download the software, but it is free and they don’t harass you.  While you’re waiting for the tarball to download, you’ll need to make sure you’ve got a few of the packages necessary for building GEMPAK.  Most of them can be found in the yum repository, so you can install them with:

su -c "yum install gcc-gfortran libXp-devel  libX11-devel xorg-x11-fonts-ISO8859-1-75dpi"

You’ll also need OpenMotif packages, but they can’t be newer than 2.3.1.  This means you cannot install the packages from the RPMFusion-nonfree repository.  You’ll have to grab them manually from MotifZone.  There aren’t builds for recent Fedora releases on there, but the Fedora 9 RPMs work fine with Fedora 12.  Since we’ve installed these RPMs, you need to make sure yum won’t try to upgrade them, so make sure you exclude “openmotif*” in /etc/yum.conf.
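In practice the exclusion is a single exclude=openmotif* line in the [main] section of /etc/yum.conf.  As a rough sketch, a helper like the following adds it when no exclude line exists yet.  The function name is made up, and it assumes the file contains only the [main] section so that an appended line lands there; check your own file before relying on it.

```shell
# Append an exclude line to a yum configuration file if none exists yet.
# $1: path to the config file (normally /etc/yum.conf, edited as root)
# $2: package glob to exclude
# Assumes the file holds only the [main] section, so the appended line
# ends up in [main].
add_yum_exclude() {
    conf=$1 pattern=$2
    if grep -q '^exclude=' "$conf"; then
        echo "exclude= already present in $conf; add $pattern by hand" >&2
        return 1
    fi
    printf 'exclude=%s\n' "$pattern" >> "$conf"
}

# Example (as root):
#   add_yum_exclude /etc/yum.conf 'openmotif*'
```

If an exclude line is already there, the helper refuses to touch the file, since merging patterns automatically is easy to get wrong.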

By now, the tarball should be downloaded.  You’ll need to unpack it just like you would any other. Let’s assume you downloaded it to ~/Download, so go to the directory you want to build GEMPAK in and run:

tar xfz ~/Download/gempak_upc5.11.4.tar.gz

Now cd into the GEMPAK5.11.4 directory you just created.  The first thing you’ll need to do is edit the Gemenviron.profile (if you use the Bash shell) or Gemenviron (for C-shell).  This file sets the myriad of environment variables that GEMPAK uses.  Fortunately, it is sanely built, so you only need to make a few changes.  Change the NAWIPS variable to point to the directory you’re building in (for example, $HOME/GEMPAK5.11.4).  USE_GFORTRAN should already be set to 1, but double-check that it is and that the references to USE_G77 and USE_PGI right below it are commented out.  Eventually, you’ll also need to make sure the GEMDATA and LDMDATA variables are set correctly, but that’s beyond the scope of this how-to.

After you’ve saved the Gemenviron[.profile] file, you’ll need to make sure config/Makeinc.common is correct (it probably is) and you’ll need to edit config/Makeinc.linux_gfortran (or config/Makeinc.linux64_gfortran if you’re on an x86_64 system; you can check that with `uname -i`).  Change the MOTIFINC variable to “-I/usr/include/openmotif” and change X11LIBDIR to “-L/usr/lib/openmotif” (replace “lib” with “lib64” for x86_64 systems).
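The platform-dependent choices in that step boil down to a small lookup.  This sketch just prints which file to edit and which values to use; it does not modify anything, and the function name is made up.  It treats anything other than x86_64 as 32-bit, which matches the two cases in this post but nothing more.

```shell
# Print the Makeinc file and Motif settings for a hardware platform string.
# $1: platform as reported by `uname -i` (defaults to asking uname directly)
gempak_build_settings() {
    arch=${1:-$(uname -i)}
    case "$arch" in
        x86_64) makeinc=config/Makeinc.linux64_gfortran; libdir=lib64 ;;
        *)      makeinc=config/Makeinc.linux_gfortran;   libdir=lib ;;
    esac
    echo "Makeinc file: $makeinc"
    echo "MOTIFINC=-I/usr/include/openmotif"
    echo "X11LIBDIR=-L/usr/$libdir/openmotif"
}

# Example:
#   gempak_build_settings          # use the current machine
#   gempak_build_settings x86_64   # or name the platform explicitly
```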

There’s one more step to get things working, which appears to be a result of Fedora 12 using a newer version of autotools than the source was prepared with.  You’ll need to run the following command in both the extlibs/netCDF/v3.6.2 and extlibs/JasPer/jasper directories (return to the GEMPAK5.11.4 directory when you’re done):

autoreconf --force --install --symlink

Now that that’s done, you’ll need to load the GEMPAK environment.  For the Bash shell, use `. Gemenviron.profile` and for the C-shell use `source Gemenviron`.  The next step is to actually build the programs.  It is a lengthy process (I timed it at about 8 minutes on my system), so you might want to go get a cup of coffee or something.  Since it produces a lot of output, we want to make sure we can go back and look through it if there are any problems (it took about 5 build attempts before I got these instructions to a workable state), so we’ll save the output to a file called make.out.

make >& make.out

Once that’s completed, it’s time to install the files and clean up after ourselves:

make install; make clean

If you plan to move the built GEMPAK into somewhere like /opt, now’s the time to do that (make sure you update the NAWIPS variable in Gemenviron[.profile] appropriately).  I prefer to move it to /opt and then create an /opt/gempak symbolic link to the GEMPAK5.11.4 directory so that if I install a different version, I just need to change my link and everything else works the same.  For ease of use, you should set your shell configuration file to call the Gemenviron[.profile] script at login if you plan to use GEMPAK frequently.
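The symbolic link trick can be sketched as a small helper.  The function name is hypothetical; the paths in the example are the ones used in this post.

```shell
# Point a stable symlink (e.g. /opt/gempak) at a specific GEMPAK version.
# $1: versioned install directory, $2: symlink path
switch_gempak() {
    target=$1 link=$2
    [ -d "$target" ] || { echo "no such directory: $target" >&2; return 1; }
    # -n replaces the link itself instead of descending into the old target
    ln -sfn "$target" "$link"
}

# Example (as root):
#   switch_gempak /opt/GEMPAK5.11.4 /opt/gempak
```

With the link in place, NAWIPS and your shell configuration can refer to /opt/gempak, and installing a different version only means re-running the helper with the new directory.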

That’s all there is to it; GEMPAK is ready to use now.

Solving the CUPS “hpcups failed” error

Filed under: Linux | bcotton @ 7:03 am

I thought when I took my new job that my days of dealing with printer headaches were over.  Alas, it was not to be.  A few weeks ago, I needed to print out a form for work.  I tried to print to the shared laser printer down the hall.  Nothing.  So I tried the color printer. Nothing again.  I was perplexed because both printers had worked previously, so being a moderately competent sysadmin, I looked in the CUPS logs.  I saw a line in error_log that read printer-state-message="/usr/lib/cups/filter/hpcups failed". That seemed like it was the problem, so I tried to find a solution and couldn’t come up with anything immediately.

Since a quick fix didn’t seem to be on the horizon, I decided that I had better things to do with my time and I just used my laptop to print.  That worked, so I forgot about the printing issue.  Shortly thereafter, the group that maintains the printers added the ones on our floor to their CUPS server.  I stopped CUPS on my desktop and switched to their server and printing worked again, thus I had even less incentive to track down the problem.

Fast forward to yesterday afternoon when my wife tried to print a handbill for an event she is organizing in a few weeks.  Since my desktop at home is an x86_64 Fedora 12 system, too, it didn’t surprise me too much when she told me she couldn’t print.  Sure enough, when I checked the logs, I saw the same error.  I tried all of the regular stalling tactics: restarting CUPS, power cycling the printer, just removing the job and trying again.  Nothing worked.

The first site I found was an Ubuntu bug report which seemed to suggest maybe I should update the printer’s firmware.  That seemed like a really unappealing prospect to me, but as I scrolled down I saw comment #8.  This suggested that maybe I was looking in the wrong place for my answer.  A few lines above the hpcups line, there was an error message that read prnt/hpcups/HPCupsFilter.cpp 361: DEBUG: Bad PPD - hpPrinterLanguage not found.
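A quick way to look “a few lines above” is to ask grep for leading context.  This is only a sketch with a made-up function name; the log location in the example is the usual one for CUPS on Linux, but yours may differ.

```shell
# Show the lines leading up to each "hpcups failed" message in a CUPS log.
# $1: log file (CUPS's error_log)
# $2: number of preceding lines to show (default 5)
lines_before_failure() {
    grep -B "${2:-5}" 'hpcups failed' "$1"
}

# Example:
#   lines_before_failure /var/log/cups/error_log 10
```

The useful clue may be further back than five lines, so it is worth widening the window if nothing obvious turns up.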

A search for this brought me to a page about the latest version of hplip. Apparently, the new version required updated PPD files, which are the files that describe the printer to the print server.  In this case, updating the PPD file was simple, and didn’t involve having to find it on HP’s or a third-party website.  All I had to do was use the CUPS web interface and modify the printer, keeping everything the same except selecting the hpcups 3.10.2 driver instead of the 3.9.x that it had been using.  As soon as I made that change, printing worked exactly as expected.

The lesson here, besides the ever-present “printing is evil”, is that the error message you think is the clue might not always be.  When you get stuck trying to figure a problem out, look around for other clues.  Tunnel vision only works if you’re on the right track to begin with.

March 31, 2010

One danger of the N900’s flexibility

Filed under: Linux | bcotton @ 8:08 am

When I first heard about Nokia’s super-awesome Debian-based N900 amazingphone, my first two thoughts were “I want this phone” and “man, I hope I don’t break it.”  By offering access to the full OS, the N900 allows developers and users the freedom to make their phone into whatever they want.  That also means an unprecedented ability to turn the phone into a very expensive paperweight.  Just as with a full-sized computer, a user not paying attention to what’s going on could render the system inoperable in a heartbeat.  Even the relatively competent could find themselves in that situation.

The chief drawback of the N900 is the way the disk space is laid out.  The device has 32 gigabytes of storage, but most of that is on a separate partition for user data.  The system and application space is only 2 gigabytes.  This means that as the number of installed applications grows, the available space shrinks rather quickly.  It can get to the point where there’s not enough space to store repository data and some of the software repositories become unavailable in the Application Manager.

In order to work around this issue, I copied /var/cache/apt to the user space and made a symlink to point to the new location. After a while, the space became tight again, so I started moving more things out, and eventually nearly all of /usr/lib ended up on the user partition.  Everything seemed fine, so I wasn’t concerned.
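The move-and-symlink trick looks roughly like this.  It is a sketch with a hypothetical function name, not the exact commands used on the phone, and it is only reasonable for directories that nothing needs before the destination partition is mounted.

```shell
# Move a directory to another partition and leave a symlink behind.
# $1: directory to move, $2: new parent directory on the roomier partition
# Only sensible for data that is not needed before $2 is mounted.
relocate_dir() {
    src=$1 dest_parent=$2
    name=$(basename "$src")
    # Refuse to act on anything that is not a real directory
    # (including a path that has already been turned into a symlink).
    [ -d "$src" ] && [ ! -L "$src" ] || return 1
    mv "$src" "$dest_parent/" || return 1
    ln -s "$dest_parent/$name" "$src"
}
```

A package cache is a fairly safe candidate for this treatment; system libraries are not.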

Then one day, I couldn’t connect to the wireless network at work.  I figured I’d try rebooting the phone to see if that helped.  Instead, I got the boot screen with some not-so-friendly text: “Device error”.  Nothing I could do could get the phone to boot.  Presumably what happened is that some of the files that live in /usr/lib need to be accessed before the /home partition is mounted and since they lived on /home, things just went kablooie.

Regardless of why, I found myself stuck, and I knew I had to get it fixed quickly or my wife would never let me hear the end of it.  Fortunately, Nokia provides tools to re-flash the device firmware.  The process was quick and painless and in a few minutes, I had my phone back.  Because user data is kept separately, I still had my contacts, music, etc. I just needed to re-install the software packages I wanted.

One thing of note is that my SIM card wasn’t read initially. I needed to do a round of software updates before it would work.  Fortunately, I had the phone back to full use in about an hour.  It’s good to know that I can easily unbreak any software issues I cause, and I hope the application data space issue is resolved in one form or another in the future.
