Wednesday, February 24, 2010

Thursday, February 11, 2010

Solaris Zones + pkgserv Hose Zone Boots (Nerdity Level: High)

So Sun (now Oracle) Solaris has Zones, a lightweight virtual server platform that we use a lot at work.

We're finally rebuilding our "chassis" servers -- the ones that host all of our virtual Solaris servers (mostly web and mail apps). Our normal zone creation script kept crapping out (at the point of first booting the new zone) with

zoneadm: zone 'ZONE': These file-systems are mounted on subdirectories of /fs/zones-1/ZONE/root:
zoneadm: zone 'ZONE':   /fs/zones-1/ZONE/root/var/sadm/install/.door
zoneadm: zone 'ZONE': call to zoneadmd failed

(I've redacted the hostname to protect my company. :-))

The first boot of the zone happens — or rather, fails to happen — after zoneadm -z ZONE install and some file copies into the new zone directory tree.

Doing Internet searches for these strings doesn't really help much.

After many hours of digging, I discovered that I could run zoneadm -z ZONE boot and get the zone to boot — but only if I waited quite a while after doing the zoneadm -z ZONE install — at least ten minutes.

I was able to see that a pkgserv process was running:

root 23107 1   0 14:55:27 ?  0:02 pkgserv -d /fs/zones-1/ZONE/root/var/sadm/install -N pkgadd

— and that took between 3 and 4 minutes to quit out.
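That wait could be scripted instead of eyeballed. Here's a rough sketch, assuming pkgserv always shows up in ps with "-d <zone root>/var/sadm/install" the way it does in the ps line above. The function names, the 10-second poll and the 600-second default timeout are my own inventions, not anything Solaris ships.

```shell
#!/bin/sh
# Sketch: wait until no pkgserv process is serving a zone's package database.
# Assumption: pkgserv appears in `ps -ef` as "pkgserv -d <zoneroot>/var/sadm/install".
# pkgserv_listed and wait_for_pkgserv are hypothetical helper names.

# Reads `ps -ef` style output on stdin. Succeeds if a pkgserv line for zone
# root $1 is present. The path is used as a grep pattern, so a "." in it
# matches any character. Lines containing "grep" are dropped so the filter
# does not match its own command line.
pkgserv_listed() {
    grep "pkgserv -d $1/var/sadm/install" | grep -v grep >/dev/null
}

# Polls ps until pkgserv for zone root $1 is gone (returns 0), or until
# roughly $2 seconds (default 600) have passed (returns 1).
wait_for_pkgserv() {
    zoneroot=$1
    timeout=${2:-600}
    waited=0
    while ps -ef | pkgserv_listed "$zoneroot"; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 10
        # expr rather than $(( )) so the old Bourne /bin/sh can run it
        waited=`expr $waited + 10`
    done
    return 0
}
```

Something like wait_for_pkgserv /fs/zones-1/ZONE/root would go between the install step and the first boot. Note that this only covers the pkgserv process itself; it does nothing about the EBUSY described below.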

Now, pkgserv seems to be part of an effort to speed up building and patching zones, which can take a long time. The particular setup we're seeing appears to have shown up with the 2010-01-08 Recommended Patch Cluster (though it might have shown up earlier -- we stepped from the 2009-05-08 cluster to the 2010-01-08 cluster, so if it showed up earlier we wouldn't have seen it).

Once pkgserv finally quit, if I immediately tried zoneadm -z ZONE boot or zoneadm -z ZONE ready, it would give me the same "call to zoneadmd failed". truss showed me that a call to zone_create() was failing with EBUSY, and that was propagating up the stack. The thing that's bizarre is that it never seemed to clear. (If I left it alone, it would eventually clear [as I saw empirically] but I never actually managed to pin down how long the error would take to clear — the loop time was way too long.) I think that running zoneadm -z ZONE ready actually prolongs the error.

I finally gave up and tried

umount -f /fs/zones-1/ZONE/root/var/sadm/install/.door

(plain umount didn't work), and that magically cleared the problem. Both umount and umount -f threw errors, too:

umount: warning: /fs/zones-1/ZONE/root/var/sadm/install/.door not in mnttab
umount: /fs/zones-1/ZONE/root/var/sadm/install/.door not mounted

(The door file never showed up in /etc/mnttab or in mount output. I could never find a clean way to find the mount.)
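For the zone creation script, the workaround boils down to "force-unmount the door, ignore the complaints, then boot". A minimal sketch, assuming zone roots live under /fs/zones-1/<zone>/root as they do here. ZONES_BASE, DRY_RUN and the function names are hypothetical, and I'm assuming it is safe to ignore whatever exit status umount -f returns, since it grumbled for me even when it worked.

```shell
#!/bin/sh
# Sketch of the workaround: force-unmount the stale pkgserv door, then boot.
# Assumptions: zone roots are $ZONES_BASE/<zone>/root; ZONES_BASE, DRY_RUN,
# door_path, run and boot_zone are hypothetical names made up for this sketch.

ZONES_BASE=${ZONES_BASE:-/fs/zones-1}

# Print the path of the pkgserv door file for zone $1.
door_path() {
    echo "$ZONES_BASE/$1/root/var/sadm/install/.door"
}

# Run a command, or only print it when DRY_RUN is set to something non-empty.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Force-unmount the door for zone $1, ignoring the "not in mnttab" noise
# and the exit status, then try to boot the zone.
boot_zone() {
    door=`door_path "$1"`
    run umount -f "$door" || true
    run zoneadm -z "$1" boot
}
```

Running DRY_RUN=1 boot_zone ZONE first just prints the two commands. I only ever tried the umount -f after pkgserv had exited, so that is the only ordering I can vouch for.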

I think it's clear that the pkgserv setup is a bit buggy and needs to be fixed.

How Gay is the Super Bowl?

Mark Dery asks: How Gay is the Super Bowl?

Wednesday, February 10, 2010

What the Oracle-Sun Merger says about U.S. Politics

Interesting article from a Sun employee laid off in Germany, on how the Sun-Oracle merger and the resulting layoffs ("reductions in force") illuminate the political landscape in the US.

Tuesday, February 09, 2010

Context Sensitivity

If you were going to install a new version of, say, apache, and saw

lrwxrwxrwx   1 root root   13 Jan  9  2009 apache -> apache-1.3.29/
drwxr-sr-x   9 root root  512 May 10  2001 apache-1.3.26/
drwxr-sr-x   9 root root  512 Jan 27  2004 apache-1.3.29/

in /usr/local, how would you install your new version of apache? I would hope that you wouldn't just install it into an "apache" directory like this:

drwxr-sr-x  10 root root 512 Feb  9 09:38 apache/
drwxr-sr-x   9 root root 512 May 10  2001 apache-1.3.26/
drwxr-sr-x   9 root root 512 Jan 27  2004 apache-1.3.29/

I swear, this needs to be an interview question with a hard FAIL mode (as in "Sorry, we're done. Goodbye."). If you can't figure out that you should preserve the existing pattern, you shouldn't be a sysadmin.
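For the record, the pattern-preserving answer is to install into a new versioned directory and then repoint the symlink. A minimal sketch, assuming the new version has already been installed into apache-<version> next to the old ones. switch_version is a made-up name, and the version string is whatever you are installing.

```shell
#!/bin/sh
# Sketch: point the "apache" symlink at a newly installed versioned directory.
# Assumption: the new version already lives in $prefix/apache-$version.
# switch_version is a hypothetical helper name.

switch_version() {
    prefix=$1
    version=$2
    target="apache-$version"

    # The versioned directory has to exist before anything is repointed.
    if [ ! -d "$prefix/$target" ]; then
        echo "$prefix/$target is missing" >&2
        return 1
    fi

    # Refuse to touch "apache" if someone turned it into a real directory.
    if [ -d "$prefix/apache" ] && [ ! -h "$prefix/apache" ]; then
        echo "$prefix/apache is a real directory, not a symlink" >&2
        return 1
    fi

    # Remove the old link and create a relative one, matching the existing
    # "apache -> apache-1.3.29/" pattern.
    rm -f "$prefix/apache"
    (cd "$prefix" && ln -s "$target" apache)
}
```

Rolling back is the same call with the old version, e.g. switch_version /usr/local 1.3.29. There is a brief moment between the rm and the ln where the link does not exist, which I'd accept for a sketch like this.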

Monday, February 08, 2010

Frustration

Dear Firefox,

Please don't suddenly and randomly flip the value of the "Always use the cursor keys to navigate within pages" option. Having the down-arrow key change from "scroll the window down a few pixels" to "jump to the next link" is brain-shearingly annoying.

 

 

Dear Web Application Programmers,

If your web page presents a very complex form to fill out, which can take up to 20 minutes to do correctly, PLEASE program it so that if the user hits Backspace when not actually clicked into a form entry field, it doesn't just happily do the equivalent of hitting the Back button. I.e. ask me if I really want to navigate away from the form I just spent 20 minutes filling in and let me click "no" so I don't scream "NO!" and want to poke out my own eye.

Sincerely,
A Nearly One-Eyed Ranting Nerd

 

P.S. Feh.