Oracle (formerly Sun) Solaris has Zones, a lightweight virtual server platform that we use a lot at work.
We're finally rebuilding our "chassis" servers -- the ones that host all of our virtual Solaris servers (mostly web and mail apps). Our normal zone creation script kept crapping out (at the point of first booting the new zone) with
zoneadm: zone 'ZONE': These file-systems are mounted on subdirectories of /fs/zones-1/ZONE/root:
zoneadm: zone 'ZONE': /fs/zones-1/ZONE/root/var/sadm/install/.door
zoneadm: zone 'ZONE': call to zoneadmd failed
(I've redacted the hostname to protect my company. :-))
The first boot of the zone happens — or rather, fails to happen — after zoneadm -z ZONE install and some file copies into the new zone directory tree.
Doing Internet searches for these strings doesn't really help much.
After many hours of digging, I discovered that I could run zoneadm -z ZONE boot and get the zone to boot — but only if I waited quite a while after doing the zoneadm -z ZONE install — at least ten minutes.
I was able to see that a pkgserv process was running:
root 23107 1 0 14:55:27 ? 0:02 pkgserv -d /fs/zones-1/ZONE/root/var/sadm/install -N pkgadd
— and that took between 3 and 4 minutes to quit out.
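Since the boot can't succeed while pkgserv is still holding the zone's package database, a creation script could poll for it rather than sleeping for a fixed time. This is only a rough sketch: the function names, the 600-second default timeout and the PGREP override (there so the loop can be exercised without a real pkgserv) are all made up for illustration. It also assumes pgrep -f can see the whole "pkgserv -d DIR" argument string, which may not hold if the string is truncated for very long paths.

```shell
#!/bin/sh
# Sketch: wait until no pkgserv process is serving a zone's package database.
# Function names, timeout and the PGREP override are illustrative assumptions.

# Return 0 if a pkgserv process started with "-d DIR" is running.
pkgserv_running() {
    # pgrep -f matches the pattern against the full argument string
    ${PGREP:-pgrep} -f "pkgserv -d $1" >/dev/null 2>&1
}

# wait_for_pkgserv DIR [TIMEOUT_SECONDS] [POLL_SECONDS]
# Returns 0 once pkgserv is gone, 1 if it is still running after the timeout.
wait_for_pkgserv() {
    dir=$1
    timeout=${2:-600}
    poll=${3:-10}
    waited=0
    while pkgserv_running "$dir"; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "pkgserv still running for $dir after ${waited}s" >&2
            return 1
        fi
        sleep "$poll"
        # expr keeps this usable in an old Bourne shell without $(( ))
        waited=`expr "$waited" + "$poll"`
    done
    return 0
}
```

A call would look something like wait_for_pkgserv /fs/zones-1/ZONE/root/var/sadm/install 600 10, placed between the install step and the first boot. As described further on, pkgserv exiting was not enough on its own to make the boot succeed, so this only covers the first half of the wait.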
Now, pkgserv seems to be part of an effort to speed up building and patching zones, which can take a long time. The particular setup we're seeing appears to have shown up with the 2010-01-08 Recommended Patch Cluster (though it might have shown up earlier -- we stepped from the 2009-05-08 cluster to the 2010-01-08 cluster, so if it showed up earlier we wouldn't have seen it).
Once pkgserv finally quit, if I immediately tried zoneadm -z ZONE boot or zoneadm -z ZONE ready, it would give me the same "call to zoneadmd failed". truss showed me that a call to zone_create() was failing with EBUSY, and that was propagating up the stack. The bizarre thing is that the error never seemed to clear in any reasonable amount of time. (If I left it alone, it would eventually clear, as I saw empirically, but I never managed to pin down how long that takes; the loop time was way too long.) I think that running zoneadm -z ZONE ready actually prolongs the error.
I finally gave up and tried
umount -f /fs/zones-1/ZONE/root/var/sadm/install/.door
(plain umount didn't work), and that magically cleared the problem. Both umount and umount -f threw errors, too:
umount: warning: /fs/zones-1/ZONE/root/var/sadm/install/.door not in mnttab
umount: /fs/zones-1/ZONE/root/var/sadm/install/.door not mounted
(The door file never showed up in /etc/mnttab or in mount output, and I never found a clean way to locate the mount.)
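Put together, the workaround amounts to force-unmounting the door and then booting. The sketch below is one way that could be scripted, not a tested fix: the function names, the ZONE_BASE variable (defaulting to the /fs/zones-1 path used above) and the RUN dry-run hook are all assumptions made for illustration. Only umount -f and zoneadm -z ZONE boot come from the steps described here.

```shell
#!/bin/sh
# Sketch of the workaround: force-unmount the stale pkgserv door, then boot.
# ZONE_BASE, RUN and the function names are illustrative assumptions.
# Set RUN=echo to print the commands instead of running them (dry run).

ZONE_BASE=${ZONE_BASE:-/fs/zones-1}

# Print the path of the pkgserv door file for a zone.
door_path() {
    echo "$ZONE_BASE/$1/root/var/sadm/install/.door"
}

# clear_door_and_boot ZONE
clear_door_and_boot() {
    zone=$1
    door=`door_path "$zone"`
    # umount -f complains that the door is "not in mnttab" / "not mounted",
    # yet it still cleared the problem, so its messages and exit status
    # are deliberately ignored here.
    $RUN umount -f "$door" 2>/dev/null || true
    $RUN zoneadm -z "$zone" boot
}
```

Running it once with RUN=echo first shows exactly which door path would be unmounted before anything is touched on a live chassis.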
I think it's clear that the pkgserv setup is a bit buggy and needs to be fixed.