<html><center></html>STATUS updates<html></center></html>
Some links of interest:
I finished configuring the LAIR replacement for the CCC-STUDENT-SER resource for the Mac lab on campus (C107a): a Debian Squeeze box running netatalk that provides authenticated per-user shares. Pretty nifty.
System documentation page is here
A student encountered this message while compiling (specifically with their doubly linked list library implementation)… things seemingly worked yesterday, and with no apparent changes this happens.
Turns out the message is related to 32-bit/64-bit code… the student had unknowingly compiled code on the local LAIRstation, which produces 32-bit objects, whereas code compiled on Lab46 is 64-bit by default.
Simply recompiling everything, rebuilding the library, etc. fixes the problem.
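A quick way to spot this kind of mismatch before recompiling is to look at the ELF header of each object file: byte 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit. The following is only a minimal sketch, not something that was actually used here; the function names are made up for illustration. Note that a static library (.a) is an ar archive rather than an ELF file, so it is the individual .o files that would need checking.

```python
import sys

ELF_MAGIC = b"\x7fELF"                      # first four bytes of any ELF file
ELF_CLASSES = {1: "32-bit", 2: "64-bit"}    # values of the EI_CLASS byte


def elf_class(header: bytes) -> str:
    """Classify a file from its first five bytes (hypothetical helper)."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return "not ELF"
    return ELF_CLASSES.get(header[4], "unknown")


def file_class(path: str) -> str:
    """Read just enough of the file at 'path' to classify it."""
    with open(path, "rb") as f:
        return elf_class(f.read(5))


if __name__ == "__main__":
    # Usage: python elfclass.py *.o
    for path in sys.argv[1:]:
        print(f"{path}: {file_class(path)}")
```

If the object files in a project report a mix of 32-bit and 64-bit, that is the cue to clean and rebuild everything on one machine.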
squirrel was playing with jumanji, a lightweight compliant browser with vi-like keybindings… I deployed it to the wildebeest herd this morning.
I had to install the following packages on each VM:
For class-related purposes, it should do just fine (loads Lab46 content nicely). I made it the default when a user selects “web browser” from the right-click menu (options for course home pages still select Iceweasel… I may change this).
I decided to finally replace the ATOM box with a GX270 (so now all LAIRpods have matching hardware), as I've been suspecting potential performance issues with it for a while now.
It afforded me an opportunity to relearn how to set up multiseat on a Debian Squeeze box.
For the purposes of documentation, here's how to roll out a Debian Squeeze Multiseat box:
That should be it… plug all the USB stuff in, unplug the console (make sure it is configured to boot reporting no errors), and let it go.
I tried bringing up a head on the Intel onboard graphics (I temporarily unblacklisted the i915 module)… did a -query, and the login screen came up briefly then disappeared… input bled over from other heads… so not exactly working. Too bad, I was hoping to get a dual-display setup working once again.
I am going to reinstall mambas with squeeze, so it can handle some of the wildebeest VM load.
I backed up any crucial VMs (all the Plan9 stuff) to ahhcobras in preparation for wiping it.
Disk images, config files, and pertinent kernels were copied to cobras to ensure preservation of useful VMs.
I stopped down at the LAIR and reinstalled yourmambas and grrasp with squeeze, made sure ssh was installed, and deployed working Xen environments on both.
Once up and running, I copied saiga and laddergoat to grrasp, and gazelle and impala to yourmambas (disk images and config files).
So now we only have 2 wildebeest VMs running on each machine (sokraits, halfadder, yourmambas, grrasp).
We'll see if this makes halfadder any happier.
halfadder is currently running 5 VMs, and hopefully an overall lighter load (auth2, antelope, db, mail, wildgoat).
Hopefully we can go much longer than a week without any major crashes.
Around 12:45pm (possibly as early as 12:10pm)… halfadder once again threw a fit.
OCFS2 kernel module OOPSed as a result of a “BUG”… something to do with the journal.
I thought to run a fsck.ocfs2 on /dev/drbd0 when everything got back up.
We'll see if that helped the issue.
I'm also going to begin exploring offloading some of the wildebeest VMs to yourmambas and grrasp.
While debugging a beeping UPS, I accidentally cut power to koolaid and cobras.
Both machines are back up, with all pertinent VMs also back up and running.
All machines plugged into the UPS are now plugged directly into power strips.
The mail issues from the internet to Lab46 have been resolved… after the MX records were corrected in CCC DNS, it turned out that capri's pf rules weren't entirely correct either. The correct rule is:
## Pass SMTP traffic to mail
rdr pass on $ext_if proto tcp to port $mailport -> $mail_box
Around 11:30am, climate control came back on-line! All rejoice!
To better automate bringing up all displaylink heads on the LAIRpods, I wrote a displaycheck script which I installed on all the flakes. Did some nifty backtick action.
Replaced the NIC in koolaid, as it appeared to be flaky… this also seems to have cleaned up the LRRD script reporting for koolaid.
Apparently around 1:45pm, something bad happened on halfadder, causing it to freak out… destabilizing the intra-VMserver network connection and causing us to take everything down and reboot.
At first glance I thought it was the network interface, but that might not have been the case. Either way, this marks the second time in under two weeks that a problem of this calibre has taken place… so it bears further scrutiny.
Perhaps we're running too much on them. I may try backing off a bit, or bringing yourmambas online to handle some of the VM load.
We moved web to the lair.lan subnet, now hosted on cobras. Some things are definitely broken (the wiki, some lrrd stuff). We'll fix it at some point.
To diagnose some weird network connectivity issues on student.lab (10.80.3.0/24), we have opted to reboot koolaid, as the link light on its re0 NIC (which serves its internal network) was not lit.
It appears to be the machine with the longest ever uptime in the LAIR:
koolaid:~$ w
 4:04PM  up 540 days,  1:01, 1 user, load averages: 0.51, 0.46, 0.33