

STATUS updates

TODO

  • the formular plugin is giving me errors, need to figure this out (email assignment form)
  • update grade not-z scripts to handle/be aware of winter terms
  • update system page for (new)www
  • redo DSLAB tweedledee/tweedledum with squeeze (tweedledee rebuilt, tweedledum on the way)
    • rebuild DSLAB www, irc, auth as squeeze VMs
  • load balance/replicate www/wiki content between LAIR and DSLAB
  • adapt LAIR irc and lab46 to use self-contained kernels (how to do this?)
  • update system page for db
  • migrate nfs1/nfs2 system page to current wiki
  • update nfs1/nfs2 to squeeze
  • flake* multiseat page


October 15th, 2011

wheezy VM creation on cobras

I was playing around and wanted to spin up a wheezy VM… the xen-tools scripts on cobras couldn't do it by default, so I adapted them.

Pertinent files include: /usr/share/debootstrap/scripts/ … make a symlink to sid called “wheezy” (same was done for squeeze)

Also: /usr/lib/xen-tools/ … make a symlink to debian.d called “wheezy”

There was an entry in /usr/lib/xen-tools/debian.d/20-setup-apt that referenced etch, lenny, and sid… I added squeeze and wheezy to it (it was a case statement).

At this point xen-create-image can create wheezy VMs.

It will set the VM up to use the same kernel and initrd as the host (the lenny kernel and initrd)… so the proper kernel and modules will still need to be installed.
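A minimal sketch of the two symlink steps above, wrapped in a function so it can be tried against a scratch directory before touching the real system. The function name add_wheezy_links and its optional ROOT argument are illustrative assumptions, not part of xen-tools or debootstrap. The hand edit to the case statement in 20-setup-apt is not shown.

```shell
#!/bin/sh
# add_wheezy_links [ROOT]
# Creates the two "wheezy" symlinks under ROOT (default: the real filesystem root).
# ROOT is a hypothetical convenience for dry runs in a scratch directory.
add_wheezy_links() {
    root="${1:-}"
    # debootstrap: reuse the sid script for wheezy
    ln -s sid "$root/usr/share/debootstrap/scripts/wheezy" || return 1
    # xen-tools: reuse the debian.d hook directory for wheezy
    ln -s debian.d "$root/usr/lib/xen-tools/wheezy" || return 1
}
```

Run as root with no argument, this applies to the live system. After that, something like xen-create-image --hostname=NAME --dist=wheezy should get as far as building the image (NAME is a placeholder).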

October 13th, 2011

nfs2 cron jobs

One nice thing about changing peer status is that it gives an opportunity to see what local changes have been made on the previous master that have not been set up to work seamlessly on the other peer when it becomes master.

My homedirbackup.sh script was one such example… nfs2 complained it couldn't run it last night, and I got no confirming e-mail from nfs1 that it had… sure enough, it was local to nfs2. This has been resolved: there is now a separate cron script symlinked from /export/lib, and cron has been restarted on both NFSes.

RIP Dennis Ritchie

I learned this morning that Dennis Ritchie passed away yesterday; two tech titans in the span of a week.

October 12th, 2011

nfs2 acting weird

I woke up this morning to discover nfs2 acting a little weird; lrrdnode seemed to have kicked it, and while I was able to log into the machine and access files just fine, commands like 'w', 'top', and 'ps' were all very slow to respond. No noticeable load on the machine either… no errors in dmesg. Just weird.

nfs1 hardware servicing

Since some maintenance was in order for nfs2, I needed to make nfs1 the master peer. Before I did this, I tried rebooting it to ensure the recent kernel updates were running… when nfs1 did not come back up, I went to the LAIR to investigate.

Turns out the PCI-E graphics card had gone bad, so I couldn't use the KVM console. After replacing it with a generic Nvidia PCI card, I realized this machine had its grub configuration wrong, trying to get root off of /dev/hdb1 instead of /dev/hda1… so I corrected it (eventually remembered to actually update it in /boot/grub/menu.lst).

But, it also appears we may have a resource conflict or a bad PCI-E x1 slot… as the Intel PRO NIC (the PCI-E x1 NICs we have servicing the backbone connection between the NFSes) would not come up. The system wouldn't even recognize the device as present.

So I changed slots on the graphics card (initially put it in the PCI-X, then moved it to the PCI slot)… no change.

Leaving the graphics card in the PCI slot, I moved the Intel NIC to another PCI-E slot (a x16)… and it came back up without a fight.

So while I'm still not entirely convinced it is the slot (but would not be surprised if it is), I make note of that here.

nfs1 now nfs master

As a result of debugging nfs2, I decided to kick the master status back to nfs1… this afforded me an opportunity to test my switchover logic, which had a few bugs in it (notably, the aliased interfaces would never have come up, making the switchover not appear to happen for the clients).

nfs2 maintenance

nfs2's graphics card was also bad… so I replaced it with an ATI Rage 128 (the forger of dreams/leverager of solutions).

I also removed the realtek card that was servicing the offbyone.lan segment and nfs2 is now using both its onboard NICs, just like NFS1.

After the reboot nfs2 appears to be much happier.

nfs1/nfs2 upgrade notes

Both nfses are still running IDE drives… I've got 2 SSDs awaiting installation.

Both are still running Lenny. I may try to upgrade nfs2 to squeeze at some point, see if it cooperates with DRBD, and on the next switchover, upgrade nfs1 similarly.

But for now… NFS is happy.

some VMs rebooted

Due to the lengthy switchover, I ended up having to reboot some VMs as they were freaking out… mail and www, and even irc. I was able to avoid a reboot on lab46.

October 11th, 2011

new DSLAB web server: www1

I finally got around to creating a new web server for the DSLAB, named www1, and have copied the DSLAB site and transitioned the DSLAB portion of the wiki over to it.

I've also rsync'ed some datastores from lab46 to it, and it to lab46, as a start to having data redundancy.

For part 1, I am working to get all hosted sites replicated to the other, with the master site having read-write access, and the slave site having only read access.
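The one-way replication in part 1 could be sketched with rsync roughly as follows. This is an assumption about the approach, not the actual job in use; the function name push_site and any paths passed to it are hypothetical.

```shell
#!/bin/sh
# push_site SRC DEST
# One-way mirror: make DEST an exact copy of SRC (the master side).
push_site() {
    # -a preserves permissions, times and symlinks; --delete removes files
    # from DEST that no longer exist on the master. The trailing slashes
    # copy the contents of SRC rather than the directory itself.
    rsync -a --delete "$1"/ "$2"/
}
```

DEST can be a remote target in the usual rsync form, for example push_site /path/to/site www1.dslab.lan:/path/to/site (placeholder paths). Because of --delete, the slave copy should be treated as read-only, which matches the master/slave split described above.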

Once this is in place, and I feel like going to part 2, I'll deploy something like csync2, so both sites can act as masters for read-write (and we can implement location-independent stuff like “www”… no matter where you are, “www” will get you to the web service).

I have modeled www1.dslab.lan like www.offbyone.lan, so it won't be as messy when we get around to further integrating the datasets.

updates to manage script

Tweaked some more on the manage script for c107a-fs… made the data reporting option even cooler (with the help of a backtitle), and discovered there's a logic error in manual user creation: cancel doesn't work.

While I haven't resolved that logic error yet (thinking I should refactor the code into functions, that'll make it a lot easier), it is quite a nice little script as it is, and should do the job for the task at hand.

more updates

Following on the update adventures from yesterday, I've applied updates to sokraits and auth2. At this point, the bulk of the core systems has seen updates applied.

LDAP fixing

As the updates broke the consumer code on the slave LDAP servers (auth3 so far; I've yet to update auth2)… I set about rebuilding the necessary packages from source on auth3.

Once auth3 is back up and running, I will apply the updates to auth2.

The patching process went without incident, and updated packages are now available. I have placed them in /root on auth2 and auth3.

October 10th, 2011

c107a-fs manage script

Continued working on the resource management script, integrating some enhancements I made over the weekend.

I've added manage to my subversion repository, and made just that manage/ directory available anonymously so that c107a-fs can check it out, perhaps as part of a daily cron job when it boots up.

sokraits/halfadder rebooted

While doing an update on lab46, apparently something unhappy took place on sokraits or halfadder, and both rebooted.

This seems to be unrelated to the previous problems (I want to say DRBD was involved this time around).

updates

With a new point release of squeeze out, I took some time to roll out some updates to various machines.

The following machines saw updates: lab46, irc, www, mail, log (none), grrasp, auth1, gnu, nfs2, nfs1, yourmambas, saiga, laddergoat, impala, gazelle, wildgoat, wildebai, halfadder, ahhcobras, antelope, auth3

broken ldap on auth3

The package update apparently replaced slapd and therefore broke it (as it was a custom-built version to begin with). I'll need to investigate.

ldap on auth1 still appears to be functional.

October 9th, 2011

luakit-20110722

I installed the latest “stable”? release of luakit.. some fun was had there, as I was experiencing input problems in GNOME while using the Ubuntu installed version (20101225).

Compiled with luajit… installed the following packages:

  • libluajit-5.1-2 libluajit-5.1-dev luajit
  • libsqlite3-dev lua5.1 help2man liblua5.1-filesystem0

Compiled as follows:

machine:~/src/luakit$ make USE_LUAJIT=1
...
machine:~/src/luakit$ sudo LUA_PKG_NAME=luajit make install

October 8th, 2011

whiptail menu pro-tip

When using menus with whiptail, do not specify the --scrolltext option… this causes the menu to act weird, preventing normal access at first.

With the option left out, the menu will still auto-scroll if the content exceeds the menu window size.
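For reference, a minimal sketch of a whiptail menu that follows this tip. The function name pick_action, the titles, and the menu entries are made up for illustration and are not taken from any real script.

```shell
#!/bin/sh
# pick_action: show a whiptail menu and print the chosen tag on stdout.
# All titles and entries below are hypothetical examples.
pick_action() {
    # whiptail draws the dialog on stdout and writes the selected tag to
    # stderr, so swap the two streams to capture the selection.
    # Arguments after the prompt are: height, width, menu-height,
    # followed by tag/description pairs.
    whiptail --title "manage" --backtitle "example backtitle" \
        --menu "Choose an action" 15 60 4 \
        "report"  "Show data usage" \
        "adduser" "Create a user manually" \
        "quit"    "Exit" \
        3>&1 1>&2 2>&3
}
```

Typical use is choice=$(pick_action), followed by a case statement on "$choice". whiptail exits non-zero when the user cancels, so checking the exit status right after the assignment is one way to handle cancel.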

compiz auto-maximize annoyance

With Ubuntu 11.04, when you move a window near the top of the screen, it will auto-maximize.

This is annoying. After some searching, I found out how to disable this.

First, install compizconfig-settings-manager from the package repository.

Then, go to System → Preferences → Compiz Config Settings Manager

Find “Window Management”

Locate the “Grid” plugin, and disable it. Problem solved.

October 4th, 2011

reinstalled LAIRstation1/2

After some usability problems were reported on LAIRstation 2, I set about getting it and LAIRstation 1 reinstalled… this time with Ubuntu 11.04 (from 10.10).

I finally documented the process, as there are some Ubuntu-specific configuration changes needed to get LDAP up and running, as well as configuring the login screen.

System documentation is located here

In time, I'll get to LAIRstations 3 and 4, and the documentation page still needs updating with the login screen and monitor specific configurations.

flake02 pod problems

It seems that the Atom box may not have been the problem at the flake02 pod… a flakey PS/2 mouse seemed to manifest itself as the big problem, and then through further observation it may appear that mixtures of PS/2 and USB are also a source of problems… flake02 has been entirely switched to PS/2 mice.. unfortunately flake03 has one head without an adapter, and I think it does manifest problems from time to time… good to know this may be a cause.

printf %c hack

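It turns out the command-line printf(1) command doesn't process %c as it does in C: given a number, it prints the first character of the argument string rather than the character with that code.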

A righteous hack was undertaken, to yield the following:

lab46:~$ num=66; echo -e "\x`printf \"%x\" $num`"
B
lab46:~$ 

Further analysis (performed by squirrel) revealed that printf, as part of the GNU coreutils, does the following with respect to %c in printf(1):

printf(format, arg) // arg being a string straight out of argv[]

If desired, a fix to restore expected operation (and preserve existing operation) can be had by the following:

printf(format, atoi(arg)?atoi(arg):*arg)
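Another way to get the same result from the shell, without patching coreutils, is to go through an octal escape in the format string. A small sketch follows; the helper name chr is my own choice.

```shell
#!/bin/sh
# chr N: print the character whose code is N (decimal), with no newline.
chr() {
    # The inner printf turns N into three octal digits (66 -> 102); the
    # outer printf then interprets the \102 escape in its format string.
    printf "\\$(printf '%03o' "$1")"
}
```

With this, chr 66 prints B, the same as the echo -e hack above.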

Power haxx

The kill-a-watt was plugged into the LAIRwest server rack to commence a week's worth of metrics collecting. Looks like we're so far averaging $1.31 a day (based on 11 cents a kWh).
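Working backwards from those two figures gives the implied energy use and average draw. A quick sketch of the arithmetic (the function names are illustrative):

```shell
#!/bin/sh
# daily_kwh COST RATE: kWh per day implied by a daily COST at RATE dollars/kWh.
daily_kwh() {
    awk -v cost="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", cost / rate }'
}

# avg_watts COST RATE: average draw in watts over a 24 hour day.
avg_watts() {
    awk -v cost="$1" -v rate="$2" 'BEGIN { printf "%.0f\n", cost / rate / 24 * 1000 }'
}
```

daily_kwh 1.31 0.11 gives 11.91 (kWh per day), and avg_watts 1.31 0.11 gives 496, so the rack is averaging roughly half a kilowatt.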

c107a-fs

Classes have begun utilizing c107a-fs… we appear to have transacted over 1GB in data… no complaints thus far.

c107a-fs system documentation page should now be up-to-date with all the post-deployment tweaks I put in.

October 1st, 2011

Opus processing

I ended up writing a script to assist me with processing the Opus pages, since the first checkpoint for evaluation has come to pass.

opuschk assists me with producing the results.opus files in the student data directories.

deref

To enable its functionality, I had to write a special 'deref' function in gn to help me dereference from the class lists which classes a particular student is enrolled in. That code is as follows:

##
## function:deref    - display courses student is enrolled in
##      $1 - (student name)
##
## Errors: on non-existent student, return empty string
##         on no argument, return empty string
##
function deref {
    if [ -n "$VERBOSE" ]; then
        echo "$*"
    fi

    if [ "$1" = "" ]; then
        echo -n
    else
        grep -l "\<$1\>" $DPATH/class.${SEMESTER}* | cut -d'.' -f3
    fi
}
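One thing to keep in mind: the cut -d'.' -f3 step counts dots across the whole path, so it relies on $DPATH itself containing none. Below is a sketch of a variant that takes the course name from the end of the filename instead. It is a hypothetical alternative, not the code in gn, and it assumes the class lists are named class.<semester>.<course> with one student name per line.

```shell
#!/bin/sh
# deref_safe NAME: print the courses NAME is enrolled in, one per line.
# Hypothetical variant of deref; uses DPATH and SEMESTER like the original.
deref_safe() {
    # No argument: print nothing, as the original does.
    [ -n "$1" ] || return 0
    for f in "$DPATH"/class."$SEMESTER"*; do
        [ -f "$f" ] || continue
        # -w matches NAME as a whole word, like the \<...\> anchors above.
        if grep -qw -- "$1" "$f"; then
            # Strip everything up to the last dot, leaving the course name.
            echo "${f##*.}"
        fi
    done
}
```

The trade-off is a loop and one grep per class file instead of a single pipeline, which should not matter for a handful of class lists.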

I've started writing a student interface called 'results', which will parse the student's own data directory to derive important information from the various results.* files… not quite done as of yet. First release will only support the results.opus file.

All in all a productive day.


haas/status/status_201110.txt · Last modified: 2011/11/01 01:12 by 127.0.0.1