STATUS updates

TODO

  • the formular plugin is giving me errors, need to figure this out (email assignment form)
  • can I install writer2latex on wildebeest herd without needing gcj??
  • look into how to adequately set up subversion svn+ssh:// authentication
  • set up symlink and cron job for userhomedirbackup.sh on nfs2; update wiki page for nfs

December 31st, 2010

lair-backup

I updated the lair-backup package, as I realized I should probably do a backup of auth1… and with a new month rapidly approaching, instead of doing something manual and custom… I'd just roll it in with the regular monthly backups.

In the process, I discovered that the backup scheme was a bit outdated, and some machines hadn't backed up in months because of it… so I brought everything up to date and deployed lair-backup on everything. Actually, lair-backup wasn't really installed on anything; the machines had the scripts I manually put on them almost a year ago, which are for all intents and purposes identical to what lair-backup deploys.

So this way I have a bit more automation in place.

lair-mail

In the process of deploying lair-backup, I realized I should get around to also deploying a sane mail configuration scheme on all the various LAIR machines… specifically, install nullmailer and have all mail get routed to mail.offbyone.lan.

This was not universally done… some machines have exim installed, some are actually trying to handle their own mail… others have been manually set up with nullmailer and are doing the right thing.

So I'm just going to take a manual and working nullmailer setup, copy its files, and have the package install script substitute the appropriate values, and we'll live happily ever after (just don't install lair-mail on mail).
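A minimal sketch of what such an install step might look like, assuming nullmailer's usual control files (me, defaultdomain, remotes). The function name write_nullmailer_config is hypothetical, and the real lair-mail package script may well differ; the config directory is a parameter so it can be tried in a scratch directory first.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal nullmailer configuration.
#   $1 = config directory (normally /etc/nullmailer)
#   $2 = this machine's fully qualified host name
#   $3 = smarthost that should receive all mail
write_nullmailer_config() {
    confdir="$1"
    fqdn="$2"
    relay="$3"

    mkdir -p "$confdir" || return 1
    # "me" holds the fully qualified name of the local machine
    printf '%s\n' "$fqdn" > "$confdir/me"
    # "defaultdomain" is appended to host names that lack a dot
    printf '%s\n' "${fqdn#*.}" > "$confdir/defaultdomain"
    # "remotes" lists the smarthost and the protocol used to reach it
    printf '%s smtp\n' "$relay" > "$confdir/remotes"
}
```

On a real machine this would be called along the lines of: write_nullmailer_config /etc/nullmailer "$(hostname -f)" mail.offbyone.lan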

bigworld

As per a request from mgough, I'm attempting to install a CentOS-5 virtual machine (on yourmambas) so that he can proceed with his “Big World” game engine project.

I am deploying the VM, bigworld.offbyone.lan, on yourmambas… giving it ample resources, and seeing if I can pull off a centos 5 install.

December 30th, 2010

dokuwiki ldap converted

I had forgotten to switch over the wiki to the new LDAP, and did that today.

Configuration was changed to the following:

$conf['auth']['ldap']['server'] = 'ldap://auth1:389';
$conf['auth']['ldap']['usertree'] = 'ou=people,dc=lair,dc=bits';
$conf['auth']['ldap']['grouptree'] = 'ou=groups,dc=lair,dc=bits';
$conf['auth']['ldap']['userfilter'] = '(&(uid=%{user})(objectClass=posixAccount))';
$conf['auth']['ldap']['groupfilter'] = '(&(objectClass=posixGroup)(|(gidNumber=%{gid})(memberUid=%{user})))';
$conf['auth']['ldap']['version'] = '3';

Seemed to light up pretty quickly.
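The same user filter can be checked outside the wiki with ldapsearch. This is only a sketch: it assumes the OpenLDAP client tools are installed and that auth1 allows anonymous reads, and the function names are hypothetical.

```shell
#!/bin/sh
# Build the user filter from the wiki config, with %{user} filled in.
ldap_user_filter() {
    printf '(&(uid=%s)(objectClass=posixAccount))\n' "$1"
}

# Look the user up on auth1 with an anonymous simple bind.
# Assumption: OpenLDAP client tools present, anonymous reads permitted.
check_wiki_user() {
    ldapsearch -x -H ldap://auth1:389 \
        -b 'ou=people,dc=lair,dc=bits' \
        "$(ldap_user_filter "$1")" uid
}
```

If check_wiki_user returns an entry for a given login, the wiki's userfilter should match that account as well.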

auth slapd taken offline

After converting everything over to the new LDAP, I finally disabled the old LDAP server.

LDAP consumer (auth2) operational

My LDAP success continues. I was able to bring auth2 online as an LDAP consumer in the new LAIR LDAP setup… functioning replication, along with referrals and chaining… I was able to get client machines to authenticate against either, and writes to auth2 were appropriately forwarded to auth1 (pretty darn cool).

I even intentionally took down auth1 for a few minutes to see what would happen… clients, when needing LDAP, recognized the failure and kicked over to auth2 (the next LDAP server in their config file), and happily found the information they wanted.

I was able to log in, and the only failure (but expected) was the inability to change passwords, since auth2 is not equipped to do so. Pretty awesome.
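The failover behaviour can be approximated by hand with a small probe loop. This is a sketch of the idea only, not how the client libraries implement it; it assumes ldapsearch is installed, and the names first_live_server, ldap_probe, and PROBE are made up for illustration.

```shell
#!/bin/sh
# Name of the probe command; overridable so the loop can be dry-run.
PROBE="${PROBE:-ldap_probe}"

# Default probe: ask the server for its root DSE with an anonymous bind.
# Assumption: the OpenLDAP client tools are installed.
ldap_probe() {
    ldapsearch -x -H "$1" -b '' -s base -l 3 >/dev/null 2>&1
}

# Print the first server URI that answers the probe, trying them in order.
# Returns non-zero when none of them respond.
first_live_server() {
    for uri in "$@"; do
        if "$PROBE" "$uri"; then
            printf '%s\n' "$uri"
            return 0
        fi
    done
    return 1
}
```

With auth1 down, first_live_server ldap://auth1 ldap://auth2 would be expected to print the auth2 URI.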

Ultimately, I ran into a bug that I am amazed didn't stop me before… I need to rebuild the openldap stuff from source after applying a patch to fix a problem in the chaining functionality (as I discovered basic LDAP database accessibility was virtually non-existent, even though it was still happily running and servicing clients).

So I may end up having to rebuild auth2 and use the patched packages… we'll see. Things have been making sense, and I've been solving my way through various problems encountered.

I hope that, once this gets settled, I can successfully tackle doing referrals (if not replication+referrals) to a soon-to-be-rebuilt DSLAB LDAP server… so we can once again have universal authentication.

A long process, to be sure, but even if I only achieve my current level of success, we're a lot better off than before… as we now have 2 functionally redundant LDAP servers in the LAIR.

lair-ldap updated (1.2.2-0)

To address the functionality auth2 is now providing, along with important new knowledge learned, I have once again updated the lair-ldap package and rolled it out to the known universe.

Any lingering (ie powered off) machines that rely on LDAP should update lair-ldap to be restored to sanity.

December 28th, 2010

LDAP, LDAP, LDAP

I got auth1.offbyone.lan up and running. I was so productive and successful that I migrated all user data to it, and made it the primary LDAP server for the LAIR.

My hope is to get auth2 up and running as a replicated peer, so we'll at least have 2 LDAP resources to access.

lair-ldap has been updated to facilitate the transition (just install/upgrade it).

auth.offbyone.lan is still up and running, but only to give time to any remaining systems to switch their LDAP configurations over.

Further information can be found at auth.offbyone.lan

December 27th, 2010

DNS SRV (Service Discovery)

During my LDAP explorations, I discovered information on DNS service discovery, which apparently is real and has support for things like http and ldap. I deployed some lines for it on caprisun (/var/named/master/offbyone.lan):

; experimenting with DNS Discovery
_http._tcp.offbyone.lan.        IN      SRV     5       5       80      www.offbyone.lan.
_ldap._tcp.offbyone.lan.        IN      SRV     10      0       389     auth.offbyone.lan.

And, after loading the updated zone, I was able to test it (on caprisun):

caprisun:~# dig SRV _ldap._tcp.offbyone.lan

; <<>> DiG 9.4.2-P2 <<>> SRV _ldap._tcp.offbyone.lan
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 61702
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 2

;; QUESTION SECTION:
;_ldap._tcp.offbyone.lan.       IN      SRV

;; ANSWER SECTION:
_ldap._tcp.offbyone.lan. 259200 IN      SRV     10 0 389 auth.offbyone.lan.

;; AUTHORITY SECTION:
offbyone.lan.           259200  IN      NS      ns1.offbyone.lan.

;; ADDITIONAL SECTION:
auth.offbyone.lan.      259200  IN      A       10.80.2.8
ns1.offbyone.lan.       259200  IN      A       10.80.2.1

;; Query time: 4 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Dec 27 22:11:56 2010
;; MSG SIZE  rcvd: 128

caprisun:~# 

The lookup did not succeed from an outside system (lab46):

lab46:~$ host -t SRV _ldap_tcp.offbyone.lan
Host _ldap_tcp.offbyone.lan not found: 3(NXDOMAIN)
lab46:~$ 

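So I wonder if pf needs to be twiddled with in order to allow this? (Note that the host query above was typed as _ldap_tcp, without the dot between _ldap and _tcp, which on its own would produce NXDOMAIN… so it is worth retesting with _ldap._tcp.offbyone.lan before touching pf.)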
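As a sketch of how a script might consume these records: dig +short SRV prints one "priority weight port target" line per record, which can be turned into host:port pairs. The function name srv_to_hostport is hypothetical, and weight is ignored here for simplicity.

```shell
#!/bin/sh
# Read "priority weight port target" lines (as printed by `dig +short SRV`)
# on stdin and print host:port, lowest priority value first.
# The trailing dot on the target is removed; weight is not considered.
srv_to_hostport() {
    sort -n -k1,1 | while read -r prio weight port target; do
        [ -n "$target" ] || continue
        printf '%s:%s\n' "${target%.}" "$port"
    done
}
```

Typical use would be: dig +short SRV _ldap._tcp.offbyone.lan | srv_to_hostport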

auth1/auth2

I had an urge to play with LDAP again, perhaps in an attempt to get replication working, so we can have shared authentication with DSLAB resources.

My current test efforts are on auth1 and auth2 (.offbyone.lan), both VMs on sokraits and halfadder.

Documentation available at auth.offbyone.lan.

December 24th, 2010

ircpost plugin

A few days ago I discovered the ircpost dokuwiki plugin. I finally got around to installing, configuring, and deploying it.

I modified the output of the plugin by adding the current timestamp to /var/www/lib/plugins/ircpost/action.php on www.offbyone.lan as follows:

         } else {
           stream_set_timeout($fp,1);
           stream_set_blocking($fp,0);
           $out = $this->getConf('username')."\n";
           $out .= $this->getConf('password')."\n";
           $message = "";
           if (strlen($SUM)>0) {
              $message = ".say ".$this->getConf('channel')." Comment: ".$SUM."\n";
           }
           $out .= ".say ".$this->getConf('channel')." [".chr(2)."wiki".chr(2)."] ".$INFO['client']." changed ".DOKU_URL."".$ID." (".date('YmdHi', time()).") \n";
           $out .= $message;
           $out .= ".quit\n";
           fputs($fp, $out);

Basically, I took:

$out .= ".say ".$this->getConf('channel')." [".chr(2)."wiki".chr(2)."] ".$INFO['client']." changed ".DOKU_URL."".$ID." \n";

and changed it to:

$out .= ".say ".$this->getConf('channel')." [".chr(2)."wiki".chr(2)."] ".$INFO['client']." changed ".DOKU_URL."".$ID." (".date('YmdHi', time()).") \n";

Basically just called the PHP date() function and gave it the appropriate parameters for the type of datestring I desire.
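For reference, the same stamp can be previewed from a shell. Assuming the usual strftime mapping, PHP's 'YmdHi' (four-digit year, month, day, 24-hour hour, minutes) corresponds to +%Y%m%d%H%M; the function name wiki_stamp is just for illustration.

```shell
#!/bin/sh
# Print the current time as YYYYMMDDhhmm,
# the shell counterpart of PHP's date('YmdHi').
wiki_stamp() {
    date '+%Y%m%d%H%M'
}
```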

lrrdnode on sokraits/halfadder

I got lrrdnode clients reinstalled and reconfigured for LRRD. They're once again reporting.

nfs tftp client config

I adjusted /export/client/etc/rc.local for the old netbooting flakes, making the following change:

#for XSERVE in antelope gnu wildebai wildgoat; do
for XSERVE in gnu; do
    HOSTCHK="`ping -q -w 1 -c 4 ${XSERVE}.offbyone.lan | grep 'received' | cut -d',' -f2 | sed 's/^.*\([0-9][0-9]*\).*$/\1/g'`"
    if [ $HOSTCHK -gt 0 ]; then
        let HOSTCNT=$HOSTCNT+1
        XRESLOT[$HOSTCNT]="$XSERVE"
    fi
done

Basically, I removed all but gnu from the host searching order. This is to preserve backwards compatibility with the old flake machines, should we still have any remaining into the future (plus just a neat thing to have to connect to).
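One thing worth noting about the HOSTCHK line: because the leading .* in the sed expression is greedy, it keeps only the last digit of the "received" count, which happens to be fine for four pings. A sketch of a sturdier way to pull out that number follows, assuming the usual "N packets transmitted, M received" summary line; the function names are hypothetical and this is not what rc.local currently does.

```shell
#!/bin/sh
# Extract the number of replies from a ping summary line on stdin, e.g.
#   "4 packets transmitted, 4 received, 0% packet loss, time 3003ms"
# Also accepts the "4 packets received" wording. Prints 0 if no summary is found.
received_count() {
    n=$(sed -n 's/^.*, *\([0-9][0-9]*\) \(packets \)\{0,1\}received.*$/\1/p' | head -n 1)
    printf '%s\n' "${n:-0}"
}

# Succeeds when the host answered at least one ping
# (same ping options as the rc.local loop).
host_is_up() {
    [ "$(ping -q -w 1 -c 4 "$1" 2>/dev/null | received_count)" -gt 0 ]
}
```

In the loop, the test could then read: if host_is_up "${XSERVE}.offbyone.lan"; then …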

December 23rd, 2010

sokraits and halfadder

I love it when a plan comes together. I finished my upgrade of sokraits, which resulted in finishing halfadder's transition, so both are now functional DRBD+OCFS2 peers, both able to run Xen VMs.

I moved over all our VM data, so NFS is effectively relieved of the VM stress it has endured since last spring.

What's more, live migration works! This could make ongoing maintenance (such as removing halfadder's old data drive) much much smoother, as we hopefully won't have to shut down lab46 and irc, but transparently move them between machines.

December 22nd, 2010

sokraits joins the party

I took some time this afternoon to convert sokraits over to be a DRBD+OCFS2 peer.

I updated sokraits_halfadder.offbyone.lan accordingly to include sokraits-specific information as well as new information pertaining to DRBD and OCFS2.

At this point, DRBD and OCFS2 are both operational and functioning. I have copied over halfadder's /backup data to the OCFS2 volume (31GB), and will do more data transition tomorrow.

sokraits needs to be rebooted to ensure all its settings are correct, then it can resume VM hosting operations.

December 21st, 2010

eggdrop and dokuwiki

By accident, I stumbled upon something rather neat:

A dokuwiki plugin which interfaces to an eggdrop bot and reports wiki changes on irc. This could be rather useful in some metrics collecting endeavors.

telstar disk saga

So after my initial reports of my disk failing in telstar, I went and repartitioned the drive, and ended up slapping a copy of Ubuntu 10.10 on a 36GB partition near the end of the drive. There appear to be no bad sectors here… performance is snappy, and the system responds quite ably.

Under Ubuntu, I ran what they call their “Disk Utility” and the S.M.A.R.T. Self-Test did report some bad sectors being reallocated. Neither Apple's Disk Utility nor Tech Tool Deluxe reported anything of consequence. Quite annoying.

I ended up calling AppleCare today, and spent some quality time talking to an “advisor” and then being transferred to a “senior advisor”. There was concern, and I felt a desire to assist me, but I committed one bad deed: I used a “third party utility” in my diagnosis. This despite the fact that I used Apple's tools first, and that the ONLY recognition of an error was Apple's own /var/log/kernel.log reporting “disk0s2: I/O error”…

So I was sent on a mission of formatting the drive in Disk Utility and reinstalling Snow Leopard to then check and see if the errors were still present.

I ultimately did this, but ended up doing some other things too:

  • reformatted in Disk Utility, chose to Zero Disk Data (more than they asked for).
  • tried then restoring the system from my Time Machine backup… it got to 4% then failed.
  • I then proceeded with installing Snow Leopard from the Install DVD. 5 HOURS LATER it finished.

Although I have yet to see any disk I/O errors in the system logs, performance is still quite unacceptable… boot-ups take minutes (the Snow Leopard “Welcome” video really wasn't… the video basically didn't play, although the audio did). Regular system usage is far from acceptable… long spinning beach ball delays opening Safari or a Finder window… in some respects the problem is far more pronounced than previously (and there's far less data on the drive).

At the same time, my Ubuntu partition was untouched. I am back there now and system performance is most acceptable.

This tells me that the interface to the drive is okay, but that portions of the drive are defective (consistent with my assertion of bad sectors). Not to say they won't eventually spread, but for now, I at least have a functioning system.

I redownloaded Tech Tool Deluxe, this time on my MacBook Pro, and am going to fire up the iMac in firewire target disk mode to test its disk there, just to give that another chance at detecting a problem (but currently even the Ubuntu “Disk Utility” tool is not finding problems). Maybe it just takes time, but I find it quite odd that there are real performance problems related to I/O when using other portions of the disk, yet here at the end of the drive, things are right as rain.

Assuming this step convinces the AppleCare folks, we then see if I have to drag my box across the countryside in order to bring it in contact with an Apple Service Center (big drawback of owning an iMac: the sucker is so big they're not that willing, and I don't blame them in the slightest, to ship it as they would a notebook). But considering I've owned my iMac for just over 2 years and it has performed flawlessly until the disk problems began around a month ago, I really have no complaints… just the inevitable inconvenience that awaits (either way it is an inconvenience… I've been without a fully functional home machine since the weekend… I haven't bothered fully setting up things like e-mail under Ubuntu, given how briefly I expect this drive to remain in the system).

But, I must say once again… Ubuntu 10.10 is FAST. And enjoyable. I downloaded and installed Macbuntu… so it is looking quite Mac-like at the moment… not perfect, but still very usable. I haven't decided if I like Macbuntu or if I prefer the native Ubuntu desktop configuration more. Macbuntu is familiar but deceiving in small ways.

December 19th, 2010

telstar disk failing

The disk in my iMac is becoming increasingly error prone, as is witnessed by the following littering my /var/log/kernel.log:

Dec 19 12:12:48 telstar kernel[0]: disk0s2: I/O error.
Dec 19 12:29:20 telstar kernel[0]: disk0s2: I/O error.

fstab

One of the configuration tweaks I'd like to document is how to hide volumes from being automounted in OS X.

In /etc/fstab:

UUID=8518EF81-16E9-3921-A05A-4FBC54C8B5C2   none    hfs rw,noauto

Basically, find the UUID of the particular volume you'd like to hide, and specify it as indicated in fstab. It has worked great.
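A sketch of how the entry could be generated, assuming that diskutil info prints a "Volume UUID:" line for the volume (I believe it does, but treat that as an assumption); both function names are hypothetical.

```shell
#!/bin/sh
# Print an fstab entry that keeps the given volume from being automounted.
#   $1 = volume UUID, $2 = filesystem type (defaults to hfs)
noauto_fstab_line() {
    printf 'UUID=%s\tnone\t%s\trw,noauto\n' "$1" "${2:-hfs}"
}

# Assumption: `diskutil info <volume>` includes a "Volume UUID:" line.
volume_uuid() {
    diskutil info "$1" | awk -F': *' '/Volume UUID/ { print $2 }'
}
```

The two combine as noauto_fstab_line "$(volume_uuid /Volumes/SomeVolume)", where /Volumes/SomeVolume is a placeholder; the resulting line then goes into /etc/fstab.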

December 12th, 2010

asnsubmit

There was a flaw in one of the regexes in this file that I was using to detect the submission date of assignments… it relied on finding the line containing: (username)

In CS 0xC (in UNIX), the output of groups would have usernames in parentheses, which would throw off my calculations due to the greedy nature of regular expressions.

I hardened the regex to avoid this problem.

Ideally I need to re-engineer how I detect this (i.e. grab the third line of the file)… but I need to make sure nothing else will break as a result.
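To illustrate the kind of problem (the actual asnsubmit regex is not reproduced here, so these patterns are a hypothetical reconstruction): a greedy, unanchored match runs from the first "(" to the last ")" on a line, whereas anchoring the pattern to a whole line only accepts a line that is exactly (username). The third function shows the "just take line three" alternative.

```shell
#!/bin/sh
# Greedy and unanchored: captures from the first "(" to the LAST ")".
greedy_user() {
    sed -n 's/^[^(]*(\(.*\)).*$/\1/p'
}

# Hardened: only matches a line consisting solely of "(username)".
strict_user() {
    sed -n 's/^(\([a-z_][a-z0-9_-]*\))$/\1/p'
}

# Alternative: ignore patterns entirely and print the third line of stdin.
third_line() {
    sed -n '3p'
}
```

Fed a line such as "groups: (wedge) (lab46)", greedy_user returns the run-together text between the outermost parentheses, while strict_user skips that line and only reports a standalone (username) line.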

December 6th, 2010

multiseat

Our multiseat prototype is now online! Using Ubuntu 9.04, we have a 3-seat setup using the Plugable USB 2.0 integrated hub/usb graphics setups.

Very nice looking, and the little atom box seems to initially hold up to serving 3 users. I'll be ordering more Plugable units to do a wider roll-out.

There should be a student documentation page coming… I started a page as well to document the various configuration steps that took place:

dokuwiki refnotes

Upon request, I installed the refnotes dokuwiki plugin on both wikis.

December 4th, 2010

gnuplot

missed questions

I also spent some time writing a script to parse through the UNIX assignment submission files for empty responses, for use in calculating point deductions.

I took two approaches:

  1. updated submit.php to pre-emptively scan for this condition and leave a constant value in the resulting file that can be later grepped for.
  2. wrote a script called sift which takes care of all the existing data from this semester.

I'll eventually have to adapt sift to use the new submit.php logic, but for now I'm happy with it. It looks to do the job adequately… I just need to get it to plug values into the appropriate file (and probably also rig up command-line arguments to limit it to certain weeks).

December 3rd, 2010

semester script

I've had thoughts of enhancing my semester detection script to better work with the recognition of the summer and winter semesters.

The following will extract the line from the HTML file off CCC's website which details some important dates of the Fall and Spring semesters:

lab46:~$ wget -q -O - http://www.corning-cc.edu/future/acacalendar/ | grep 'cccred' | grep h2

dokuwiki status page

Last month I fixed some logical problems in the dokuwiki status script… thereby accidentally introducing some new problems, namely in the calculation of some data.

This has been fixed.

The problem was:

let nextmonth=$curmonth+1
if [ "$nextmonth" -gt 12 ]; then
    nextmonth=1
    pyear=$curyear+1
fi
#nextmonth="`printf '%.2d' $nextmonth`"
 
let prevmonth=$curmonth-1
if [ "$prevmonth" -lt 1 ]; then
    prevmonth=12
    pyear=$curyear-1
fi

It was fixed by putting let before both pyear calculations; without it, the shell just assigns the literal text (for example 2010+1) instead of doing the arithmetic:

let nextmonth=$curmonth+1
if [ "$nextmonth" -gt 12 ]; then
    nextmonth=1
    let pyear=$curyear+1
fi
#nextmonth="`printf '%.2d' $nextmonth`"
 
let prevmonth=$curmonth-1
if [ "$prevmonth" -lt 1 ]; then
    prevmonth=12
    let pyear=$curyear-1
fi


haas/status/status_201012.txt · Last modified: 2011/01/01 06:12 by 127.0.0.1