
STATUS updates

TODO

  • the formular plugin is giving me errors, need to figure this out (email assignment form)
  • update grade not-z scripts to handle/be aware of winter terms
  • update system page for (new)www
  • redo DSLAB tweedledee/tweedledum with squeeze (rebuilt tweedledee; tweedledum on the way)
    • rebuild DSLAB www, irc, auth as squeeze VMs
  • load balance/replicate www/wiki content between LAIR and DSLAB
  • adapt LAIR irc and lab46 to use self-contained kernels (how to do this?)
  • update system page for db
  • migrate nfs1/nfs2 system page to current wiki
  • update nfs1/nfs2 to squeeze
  • flake* multiseat page

Other Days

March 24th, 2014

diskreport.sh overhaul

In what has shaped up to be a busy day updating my script infrastructure, diskreport.sh was the next script to receive some attention.

Originally written to give me daily updates on the DSLAB cluster NFS mount, I had also expanded it to give me cursory overviews of disk usage on several LAIR VMs.

However, the output was far from useful: it omitted several machines (such as sokraits, halfadder, and nfs1/nfs2), leaving me to tend manually to problems there, often only after they had become serious. The script now reports far more broadly and is condensed into one main loop, instead of having essentially N blocks of code with different variables hardcoded.

I am also calling my “status” script on data.dslab.lan directly, since it is more customized for the DSLAB environment.
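A minimal sketch of what the single-loop structure might look like. This is not the actual diskreport.sh: the host list, the 80% default threshold, and the function names summarize_df and report are all assumptions made for illustration, and it presumes key-based ssh access to each machine.

```shell
#!/bin/sh
# Hypothetical host list; the real script's list is not shown in this entry.
HOSTS="sokraits halfadder nfs1 nfs2"

# Print one summary line per filesystem at or above a usage threshold.
# Reads `df -P` output on stdin; $1 = host label, $2 = threshold percent.
summarize_df() {
    awk -v host="$1" -v limit="$2" '
        NR > 1 {
            pct = $5
            sub(/%/, "", pct)
            if (pct + 0 >= limit)
                printf "%-12s %-20s %3d%%\n", host, $6, pct
        }'
}

# One loop over every host instead of a hardcoded block per machine.
# $1 = optional threshold percent (assumed default: 80).
report() {
    for host in ${HOSTS}; do
        ssh -o BatchMode=yes "${host}" df -P 2>/dev/null |
            summarize_df "${host}" "${1:-80}"
    done
}
```

Calling `report 90` would then list only filesystems that are at least 90% full. Adding a machine means adding a name to HOSTS rather than copying another block of code.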

screen confusion

More students this semester seem not to have gotten the hang of screen. I still see multiple sessions running, and clearly some lack of understanding of process management. Sadly, the super-majority are in my UNIX class.

homedirbackup, moar

Did some further tweaks to homedirbackup. The storage size-based backup frequency code seems to be working, which should lead to more automated maintenance of the OCFS2/DRBD volume on sokraits/halfadder.

OCFS2 read only issue

The issue where a free space mismatch forces OCFS2 to switch to read-only reared its ugly head again (as I was clearing out remnant directories from removed user accounts).

I figured the best approach was a full unmount and reboot (multiple, actually), and I made sure to run a forced fsck on BOTH sokraits and halfadder (sokraits first, reboot both, halfadder second, reboot both again, mount everything, resume operations).

The line I used was:

# fsck.ocfs2 -v -y -f /dev/drbd0

I also attempted to reduce the chaos caused by rebooting lab46 and irc by doing an “xm save” on each. Both seemed to behave, and restored without issue (I restored irc on the same machine as lab46). I forgot how powerful that functionality is; I may have to utilize it more in the future.
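For reference, a small sketch of how the save/restore step could be wrapped for next time. The checkpoint directory, the .chk suffix, the DRY_RUN switch, and the function names are assumptions made for illustration; only the `xm save <domain> <file>` and `xm restore <file>` invocations come from the xm toolstack itself.

```shell
#!/bin/sh
# Assumed checkpoint directory; it needs room for each guest's RAM image.
SAVEDIR="${SAVEDIR:-/var/lib/xen/save}"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Suspend each named guest to a checkpoint file.
save_guests() {
    for guest in "$@"; do
        run xm save "${guest}" "${SAVEDIR}/${guest}.chk"
    done
}

# Resume each named guest from its checkpoint file.
restore_guests() {
    for guest in "$@"; do
        run xm restore "${SAVEDIR}/${guest}.chk"
    done
}
```

With DRY_RUN=1 set, `save_guests lab46 irc` only prints the commands it would run, which is a cheap way to check the paths before touching a live guest.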

March 23rd, 2014

homedirbackup

Implemented some refinements to minimize output in the daily e-mail, as well as to better manage the disk space consumed by user home directory backups.

Attention was focused on varying the number of backups kept per user, based on that user's overall disk space usage:

        ##############################################################
        ## Obtain some per-user stats and format variables
        ##
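        hist=4    # default; also applies to users below 128M
        userl=$(echo "${user}" | sed 's/^.*\///g')    # user name without leading path
        spaceused=$(du -hs "${user}" | cut -f1)       # human-readable size, e.g. 36K, 200M, 1.5G
        unit=$(echo "${spaceused}" | grep -o '.$')    # trailing unit letter: K, M, G
        wholespace=$(echo "${spaceused}" | grep -o '^[0-9][0-9]*')    # integer part of the size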
 
        printf "%11s" "[${userl}] "
 
        ##############################################################
        ## Set history for pruning
        ##
        if [ "${userl}" = "mail" ]; then
            hist=8
        elif [ "${unit}" = "K" ]; then
            hist=8
        elif [ "${unit}" = "M" ]; then
            if [ "${wholespace}" -ge 512 ]; then
                hist=2
            elif [ "${wholespace}" -ge 128 ]; then
                hist=3
            fi
        elif [ "${unit}" = "G" ]; then
            hist=1
        fi
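
To make the mapping from disk usage to history depth easier to check, here is the same branch logic wrapped in a function with a few worked inputs. The function name hist_for and the sample user names are hypothetical; the size strings stand in for what `du -hs` would report.

```shell
#!/bin/sh
# Hypothetical helper: map a user name and a `du -hs` size string to the
# number of backups to keep, mirroring the branches in the excerpt above.
hist_for() {
    userl="$1"
    spaceused="$2"
    unit=$(echo "${spaceused}" | grep -o '.$')
    wholespace=$(echo "${spaceused}" | grep -o '^[0-9][0-9]*')
    hist=4
    if [ "${userl}" = "mail" ]; then
        hist=8
    elif [ "${unit}" = "K" ]; then
        hist=8
    elif [ "${unit}" = "M" ]; then
        if [ "${wholespace}" -ge 512 ]; then
            hist=2
        elif [ "${wholespace}" -ge 128 ]; then
            hist=3
        fi
    elif [ "${unit}" = "G" ]; then
        hist=1
    fi
    echo "${hist}"
}

hist_for mail  2.1G   # prints 8 (mail always keeps 8)
hist_for alice 36K    # prints 8
hist_for bob   64M    # prints 4 (below 128M keeps the default)
hist_for carol 200M   # prints 3
hist_for dave  900M   # prints 2
hist_for erin  1.5G   # prints 1
```

The net effect is that the biggest home directories keep the fewest copies, which is where most of the space savings on the backup volume should come from.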
haas/status/status_201403.txt · Last modified: 2014/04/01 05:12 by 127.0.0.1