dslab:status

Differences

This shows you the differences between two versions of the page.

dslab:status [2010/09/20 16:53] wedge
dslab:status [2011/08/09 17:26] (current) – [TODO] wag2
Line 1: Line 1:
 +DSLAB STATUS updates
 +======TODO======
  
 +  * <del>get a dokuwiki page set up for DSLAB updates</del>
 +  * correct syntax in [[dslab/cluster_setup_guide|DSLAB Cluster Setup Guide]] and [[dslab/old_wiki_page|Old DSLAB main wiki page]]
 +  * update information in [[dslab/cluster_setup_guide|DSLAB Cluster Setup Guide]] and [[dslab/old_wiki_page|Old DSLAB main wiki page]]
 +  * <del>create wiki page for video wall MPI configuration</del>
 +  * take over the world, 0 or more times
 +  * is there a way to do signatures in dokuwiki? I think there is, I forget what it is at the moment...
 +  * migrate the [[http://ds.geneseo.edu/mediawiki|DSLab wiki's]] content to this wiki
 +  * enact host sharing across BITS network
 +======URLs======
 +
 +Some links of interest:
 +
 +  * [[http://www.youtube.com/watch?v=ggB33d0BLcY&feature=player_embedded#|laddergoat]]
 +  * [[http://www.ds.geneseo.edu|DSLAB web site]]
 +  * [[http://ceph.newdream.net/|CEPH fs web site]]
 +======Other Days======
 +**Other Days** is here as a sort of staging area for new posts.
 +
 +Simply click the **edit** button off to the right of this section, and at the bottom of the text entry window (i.e., below this text), put in:
 +<code>
 +======Month Day(th), Year======
 +</code>
 +
 +and begin an entry for that day. If you'd like to sub-section it, great.
 +
 +There's the edit button, this one off to the right ->
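For anyone who would rather generate the heading than type it, here is a minimal sketch that builds a line in the ''======Month Day(th), Year======'' pattern shown above. The function names are hypothetical, and the ordinal suffix rule is an assumption drawn from the existing entries (1st, 17th, 26th).

```python
from datetime import date

# English month names are listed explicitly so the output does not depend on locale.
MONTHS = (
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
)


def ordinal_suffix(day: int) -> str:
    """Return the English ordinal suffix (st, nd, rd, th) for a day of the month."""
    if 11 <= day % 100 <= 13:  # 11th, 12th, 13th are irregular
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")


def day_heading(d: date) -> str:
    """Build a top-level DokuWiki heading for a status entry on date d."""
    return f"======{MONTHS[d.month - 1]} {d.day}{ordinal_suffix(d.day)}, {d.year}======"


if __name__ == "__main__":
    # Prints the heading for today's entry, ready to paste into the edit window.
    print(day_heading(date.today()))
```

Paste the printed line at the bottom of the **Other Days** section and start writing below it.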
 +
 +======June 1st, 2011======
 +
 +=====de-dusting=====
 +I brought the dataVAC up from the LAIR, we did some serious dusting of machines... And there was a lot of dust.
 +
 +=====tweedledee rebuilt=====
 +Tweedledee was rebuilt using squeeze, and set up to serve Xen VMs as before. auth1, irc, and www were moved over to it and are now running exclusively on tweedledee. Tweedledum should be getting similar treatment in the next couple of weeks.
 +
 +=====mirror distro list updated=====
 +I showed Herb how to care for and feed the repository... He configured it to grab the latest two releases of Ubuntu, and may also set it up to snatch Gentoo for experimentation.
 +
 +=====tmphost VM created=====
 +I showed Stephen how to create a Xen VM (configuring appropriate DHCP and DNS entries on juicebar), and to launch and log into his VM.
 +
 +I provided a list of links to wiki documentation that is useful for setting up VM servers and VMs. The links are in the bookmark bar of firefox on the athens0 workstation.
 +
 +=====CEPH=====
 +I (Herb) got CEPH up and running on tiger1 and tiger2. More details at [[dslab:ceph|June 1]].
 +Currently the CEPH data is mounted on tiger1 in /ceph.
 +
 +======May 26th, 2011======
 +=====juicebar DHCP/DNS cleanup=====
 +The juicebar network configuration files were compared against the computers actually up and running on the network. One computer defined in dhcp.conf, jesus.dslab.lan, doesn't actually exist, so it was commented out. If nothing breaks, it will be deleted in the near future.
 +The DNS configuration defined several computers that no longer exist: aquarius, telstar, shiznit, nsr1, and mbw6. They were also commented out pending deletion.
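The comparison described above boils down to a set difference between configured names and names seen on the network. The following is only a sketch of that idea, not the procedure actually used on juicebar: the function names are hypothetical, and the list of live hosts in the example is made up for illustration.

```python
from typing import Iterable, List


def short_name(host: str) -> str:
    """Strip any domain part so 'jesus.dslab.lan' and 'jesus' compare equal."""
    return host.split(".", 1)[0].lower()


def stale_entries(defined: Iterable[str], alive: Iterable[str]) -> List[str]:
    """Return configured host names that were not seen on the network, sorted."""
    alive_names = {short_name(h) for h in alive}
    defined_names = {short_name(h) for h in defined}
    return sorted(defined_names - alive_names)


if __name__ == "__main__":
    # Names taken from the entry above, plus two hosts assumed to be alive.
    defined = ["juicebar", "sparta04", "jesus.dslab.lan", "aquarius",
               "telstar", "shiznit", "nsr1", "mbw6"]
    alive = ["juicebar.dslab.lan", "sparta04"]  # hypothetical scan result
    for name in stale_entries(defined, alive):
        print(f"candidate for commenting out: {name}")
```

Anything the function reports is only a candidate. A machine that happens to be powered off would show up too, which is a good reason to comment entries out first and delete them later.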
 +
 +======May 25th, 2011======
 +=====sparta04 back=====
 +Upon investigating the accessibility issues with sparta04, it was discovered that the problem was not with sparta04 or its hardware, but with the port on the switch. We have relocated it to another port and connectivity seems to be okay (it was spewing link up/down messages to the console when we found it this morning).
 +
 +So, port 7 on the top DSLAB switch is believed bad.
 +
 +=====DSLAB wireless=====
 +We also resurrected the DSLAB wireless AP (SSID 'ufo', configured not to broadcast), and configured it to be a pass-through to juicebar; it does not run DHCP of its own.
 +
 +Information to establish this functionality was found on the following URLs:
 +
 +  * http://www.dd-wrt.com/wiki/index.php/Wireless_Access_Point
 +
 +We may also have had to bridge the WLAN port onto the existing VLAN (not necessarily documented in the above URL).
 +
 +=====DSLAB dynamic DHCP range moved=====
 +Some overlap in assigned IPs was discovered, so the dynamic DHCP range in the DSLAB was reassigned to 10.81.1.180-10.81.1.199 ... No static DHCP assignments should be made in this range.
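To avoid repeating the overlap, a proposed static address can be checked against the dynamic range before it goes into the DHCP configuration. This is a minimal sketch using Python's standard ''ipaddress'' module. The range endpoints come from the entry above; the function names and the example host-to-address map are hypothetical.

```python
import ipaddress
from typing import Dict

# Dynamic DHCP range for the DSLAB, as stated in the entry above.
DYNAMIC_START = ipaddress.IPv4Address("10.81.1.180")
DYNAMIC_END = ipaddress.IPv4Address("10.81.1.199")


def in_dynamic_range(addr: str) -> bool:
    """Return True if addr falls inside the dynamic DHCP range (inclusive)."""
    ip = ipaddress.IPv4Address(addr)
    return DYNAMIC_START <= ip <= DYNAMIC_END


def conflicting_statics(static_map: Dict[str, str]) -> Dict[str, str]:
    """Return the static host-to-address assignments that land in the dynamic range."""
    return {host: addr for host, addr in static_map.items() if in_dynamic_range(addr)}


if __name__ == "__main__":
    # Hypothetical static assignments, for illustration only.
    statics = {"examplehost1": "10.81.1.50", "examplehost2": "10.81.1.185"}
    for host, addr in conflicting_statics(statics).items():
        print(f"{host} ({addr}) collides with the dynamic range")
```

Both endpoints are treated as part of the dynamic range, so 10.81.1.180 and 10.81.1.199 are rejected as static addresses.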
 +======May 18th, 2011======
 +
 +=====sparta04/node04 MIA again=====
 +After Herb was able to restore sparta04 and all was good, I discovered today that it is once again MIA.
 +
 +Attempts to reach it over the network were unsuccessful; furthermore, no cluster activity had taken place (the Physics group had not been on, and there was no obvious evidence of a job being run)... so this makes me think we could be dealing with some other issue more local to sparta04, perhaps even hardware-oriented.
 +
 +=====node04 removed from cluster config=====
 +Until we resolve the sparta04 issues, I have removed node04 from the node pool utilized by the cluster. Cluster users can continue to use the cluster, just with 7 nodes.
 +======May 17th, 2011======
 +=====sparta04/node04 back=====
 +Herb spent some quality time with sparta04 and it began cooperating after a reboot.
 +
 +After remounting the NFS share from data, node04 was started and sanity restored to the universe.
 +======September 27th, 2010======
 +=====DYNA license updated on cluster=====
 +James McLean sent me an updated license file (LSTC_FILE) for use on the DSLAB cluster in his armor dynamics simulations.
 +
 +I have installed it (both as /export/data/global/etc/LSTC_FILE-2010 AND /export/data/global/etc/LSTC_FILE) on data.dslab.lan.
 +
 +To ensure that all changes take effect, I have rebooted the cluster.
 +
 +======September 20th, 2010======
 +=====CVE-2010-3081=====
 +There's a local root exploit targeting 64-bit Linux systems. A kernel update (for Lenny-based Debian systems) and a reboot will fix the problem.
 +
 +I have updated and rebooted both sparta00 and node00, so the first node of the cluster is set. The remaining cluster machines still need some attention.
 +
 +I put instructions on how to do this on the DSLAB wiki page.
 +
 +======September 15th, 2010======
 +
 +=====DSLAB wiki=====
 +Created the DSLAB wiki space, currently residing in the LAIR on the main Lab46 wiki. Anything under **/dslab** will be DSLAB specific.
 +
 +In time, when I get things set up the way I want, this whole section will move to the new wiki, and host just LAIR/DSLAB specific information. Until then, we at least get to use the technology (dokuwiki) that will be utilized.
 +
 +=====video wall MPI documentation=====
 +Created a tutorial for converting the video wall machines into an MPI cluster.
 +=====node132=====
 +In the process of writing the video wall MPI tutorial, I used node132 as the test machine. As such, it is fully updated and already accessible via LDAP.
 +
 +(mth)
 +
 +======September 14th, 2010======
 +The day before... which predates the DSLAB wiki. I include it so you can see the pattern of the format that is used for the status page here in the LAIR. Is it worth maintaining for the DSLAB? It would make it easier for me :)