My standing policy on lab46 account persistence has been that accounts will remain so long as:
Fairly simple, but the increasing capabilities of Lab46 over the years, especially as it has grown in the LAIR, have introduced some complications:
The idea of a comprehensive stale user grooming script had snags back then, but even more now.
As I found myself debugging the newly established mail functionality, I figured it would be easier if I didn't waste my efforts on accounts that would likely never be used again. So I set about seeing what I could do about trimming down the number of accounts I'd have to deal with.
My first script focused on the following criteria:
That script is as follows:
#!/bin/bash
#
for user in `/bin/ls -1`; do
    if [ -e $user/real.bash_profile ] && [ `groups $user | grep lab46 | wc -l` -eq 1 ]; then
        loginchk="[ `lastlog | grep $user | cut -c 44-73 2>/dev/null` ]"
        htmlout="`ls -ld $user/public_html | sed -e \"s/drwx.*$user lab46 [1-9][0-9]* \(20[0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]\) $user\//\1 /\"`"
        printf "%10s ... " $user
        echo "$loginchk $htmlout @$user"
    fi
done
The problem, I discovered, is that some users (even some who are still very much active) STILL possessed their ~/real.bash_profile files. They must have interrupted the script when it prompted for a password change, so the process never finished and never got as far as deleting that file.
At any rate, this would produce output, and I tweaked it favorably by doing the following when I ran it:
lab46:/home$ sudo /path/to/myscript | grep '200[0-7]' | cut -d'@' -f2 > /tmp/userdelcandidates
This would output any user whose data (either the lastlog output OR the public_html timestamp) fell within that range of years. Conveniently, it worked out quite nicely; there is certainly room for error, but upon personally analyzing the list of users, I didn't spot any that were in fact recent, so I went ahead with the process of removing them.
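To make the OR behavior of that filter concrete, here is a minimal sketch that runs the same `grep | cut` stage over a few report lines. The user names and dates are made up for illustration, and the exact spacing of the real report may differ.

```shell
# Hypothetical sample of the report lines the first script prints
# (user names and dates are invented, not taken from lab46).
sample_report() {
    cat <<'EOF'
     alice ... [ Mon Mar  5 10:12:01 -0500 2007 ] 2006-09-14 11:02 @alice
       bob ... [ Tue Jan  5 09:30:44 -0500 2010 ] 2009-11-20 16:45 @bob
     carol ... [ ] 2005-02-01 08:15 @carol
      dave ... [ Tue Jan  5 09:31:10 -0500 2010 ] 2006-03-02 12:00 @dave
EOF
}

# Keep any line that mentions a year from 2000 to 2007 anywhere,
# then print only what follows the '@' marker (the user name).
stale_candidates() {
    grep '200[0-7]' | cut -d'@' -f2
}

sample_report | stale_candidates
```

Note that dave is selected even though his last login is from 2010, because his public_html timestamp alone matches. That is the room for error mentioned above, and the reason the list deserves a manual read before anything is removed.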
The following script was used to back them up (on both nfs and mail), preserving permissions, so that if an error was made, they could at least be restored relatively easily, without any loss of data.
cd /home # basically, cd to where the data is
for user in `cat /tmp/userdelcandidates`; do
    echo "$user ..."
    tar cpf /tmp/userbackup/$user-20100105.tar $user
    gzip -9 /tmp/userbackup/$user-20100105.tar
    rm -rf $user
done
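Since the whole point of the backup is to be able to undo a mistake, a more cautious variant would delete a directory only after its archive has been written and tested. The following is a sketch under that assumption, not the script that was actually run; `backup_and_remove` and its arguments are hypothetical names.

```shell
# backup_and_remove USER BACKUPDIR STAMP
# Hypothetical helper: archive ./USER, compress it, test the compressed file,
# and delete the directory only if every earlier step succeeded.
backup_and_remove() {
    user="$1"
    tarball="$2/$user-$3.tar"

    # Refuse empty names or anything that is not a directory here.
    [ -n "$user" ] && [ -d "$user" ] || return 1

    tar cpf "$tarball" "$user" &&
        gzip -9 "$tarball" &&
        gzip -t "$tarball.gz" &&
        rm -rf -- "$user"
}

# Example use, run from the directory holding the home directories:
#   for user in `cat /tmp/userdelcandidates`; do
#       backup_and_remove "$user" /tmp/userbackup 20100105 || echo "$user FAILED"
#   done
```

Chaining the steps with `&&` means a full disk or a failed `tar` leaves the home directory in place instead of removing it.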
In all, 59 users were scrubbed as a result of this check.
I realized through some variations of the first script that there are some users who have logged in, but their accounts have fallen into disuse. Also, their accounts were created before public_html was automatically created on account login (instead, users were expected to create it if they desired to use it… how times have changed).
For lab46 users who did once log in, there IS a file that holds a nice timestamp of their effective last login: ~/.bash_history.
Let's see who this might affect:
lab46:/home$ sudo /bin/ls -l */.bash_history | grep '200[0-7]' | cut -c14-24 > /root/userdelcandidates2
This would not be the final list. I'd have to cross-check it with some other factors (public_html, or heck, even every other file in the directory for timestamps).
But as it is, that action alone tagged 107 users. Scanning the list, I do not see any false positives, and I likely could just go ahead and back up/remove those listed (but I won't, for the sake of completeness).
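The one-liner above relies on the column layout of `ls -l`, both for the year match and for the `cut` that extracts the user name. As a sketch of the same check without that dependency, the modification year can be read per file. This assumes GNU `date`, whose `-r FILE` option prints a file's last modification time; `stale_history_users` is a hypothetical name.

```shell
# stale_history_users HOMEDIR
# Print each user under HOMEDIR whose .bash_history was last modified
# in 2000-2007. Users without a .bash_history are skipped.
stale_history_users() {
    for hist in "$1"/*/.bash_history; do
        [ -f "$hist" ] || continue
        year=$(date -r "$hist" +%Y)    # GNU date: mtime of the file
        case "$year" in
            200[0-7]) basename "$(dirname "$hist")" ;;
        esac
    done
}

# Example use:
#   stale_history_users /home > /root/userdelcandidates2
```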
Processing users whose .bash_history timestamp falls within the designated “stale” range, I came up with the following, which checks ALL files within that particular user's home directory for any sign of modern (read: 2009 or 2010) activity:
for user in `cat /root/userdelcandidates2`; do
    echo -n "$user:"
    /bin/ls -l $user/* $user/.[a-zA-Z]* | sed -e 's/^.......... .*lab46 *[1-9][0-9]* //g' | \
        egrep '(2009|2010)' | grep -v pine | grep -v '2009-04-07' | wc -l
done > /root/userdelcandidates3
Which then allows me to do this:
for user in `cat /tmp/userdelcandidates3`; do
    tar cpf /tmp/userbackup/$user-20100105.tar $user
    echo "$user ..."
    gzip -9 /tmp/userbackup/$user-20100105.tar
    rm -rf $user
done
106 users with that run (59+106 = 165 down, with 200 as my goal).
Although there were 2 false positives:
So it sounds like I need to incorporate a clean separation of file names and sizes from time stamps. This would eliminate those above-encountered false positives nicely.
As I considered the previous scripts, something occurred to me: be they a shell user or web-only user, the FILES hold the truth.
As I discovered with the .bash_history exploration above, I ended up scanning all files in the directory and then scoring them based upon how many times 2009 or 2010 appeared. That scan started from an original .bash_history reporting of 2000-2007, so there was the possibility that some users who last used the system in 2008 could have been unintentionally snagged… whoops. I didn't see anyone that stuck out though, and I DID back up their data, so not a big deal in the long run.
What if we were to use the following script, in general (a lot more processor intensive, but then it makes it seem like lots of important work is being done):
for user in `/bin/ls -1`; do
    echo -n "$user:"
    /bin/ls -l $user/* $user/.[a-zA-Z]* | grep -v 'pine' | \
        sed -e "s/^.*$user *[a-zA-Z0-9][a-zA-Z0-9]* *[0-9][0-9]* *\(20[0-9][0-9]-[0-9][0-9]-[0-9][0-9] *[0-9][0-9]:[0-9][0-9]\) *.*$/\1/" | \
        egrep '(2008|2009|2010)' | grep -v '2009-04-07' | wc -l
done | grep ':0$' > /tmp/userdelcandidates
Bam! Looks like that problem is solved. As it turns out, there was only one additional user that popped up, so the grand total ends up being 166… not quite the 200 I thought I'd hit… but still, that's 166 fewer accounts to deal with.
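For a future run, the separation of timestamps from file names and sizes could also be done without parsing `ls -l` at all. The sketch below is an alternative, not the script used here. It assumes GNU `find`, whose `-printf '%TY-%Tm-%Td'` prints only the modification date; it carries over the same `pine` and `2009-04-07` exclusions and counts only 2009 and 2010 as recent. The name `recent_file_count` is hypothetical.

```shell
# recent_file_count DIR
# Count regular files under DIR last modified in 2009 or 2010. Only the date
# is printed for each file, so a file NAME or SIZE containing "2009" can
# never match.
recent_file_count() {
    find "$1" -type f ! -name '*pine*' -printf '%TY-%Tm-%Td\n' |
        grep -v -x '2009-04-07' |
        grep -c -E '^(2009|2010)-'
}

# Example use, run from the directory holding the home directories:
#   for user in `/bin/ls -1`; do
#       echo "$user:`recent_file_count $user`"
#   done | grep ':0$' | cut -d: -f1
```

Unlike the `ls -l $user/*` approach, `find` descends into subdirectories, so this variant counts activity anywhere in the tree and is more conservative about calling an account stale.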