Purpose
The purpose here is to mine the spring 2012 class schedule, which is in HTML format, and extract specific classes.
Necessities
- Knowledge of Regular Expressions
- Knowledge of Shell Scripting
Process
- I save the relevant data to a file and then search that file with a shell script.
Things
- To get the data
* cat spring2012-20111103.html | grep "ddtitle" | sed 's/^<TH CLASS="ddtitle".*crn_in=.....">//g' | sed 's/<\/A.*$//g' | sed 's/^\(.*\) - \([0-9][0-9][0-9][0-9][0-9]\) - \(.*\) - \([0-9][0-9][0-9]\)$/\1: \3-\4:\2/g'
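As a rough illustration of what the pipeline above does to a single line, here is a minimal sketch. The sample `<TH>` line, the course title in it, and the function name `extract_titles` are made up for the example, and the real schedule file may use different attributes. The three sed expressions are the ones from the pipeline above, combined into one sed call.

```shell
#!/bin/bash
# Sketch only: the same three substitutions as the pipeline above, in one sed call.
# extract_titles is a hypothetical helper name, not part of the original work.
extract_titles() {
    grep 'ddtitle' |
    sed -e 's/^<TH CLASS="ddtitle".*crn_in=.....">//' \
        -e 's/<\/A.*$//' \
        -e 's/^\(.*\) - \([0-9][0-9][0-9][0-9][0-9]\) - \(.*\) - \([0-9][0-9][0-9]\)$/\1: \3-\4:\2/'
}

# Made-up sample line in the shape the pipeline expects (assumption).
sample='<TH CLASS="ddtitle" scope="colgroup" ><A HREF="/example?term_in=201220&amp;crn_in=12345">Intro to Widgets - 12345 - ABC 1000 - 001</A></TH>'

printf '%s\n' "$sample" | extract_titles
# prints: Intro to Widgets: ABC 1000-001:12345
```

The first expression strips everything up to the end of the opening link tag, the second strips the closing tags, and the third reorders "title - CRN - course - section" into "title: course-section:CRN".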
- Shell Script
#!/bin/bash
# Ask for a class, then print each matching line plus the 5 lines after it.
echo -n "please enter a class: "
read -r class
grep -A5 "$class" combooutput1
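A slightly more defensive variant of the script, as a sketch. The function name `lookup_class` and the optional file argument are assumptions for illustration; the default file name `combooutput1` comes from the script above. It quotes the search term so class names with spaces work, and passes `--` so a term starting with a dash is not read as a grep option. `-A5` is the same GNU/BSD grep option the script above uses.

```shell
#!/bin/bash
# Sketch only: lookup_class is a hypothetical helper, not the original script.
# Usage: lookup_class PATTERN [FILE]
lookup_class() {
    pattern=$1
    file=${2:-combooutput1}   # default file name taken from the script above
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi
    # -A5 prints each matching line plus the 5 lines that follow it
    grep -A5 -- "$pattern" "$file"
}
```

For example, `lookup_class "Widgets"` would search `combooutput1` in the current directory, and `lookup_class "Widgets" other_file` would search a different file.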
Attributes
- Files and directories
- Commands
- The UNIX shell
- Regular Expressions
- Filters
- Scripting
- The UNIX Development Environment
Final Thoughts
- This was relatively easy, since I only had to work with the necessary data.
user/vcordes1/portfolio/cla.txt · Last modified: 2011/12/15 12:33 by vcordes1