Differences

This shows you the differences between two versions of the page.

--- haas:spring2018:unix:projects:gtf0 [2018/03/12 17:34] – [Process] wedge
+++ haas:spring2018:unix:projects:gtf0 [2018/04/09 19:39] (current) – [plotting a single line] wedge
@@ Line 4: / Line 4: @@
 </WRAP>
-======Project: GAUGING TASK FILES (gtf0)======
+======Project: GRAPHING TREND FIGURES (gtf0)======
 =====Errata=====
@@ Line 11: / Line 11: @@
 =====Objective=====
-Working with familiar data (from having created it a few projects ago), we will now pursue avenues of compliance with TASK file specifications, performing general analysis, comparing across semesters, and other metrics.
+Recently, you spent some quality time with your raw class status data, writing a script to scrape, process, and output meaningful results.
+Here we take the next step and appeal to our more visual tendencies: you will write a script, and coordinate the various tools necessary, to graph your project results against the class high, average, median, and low scores for each project (effectively, a graph plotting 5 different trend lines).
 =====Background=====
-As is often the case, tasks we are given are not only meant to be accomplished, but also verified.
+Visualization has a number of uses, not only in computing but in general: our minds are visual engines. We have phrases like "a picture is worth a thousand words", and there is a considerable amount of truth to that. We can only process so much discrete data at any given moment, so when we need to process considerably more data than we can take in, we turn to visual representations of it.
-In the **upf0** project, we explored and created various **task#.cli** files which were to be viable solutions to problems using the **numbers** and **pipemath** tools within stated constraints.
+Visually representing the data frees us from tracking the exact, and potentially numerous, discrete numeric values, which would be impossible to keep track of (and to formulate proper analyses of). The general sense of the values is preserved without overwhelming us, and we can take in a much broader picture than we could if all we had were an endless stream of numbers to evaluate.
-We explored automating aspects of the process in the **upf0steps** file.
+This project has us taking that step: taking the data we now have experience gathering and plotting it against various class benchmarks, so we can better gauge our overall progress in the class.
-Now, we will specifically be looking at logic to verify that the solutions generated comply with the stated requirements of each problem (ie reference the TASK file and confirm the solution falls within acceptable bounds).
+=====Plotting with gnuplot=====
+For this project, we will be making use of the venerable **gnuplot** tool. Like many powerful tools we have encountered this semester, we seek only to scratch the surface, and start to familiarize ourselves with the powerful capabilities this resource offers us.
-What's more, comparing one solution to others, even comparing one class to another are important skills in data analysis, allowing us to better realize patterns, trends, and other behaviors of the data (and users providing it).
+The following usage examples should help you get a feel for how to use the tool (barely scratching the surface of what it can do):
-=====The data=====
+====plotting a single line====
-In the **gtf0/** sub-directory of the UNIX Public Directory are a number of sub-directories named after various semesters; for example, you may see things like:
-  * spring2017/
+===the data===
-  * fall2017/
+<code>
-  * spring2018/
+.0 430
+.5 120
+.0 431
+.5 600
+.6 610
+.9 620
+.0 432
+.0 500
+.0 510
+.5 900
+</code>
-Inside each of these directories is a set of additional directories, named numerically (ranging from 0 to some value, likely approaching 20).
+===the gnuplot file===
+<code>
+set title 'Line'
+set xlabel 'x'
+set ylabel 'y (1/100)'
-And inside THAT directory you will potentially find all or some of the following:
+set terminal png size 600,400
-  * TASK
+unset key
-  * task0.cli
+set tics out nomirror
-  * task1.cli
+set border 3 front linetype black linewidth 1.0 dashtype solid
-  * task2.cli
-  * task3.cli
-  * task4.cli
-  * task5.cli
-  * task6.cli
-  * task7.cli
-I say //some//, because it is possible some directories will lack the task#.cli files (denoting an incomplete or non-existent submission).
+set xrange [1:5]
+set xtics 1, .5, 5
+set mxtics 1
-You will be referencing the specific TASK file in each directory, to calibrate any logic used to evaluate the results.
+set style line 1 linecolor rgb '#0060ad' linetype 1 linewidth 3
-NOTE: You do **NOT** want to copy this data. Reference it. This would be an __excellent__ application of variables.
+plot 'line.data' using 1:($2/100) with lines linestyle 1 title 'data'
+</code>
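In the gnuplot file above, `using 1:($2/100)` plots column 1 against column 2 divided by 100; gnuplot evaluates the parenthesized expression once per row. As a rough way to preview the values that reach the plot, the same arithmetic can be sketched with awk. The helper name `preview_scaled` is made up for this illustration and is not part of the project.

```shell
#!/bin/bash
# preview_scaled: hypothetical helper that prints column 1 unchanged and
# column 2 divided by 100, mirroring gnuplot's  using 1:($2/100)
preview_scaled() {
    awk '{ print $1, $2 / 100 }'
}

# one sample row taken from the data above
printf '.5 120\n' | preview_scaled
```

For that sample row the output is `.5 1.2`, which is why the y axis label in the gnuplot file reads 'y (1/100)'.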
-=====Process=====
+===generating the graph===
-In the particular **TASK** file, there are a set of 8 tasks (ranging from 0 to 7) the provide specifications of command-lines that were to have been constructed to produce the desired outcome via particular means.
-There are fields such as:
+<cli>
+lab46:~/src/gtf0$ gnuplot line.gp > ~/public_html/gtf0/line.png
+lab46:~/src/gtf0$ chmod 0604 ~/public_html/gtf0/line.png
+lab46:~/src/gtf0$ # view line.png in web browser
+</cli>
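Because the gnuplot file sets a png terminal without naming an output file, the image data is written to STDOUT, which is why the command above redirects it. A minimal sketch of wrapping those two commands in a function follows; the name `publish_png` and the scratch-file approach are assumptions made for illustration, not a required part of the project.

```shell
#!/bin/bash
# publish_png: hypothetical helper; renders a gnuplot file to a PNG and
# makes the result world-readable, as the two commands above do by hand.
# $1 = gnuplot file, $2 = destination PNG
publish_png() {
    gpfile="$1"
    pngfile="$2"
    tmpfile="${pngfile}.tmp"

    # render to a scratch file first so a failed run leaves no broken image
    if gnuplot "$gpfile" > "$tmpfile"; then
        mv "$tmpfile" "$pngfile"
        chmod 0604 "$pngfile"
    else
        rm -f "$tmpfile"
        return 1
    fi
}
```

It could be called as `publish_png line.gp ~/public_html/gtf0/line.png`. Rendering to a scratch file first means a gnuplot error does not replace a working image with an empty one.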
-  * **result:** the final output desired from running the command-line contained in that task#.cli file.
+===the graph===
-  * **numbers:** specifications on which of the **numbers** tools can be used in the solving of the problem.
-  * **operations:** specifications on which of the **pipemath** tools can be used in the solving of the problem.
-  * **min_pipes:** indicates the __minimum__ number of pipes that MUST be present on the command-line to qualify as a valid solution.
-  * **max_pipes:** indicates the __maximum__ number of pipes that MUST be present on the command-line to qualify as a valid solution.
-All of these factors will need to be taken into account when determining the correctness and viability of a particular solution (and then to compare one solution's viability against that of others).
+{{  http://lab46.g7n.org/~wedge/line.png  |line graph}}
-As was the case in **upf0**, the potential constraints are as follows:
+====plotting lines====
-  * **ANY:** no restrictions, any in applicable category can be used
+===the data===
-  * **ONLY:** you are restricted to only those listed
+<code>
-  * **WITH_LIMITS:** usually providing specific restrictions within an **ANY** domain
+.0 430 110
-  * **EXCEPT:** you are explicitly not allowed to use the listed; usually restricting an existing **ANY** domain
+.5 120 125
+.0 431 130
-There may also be quantity limits on how many times you can use each number or operation. If so, such will be shown in parenthesis following the item in question.
+.5 600 150
+.6 610 160
-As an example, we could have the following (formatted is it would appear in your **TASK** file):
+.9 620 192
+.0 432 100
+.0 500 340
+.0 510 450
+.5 900 700
+</code>
+===the gnuplot file===
 <code>
-task: 0
+set title 'Lines'
-result: 4
+set xlabel 'x'
-numbers: ONLY(three(2), five, seven, nine)
+set ylabel 'y'
-operations: ANY
+set terminal png size 600,400
-min_pipes: 2
-max_pipes: ANY
-</code>
+set grid
+set key below center horizontal noreverse enhanced autotitle box dashtype solid
+set tics out nomirror
+set border 3 front linetype black linewidth 1.0 dashtype solid
-With these constraints, we can set about verifying the provided solutions, checking whether the results generated, tools used, and pipes present conform with those stated requirements.
+set xrange [0.9:5.7]
+set xtics 1, .5, 6
+set mxtics 1
-====Example====
+set style line 1 linewidth 4
-For example, let's say we have 2 solutions to the above-listed task 0.
+set style line 2 linewidth 1
+set style line 3 linewidth 2
+plot 'lines.dat' using 1:2 with lines linestyle 1 title 'line1', \
+     '' using 1:3 with lines linestyle 2 title 'line2', \
+     '' using 1:($2+$3) with lines linestyle 3 title 'sum'
+</code>
-They are as follows:
+===generating the graph===
 <cli>
-$ cat 0/task0.cli
+lab46:~/src/gtf0$ gnuplot lines.gp > ~/public_html/gtf0/lines.png
-seven | minus `three`
+lab46:~/src/gtf0$ chmod 0604 ~/public_html/gtf0/lines.png
-$ cat 1/task0.cli
+lab46:~/src/gtf0$ # view lines.png in web browser
-three | minus `seven` | negate
-$
 </cli>
-===Result===
+===the graph===
-First up, do they produce the desired result? We see from the task specification (in the **TASK** file), we need to get an output of 4.
-Running the cli file (if present) will produce output, we can capture that and compare.
+{{  http://lab46.g7n.org/~wedge/lines.png  |lines graph}}
-===Numbers===
+====plotting a histogram====
-We see that we are only allowed to use the three, five, seven, and nine numbers tools in the solution (and three at max only twice). We will need to make sure that only these valid tools are used (and also that no actual numbers have been placed instead of calling upon the tools).
-===Operations===
+===the data===
-Same situation with operations... the tools allowed may differ from problem to problem.
+<code>
+march 5 55 20 30 40
+april 6 35 40 30 55
+may   7 45 50 60 70
+</code>
-===minimum pipes===
+===the gnuplot file===
-We'll see here that, if both problems are subject to that same task 0 specification, the first one is in violation- for it only has a single pipe in its command-line.
+<code>
+set title 'Histogram'
+set xlabel 'x'
+set ylabel 'y'
-We will need to perform an inventory on the pipes to make sure this requirement has been met.
+set terminal png size 600,400
-===maximum pipes===
+set grid
-Similar case here as with minimum pipes.
+set tics out nomirror
+set border 3 front linetype black linewidth 1.0 dashtype solid
-===conclusion===
+set xrange [-1:3]
-Once we have performed these tests on a problem, we can tally up and compute the result, so we can compare it to others.
+set xtics 1
-=====upf0steps=====
-You will once again be creating a steps file that can automate your project.
-As in previous projects, **upf0steps** will contain the steps you took from the point of copying the numbers suite and downloading the pipemath suite up until the submit step (hint: just run the task#.cli scripts within the steps script).
+set yrange [0:80]
-  * To clarify: YES, I want to see steps creating a project directory, copying and downloading files in question, extracting, compiling, installing, and then of course running each individual task#.cli script.
-There are some additional constraints you need to keep in mind:
+set style line 1 linecolor rgb '#0060ad' linetype 1 linewidth 2
-  * your script should not produce ANY STDERR output
+set style histogram clustered gap 1 title offset character 0, 0, 0
-  * your script should ONLY produce STDOUT output in conformance with the below stated requirements. Any other output needs to be silenced.
+set style data histograms
-  * For each task, you'll want to display things as follows:
-    * "Task X result is: #"
-      * where X is the task number (0, 1, 2, etc.)
-      * where # is the calculated output matching the TASK file result requested (ie, you must run your task#.cli script to produce this output).
-        * note that the task#.cli output appears on the SAME line as the "Task X result is:" text, and there is a single space separating it from the colon.
-  * additionally, your upf0steps file will only create, alter files if run by you. If run by a user who is not you, skip the file manipulation and only output the results.
-  * you will be making use of a loop to drive the execution of your results (the "Task # result is: ...").
-For example, a sample output of your **upf0steps** script should appear like follows (but your # values will of course be different based on your individual **TASK** file):
+set boxwidth 1.0 absolute
+set style fill solid 5.0 border -1
+plot 'histogram.data' using 2:xtic(1) title 'cactus', \
+        '' using 3 title 'maple', \
+        '' using 4 title 'willow', \
+        '' using 5 title 'birch'
+</code>
+===generating the graph===
 <cli>
-lab46:~/src/unix/upf0$ ./upf0steps
+lab46:~/src/gtf0$ gnuplot histogram.gp > ~/public_html/gtf0/histogram.png
-Task 0 result is: 13
+lab46:~/src/gtf0$ chmod 0604 ~/public_html/gtf0/histogram.png
-Task 1 result is: 27
+lab46:~/src/gtf0$ # view histogram.png in web browser
-Task 2 result is: 32
-Task 3 result is: 7
-Task 4 result is: -4
-Task 5 result is: 57
-Task 6 result is: 2
-Task 7 result is: 98
-lab46:~/src/unix/upf0$
 </cli>
-=====Submission=====
-By successfully performing this project, you should have a set of task#.cli files (one for each task). You will want to submit these, along with a **upf0steps** file.
-To submit this project to me using the **submit** tool, run the following command at your lab46 prompt:
+===the graph===
+{{  http://lab46.g7n.org/~wedge/histogram.png  |histogram graph}}
+=====Your Task=====
+Your task for this project is as follows:
+  * write a script **gtf0.sh** that when run:
+    * from the [[/haas/spring2018/unix/projects/status|class status page]], scrapes:
+      * the list of projects, the lowscore, average, median, and hiscore values of each of the evaluated projects
+        * places these values in columns (a projects column, a lowscore column, an average column, etc.) in a **gtf0.data** file
+    * from your **~/info/status/unix.projects** file in your home directory:
+      * obtains the scores and totals of each of the evaluated unix projects
+      * calculates the score (out of 100) of each individual project
+        * places these calculated scores as a final column in your **gtf0.data** file
+    * constructs a **gtf0.gp** gnuplot file that:
+      * creates a graph title of "USER SEMESTER/DESIG class status"
+        * where USER, SEMESTER, and DESIG are replaced with their pertinent (and lowercase represented) values
+          * for instance, DESIG in our case is: **unix**
+      * sets a y axis label of '**Value**'
+      * sets an x axis label of '**Project**'
+      * sets output destination to:
+        * the terminal
+        * in png format
+        * of a resolution of **1280x1024**
+      * sets a y axis range of **-10** to **110**
+      * sets the y axis tic increment to **10**
+      * sets a grid view
+      * establishes a graph key, that:
+        * shows and identifies all 5 data sets (low, avg, med, hi, your scores)
+        * places the key NOT within the main drawing area of the graph (below and in the center would be fine)
+      * plots these 5 data sets as individual lines on your graph, using the projects (in the order listed) as the x-axis tic values:
+        * each line should be a different, solid color, of a minimum thickness of 2.
+        * align each plotted category against the x axis-listed project (ie pbx1 avgscore, median, etc. line up with the pbx1 tic on the x-axis).
+        * be sure that each line is identified (titled) by its category (lowscore, avgscore, median, hiscore, yourscore), especially as identified in the graph key
+        * your line should be styled the same, only with a greater thickness (at least 4); this should help it stand out nicely against the rest of your graph.
+          * and plot your line last, so that it draws over any lines it intersects
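The "score (out of 100)" step in the list above is a simple scaling of points earned against points possible. A minimal sketch of that arithmetic, assuming the two values have already been extracted from the status file (its layout is not shown here, so the extraction is left out); the function name `percent_score` is hypothetical.

```shell
#!/bin/bash
# percent_score: hypothetical helper that scales an earned score to a
# value out of 100, printed as a whole number.
# $1 = points earned, $2 = points possible
percent_score() {
    awk -v earned="$1" -v total="$2" 'BEGIN {
        if (total <= 0) { print 0; exit }   # avoid dividing by zero
        printf "%.0f\n", (earned * 100) / total
    }'
}
```

For example, `percent_score 39 52` prints `75`. The arithmetic is handed to awk because shell integer arithmetic would truncate any fractional result.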
+====End result====
+What you are aiming for is a graph that strongly resembles this one:
+{{  https://lab46.g7n.org/~wedge/content/images/spring2018.unix.attr.png  |Project Metrics graph}}
+... only it adds an additional line: YOUR actual scores on the projects.
+So this graph will be a nice visual indicator of how you did in various aspects related to the class as a whole.
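One way a script can construct a gnuplot file is with a here-document, which lets shell values be expanded into the text as it is written. The sketch below covers only the title, axis labels, and terminal settings from the task list; the function names `lowercase` and `write_gp_header` are made up for this illustration, and the ranges, grid, key, and plot command are deliberately left out.

```shell
#!/bin/bash
# lowercase: print the argument with any uppercase letters lowered
lowercase() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# write_gp_header: hypothetical helper that prints the opening lines of a
# gnuplot file to STDOUT, filling the title in from its three arguments.
# $1 = user, $2 = semester, $3 = class designation
write_gp_header() {
    cat <<EOF
set title '$(lowercase "$1") $(lowercase "$2")/$(lowercase "$3") class status'
set xlabel 'Project'
set ylabel 'Value'
set terminal png size 1280,1024
EOF
}
```

A call such as `write_gp_header "$USER" spring2018 unix > gtf0.gp` would start the file; later settings could be appended with `>>`.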
+=====Spirit of project=====
+The spirit of the project embodies many aspects we've been focusing on throughout the semester:
+  * recognizing patterns to employ effective solutions in problem solving
+  * utilizing concepts and tools covered
+  * demonstrating comprehension of concepts, tools, and problems
+  * employing concepts in a knowledgeable and thoughtful manner
+  * following instructions
+  * implementing to specifications
+  * utilizing creativity
+  * being able to control solution via consistent, clear, and organized presentation
+Basically: I want your solution to be the result of an honest, genuine brainstorming process where you have (on your own) figured out a path to solving the problem, you have dabbled and experimented and figured things out, and you can command the concepts and tools with a fluency enabling you to pull off such a feat. Your solution should demonstrate the real learning that took place and experience gained.
+Cutting corners, avoiding work, skimping on functionality, cheating through getting others to do work for you or finding pre-packaged "answers" on the internet violates the spirit of the project, for they betray your ability to pull off the task on your own.
+=====Submit=====
+Please submit as follows:
 <cli>
-$ submit unix upf0 upf0steps task*.cli
+lab46:~/src/unix/gtf0$ submit unix gtf0 gtf0.sh gtf0.data gtf0.gp gtf0.png http://lab46.corning-cc.edu/~username/gtf0/gtf0.png
-Submitting unix project "upf0":
+Submitting unix project "gtf0":
-    -> upf0steps(OK)
+    -> gtf0.sh(OK)
-    -> task0.cli(OK)
+    -> gtf0.data(OK)
-    -> task1.cli(OK)
+    -> gtf0.gp(OK)
-    -> task2.cli(OK)
+    -> gtf0.png(OK)
-    -> task3.cli(OK)
+    -> http://lab46.corning-cc.edu/~username/gtf0/gtf0.png
-       ...
 SUCCESSFULLY SUBMITTED
+lab46:~/src/unix/gtf0$
 </cli>
-You should get some sort of confirmation indicating successful submission  if all went according to plan. If not, check for typos and or locational mismatches.
 I'll be looking for the following:
 <code>
-:upf0:final tally of results (78/78)
+:gtf0:final tally of results (78/78)
-*:upf0:upf0steps has valid list of non-interactive instructions [4/4]
+*:gtf0:gtf0.sh directly uses info dir status data when run [4/4]
-*:upf0:upf0steps only copies/alters files if USER matches [4/4]
+*:gtf0:gtf0.sh effectively utilizes shell features [4/4]
-*:upf0:upf0steps builds the various task#.cli files it runs [4/4]
+*:gtf0:gtf0.sh is a proper bash script with shabang and exit [4/4]
-*:upf0:upf0steps obtains the latest pipemath release from site [4/4]
+*:gtf0:gtf0.sh scrapes pertinent data from class status page [4/4]
-*:upf0:upf0steps only displays specified STDOUT output [4/4]
+*:gtf0:gtf0.sh formats data and generates gtf0.data file [4/4]
-*:upf0:upf0steps resiliently creates local project directory [4/4]
+*:gtf0:gtf0.sh generates viable gtf0.gp to make intended plot [4/4]
-*:upf0:upf0steps copies public dir data with absolute path [4/4]
+*:gtf0:gtf0.sh submits correct and requested items [4/4]
-*:upf0:upf0steps makes clear, effective use of wildcards [4/4]
+*:gtf0:gtf0.sh no line in any file exceeds 80 characters in length [4/4]
-*:upf0:upf0steps has descriptive why and how comments [4/4]
+*:gtf0:gtf0.sh all custom variable name lengths at least 4 symbols [4/4]
-*:upf0:upf0steps indentation used to promote scope and clarity [2/2]
+*:gtf0:gtf0.data contents arranged by column with headings [4/4]
-*:upf0:upf0steps defines and uses custom variables [4/4]
+*:gtf0:gtf0.gp sets proper graph title and axis labels [4/4]
-*:upf0:upf0steps uses command expansions to get information [4/4]
+*:gtf0:gtf0.gp sets proper image format and resolution [4/4]
-*:upf0:upf0steps uses a loop to drive numbers in final output [4/4]
+*:gtf0:gtf0.gp sets proper axis range, sets up a grid [4/4]
-*:upf0:upf0steps automates the task when run [4/4]
+*:gtf0:gtf0.gp displays a valid key outside of graph area [4/4]
-*:upf0:all files are organized, clear, and easy to read [4/4]
+*:gtf0:gtf0.gp grabs data from gtf0.data and sets x tics from it [4/4]
-*:upf0:task#.cli files output only correct, specified data [4/4]
+*:gtf0:gtf0.gp uses different line colors and thicknesses [4/4]
-*:upf0:task#.cli files use specified number tools by quantity [4/4]
+*:gtf0:gtf0.gp identifies each line by category [4/4]
-*:upf0:task#.cli files display no STDERR output [4/4]
+*:gtf0:gtf0.sh operates according to specifications [5/5]
-*:upf0:task#.cli files have solution within given constraints [4/4]
+*:gtf0:gtf0.sh logic is organized and easy to read [5/5]
-*:upf0:task#.cli files only have the solution command-line [4/4]
 </code>
+Additionally:
+  * Solutions not abiding by the spirit of the project will be subject to a 25% overall deduction
+  * Solutions not utilizing descriptive why and how comments will be subject to a 25% overall deduction
+    * comments should be consistent in appearance (adopt a style; one that promotes readability)
+  * Solutions not utilizing indentation to promote scope and clarity will be subject to a 25% overall deduction
+    * indentation should be no fewer than 3 spaces (or 3-space tabs); I prefer 4.
