haas:fall2014:unix:projects:dataproc

Differences

This shows you the differences between two versions of the page.

haas:fall2014:unix:projects:dataproc [2014/09/29 22:22] – [Task 5: Find and count the duplicates] wedge
haas:fall2014:unix:projects:dataproc [2014/09/29 22:28] (current) – [Task 0: Post/respond to a question] wedge
Line 18: Line 18:
  
 =====Task 0: Post/respond to a question===== =====Task 0: Post/respond to a question=====
-  * Because the class mailing list has been rather quiet of late, and we've got a break coming up, I would like each person to post at least 1 focused question regarding this project to the class mailing list. +  * To ensure adequate out-of-class communications, I'd like for you to make use of the class mailing list
-    * Please do not give away any answers to the actions requested by this project in doing so. +    * I would like each person to post at least 1 focused question regarding this project to the class mailing list. 
-    * Be sure to identify which "task" or aspect of the project you are asking about +      * This also helps to make sure everyone has subscribed to the list (as you should have the first week) 
-  * Respond to at least 1 question, not by giving an explicit answer, but by asking further questions, or giving a pointer to a resource that may contain additional information (i.e. see **cut(1)** manual page)+      * Please do not give away any answers to the actions requested by this project in doing so. 
 +      * Be sure to identify which "task" or aspect of the project you are asking about 
 +    * Respond to at least 1 question, not by giving an explicit answer, but by asking further questions, or giving a pointer to a resource that may contain additional information (e.g. see the **cut(1)** manual page)
     * To get credit, your response can**not** be to one of your own questions.     * To get credit, your response can**not** be to one of your own questions.
   * Put a URL to the mailing list post of your question asked in a file called: **task0.question**   * Put a URL to the mailing list post of your question asked in a file called: **task0.question**
Line 133: Line 135:
  
 =====Task 5: Find and count the duplicates===== =====Task 5: Find and count the duplicates=====
-  * Ignoring the index values in the left-most column, determine which numerical codes occur more than once by concocting a command-line incantation or script that appropriately filters and processes the output.+  * Ignoring the index values in the left-most column, determine which numerical codes occur more than once by concocting a command-line incantation that appropriately filters and processes the output.
   * Also display a count of the total number of lines in the output, along with the total number of lines with valid numeric values (ignore "blank" lines and lines with error codes). Finally, display the total count of lines that have duplicates.   * Also display a count of the total number of lines in the output, along with the total number of lines with valid numeric values (ignore "blank" lines and lines with error codes). Finally, display the total count of lines that have duplicates.
 +    * Omit all the lines that occurred only once (i.e. have no duplicates); it will make your data set immediately more reasonable.
   * Put your resulting command-line(s) in a file called **task5.sh**   * Put your resulting command-line(s) in a file called **task5.sh**
   * Put the output (result) of your command-line(s) in a file called **task5.out**   * Put the output (result) of your command-line(s) in a file called **task5.out**
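
The filtering and counting that Task 5 describes can be sketched roughly as follows. This is only an illustration under stated assumptions, not the intended solution: it assumes each line of the output looks like "<index> <code>" separated by whitespace, and the helper names **valid_codes** and **dup_codes** and the sample data are made up for the example. The real data layout may call for different filters.

```shell
#!/bin/sh
# Sketch only: assumes each input line looks like "<index> <code>",
# separated by whitespace. The real data layout may differ.

# valid_codes (hypothetical helper): drop the index column and keep
# only lines whose second field is purely numeric, so "blank" lines
# and lines with error codes are ignored.
valid_codes() {
    awk '$2 ~ /^[0-9]+$/ { print $2 }'
}

# dup_codes (hypothetical helper): print "count code" for each code
# that occurs more than once, omitting codes seen only once.
dup_codes() {
    valid_codes | sort -n | uniq -c | awk '$1 > 1 { print $1, $2 }'
}

# Made-up sample data, for illustration only.
sample='1 4021
2
3 ERR
4 4021
5 17
6 300
7 300
8 300'

# Total number of lines in the output.
printf '%s\n' "$sample" | wc -l | awk '{ print "total lines:", $1 }'

# Number of lines with valid numeric values.
printf '%s\n' "$sample" | valid_codes | wc -l | awk '{ print "valid numeric lines:", $1 }'

# Codes that occur more than once, with their counts.
printf '%s\n' "$sample" | dup_codes

# Total count of lines whose code has a duplicate.
printf '%s\n' "$sample" | dup_codes | awk '{ n += $1 } END { print "lines with duplicates:", n + 0 }'
```

The sort before **uniq -c** matters because uniq(1) only collapses adjacent identical lines. The final awk filter is what omits the codes that occurred only once.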
haas/fall2014/unix/projects/dataproc.1412029342.txt.gz · Last modified: 2014/09/29 22:22 by wedge