Corning Community College
CSCS1730 UNIX/Linux Fundamentals
~~TOC~~
======Project: ARCHIVE HANDLING (arc0)======
=====Objective=====
To begin putting your skills to work accomplishing tasks and solving problems on the system.
=====Prerequisites=====
To successfully complete this project, you should have the following skills and experience:
* ability to read the manual pages and use the information therein
* ability to copy, move, and list files
* ability to navigate around the filesystem
=====Toolbox=====
It would be especially useful to review the manual pages or any documentation on the following resources:
* **cp**(**1**)
* **mv**(**1**)
* **ls**(**1**)
* **mkdir**(**1**)
* **tar**(**1**)
* **xz**(**1**)
* **gzip**(**1**)
* **zip**(**1**)
* **tac**(**1**)
* **rev**(**1**)
* **cat**(**1**)
=====Background=====
When we talk about archives, there are commonly two separate actions taking place. Sometimes they are intertwined; at other times they are discrete steps.
They are:
* archiving / extracting
* compression / decompression
Archives are merely a manifestation of a common computing concept: a container.
Containers encapsulate things; in this case, files. And the fact that UNIX tries to make everything a file makes this ability all the more useful.
Compression, on the other hand, is an action performed on a single file. Using various algorithms, we accomplish a sort of "more in less": we take the data present and cram it into a smaller box (file), with the aim of taking up less storage on the filesystem (which also makes copying easier).
There are many compression algorithms in existence, and they commonly fall into two categories:
* [[http://en.wikipedia.org/wiki/Lossless_data_compression|lossless]] - no data is lost as a part of the compression process
* [[http://en.wikipedia.org/wiki/Lossy_data_compression|lossy]] - data judged unnecessary is permanently discarded as part of the compression process
Wikipedia has categories listing various algorithms for both [[http://en.wikipedia.org/wiki/Category:Lossless_compression_algorithms|lossless]] and [[http://en.wikipedia.org/wiki/Category:Lossy_compression_algorithms|lossy]] compression.
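To see the lossless idea in action, here is a minimal sketch, assuming **gzip**(**1**) is installed. The file names and sample text are made up for illustration and have nothing to do with the project archive. It compresses one file at gzip's fastest (-1) and best (-9) levels, then decompresses and checks that every byte came back:

```shell
# Work in a scratch directory so nothing in the current directory is touched.
work=$(mktemp -d)
cd "$work" || exit 1

# Build a highly repetitive sample file (hypothetical data, easy to compress).
yes "the quick brown fox jumps over the lazy dog" | head -n 5000 > sample.txt
cp sample.txt original.txt

# Compress at the fastest (-1) and best (-9) levels; -c writes to STDOUT,
# so sample.txt itself is left alone.
gzip -1 -c sample.txt > fast.gz
gzip -9 -c sample.txt > best.gz

# Compare sizes in bytes: both should be far smaller than the original.
wc -c sample.txt fast.gz best.gz

# Round trip: decompress and confirm the bytes match the original exactly.
gzip -d -c best.gz > roundtrip.txt
cmp original.txt roundtrip.txt && echo "lossless: contents identical"
```

The levels between -1 and -9 trade speed for size in the same way. The scratch directory is left in place so you can poke around; remove it when you are done.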
Where confusion may set in is when a tool combines the actions of archival AND compression. But if you think about it, even in such cases, we always end up with one file, and that file is compressed (unless we have a concatenation of separately compressed files into a single file).
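As a rough illustration of "discrete" versus "intertwined", the sketch below builds the same compressed archive two ways. It assumes a **tar**(**1**) that supports the **-z** option (GNU tar and bsdtar both do); the three small files are hypothetical stand-ins:

```shell
# Scratch directory for the demonstration.
work=$(mktemp -d)
cd "$work" || exit 1

# Three small hypothetical files to pack.
printf 'alpha\n' > a.txt
printf 'beta\n'  > b.txt
printf 'gamma\n' > c.txt

# Discrete steps: archive first (many files -> one file) ...
tar -cf discrete.tar a.txt b.txt c.txt
# ... then compress that one file. gzip replaces discrete.tar
# with discrete.tar.gz.
gzip discrete.tar

# Intertwined: -z asks tar to pass the archive through gzip as it is written.
tar -czf combined.tar.gz a.txt b.txt c.txt

# Either way the result is a single compressed file holding the same members.
tar -tzf discrete.tar.gz
tar -tzf combined.tar.gz
```

Both listings should show the same three names; the only difference is whether you ran the two actions yourself or let one tool chain them for you.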
Archives are useful in that they let us pack items together. If something needs 100 files, making a copy of it, or copying/installing it onto another system, would be more complex if we had to deal with each of those files individually. Archives simplify the problem by giving us all those files contained within a single file (lessening opportunities for error). So, archives make our lives easier.
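The 100-file scenario can be sketched as follows. Everything here (the **project** directory, the file names, the **elsewhere** destination) is invented for the example; the point is only that one file travels in place of a hundred:

```shell
# Scratch directory for the demonstration.
work=$(mktemp -d)
cd "$work" || exit 1

# A hypothetical "project" made of 100 small files.
mkdir project
i=1
while [ "$i" -le 100 ]; do
    echo "contents of file $i" > "project/file$i.txt"
    i=$((i + 1))
done

# Pack the whole directory into one file. -C project tells tar to change
# into that directory first, so members are stored as ./file1.txt rather
# than project/file1.txt.
tar -cf project.tar -C project .

# One file now stands in for all 100; copy it wherever it is needed.
mkdir elsewhere
cp project.tar elsewhere/

# Unpack the copy and confirm nothing was lost along the way.
mkdir elsewhere/unpacked
tar -xf elsewhere/project.tar -C elsewhere/unpacked
diff -r project elsewhere/unpacked && echo "all files arrived intact"
```

Using **-C** on both the create and extract side is one way to control where paths start and where files land; the manual page describes others.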
=====Procedure=====
In the UNIX Public Directory you will find a **spring2016/unix/arc0/** subdirectory.
There you will find:
* archive.tar.xz
You'll want to copy this file into some project-specific working directory in your home directory (**~/src/unix/projects/arc0/**, perhaps?).
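A minimal sketch of that setup step follows. Because the real location of the UNIX Public Directory is not spelled out here, the sketch fakes both the public directory and a home directory with temporary stand-ins (the **public** and **home** variables are assumptions, as is the empty placeholder file); substitute the real locations when you do this yourself:

```shell
# Stand-in for the UNIX Public Directory (hypothetical; use the real one).
public=$(mktemp -d)
mkdir -p "$public/spring2016/unix/arc0"
: > "$public/spring2016/unix/arc0/archive.tar.xz"   # empty placeholder file

# Stand-in for your home directory, so this sketch touches nothing real.
home=$(mktemp -d)

# Create the whole working-directory path in one step;
# -p makes any missing parent directories.
mkdir -p "$home/src/unix/projects/arc0"

# Copy (not move) the archive so the original stays where it is.
cp "$public/spring2016/unix/arc0/archive.tar.xz" "$home/src/unix/projects/arc0/"

# Confirm the copy landed.
ls -l "$home/src/unix/projects/arc0"
```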
Essentially, I want you to do the following:
- Figure out the format of the archive, and read up on the available tools for manipulating it
- Extract the contents of the archive and study it (contents will extract to the current working directory, so you WILL want to be in a custom project directory)
- Analyze the filenames extracted from the archive. They are clues referring to single-digit numbers.
* Figure out each number and rename the file to the number it references. The resulting names should form a sequence.
- Study the files' contents.
* For 3 of the 4 files, one or more levels of scrambling have been applied.
* **tac**(**1**) and/or **rev**(**1**) may be of considerable help
* When you successfully figure out how to unscramble the files, redirect the unscrambled output to its own unique file (preserving order).
* To **redirect output** you'll want to use the shell's **STDOUT I/O redirection** (the **>** symbol)
* Once you have unscrambled/identified all 4 files, use **cat**(**1**) to combine all four parts into one master file called **result.txt**.
* Hint: the orientation should be from "back" to "front", with the "back" being on the left-most side, and the "front" being to the right, with expected up-down orientation (top is up, bottom is down).
- Create a new archive, called **myarchive1.tar**, containing only your unscrambled pieces (not the scrambled files, and NOT **result.txt**)
* Do NOT store any paths in the archive; just put the files at the base level
- Compress **myarchive1.tar** with gzip at its second-highest compression level (counting from the "best" end of the spectrum, not the "fastest") to create the appropriately named **myarchive1.tar.gz**
- Create (and compress) another archive, called **myarchive2.zip**, containing both the scrambled (but numerically named) files and your **result.txt** file.
- Submit both **myarchive1.tar.gz** and **myarchive2.zip** using the submit tool.
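If the scrambling and redirection steps feel abstract, the toy sketch below may help. It does not use the project files; it invents a two-line sample (**plain.txt**), scrambles it three ways with **tac**(**1**) and **rev**(**1**), then undoes each scramble, redirecting STDOUT into new files and joining them with **cat**(**1**). All file names here are made up:

```shell
# Scratch directory for the demonstration.
work=$(mktemp -d)
cd "$work" || exit 1

# A hypothetical two-line "piece" and three scrambled variants of it.
printf 'abc\ndef\n' > plain.txt
tac plain.txt       > flipped.txt    # line order reversed: def, abc
rev plain.txt       > mirrored.txt   # each line reversed:  cba, fed
tac plain.txt | rev > both.txt       # both at once:        fed, cba

# Undo each scramble by applying the same tool again, using > to send
# STDOUT into a new file instead of the terminal.
tac flipped.txt    > part1.txt
rev mirrored.txt   > part2.txt
rev both.txt | tac > part3.txt

# cat concatenates its arguments in the order given, so order matters.
cat part1.txt part2.txt part3.txt > joined.txt
cat joined.txt
```

Each tool is its own inverse, which is why running it a second time restores the original. Working out which tool (or combination) a given file needs is the part left to you.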
=====Orientation Hints=====
Should you need additional clarification on my orientation hint:
* the orientation should be from "back" to "front", with the "back" being on the left-most side, and the "front" being to the right, with expected up-down orientation (top is up, bottom is down).
This would be an image in accordance with the desired orientation (using the international LAIR orientation calibration image):
{{ :haas:spring2015:unix:projects:mudkip.jpg |}} (YES)
Which is not to be confused with this, which would NOT be conformant with this project's specified end-product orientation:
{{ :haas:spring2015:unix:projects:perry_the_platypus_mudkip_hybrid_by_jackspade2012-d5jeedr.png?200 |}} (NO)
=====Reflection=====
Be sure to provide any commentary on your opus regarding realizations had and discoveries made during your pursuit of this project.
* Why do you suppose tar works the way it does?
* What might be some benefits of separating archival and compression functionality?
=====Submission=====
To successfully complete this project, the following criteria must be met:
* Submit a copy of your archives to me using the **submit** tool.
To submit your archives to me using the **submit** tool, run the following command at your lab46 prompt:
$ submit unix arc0 myarchive1.tar.gz myarchive2.zip
Submitting unix project "arc0":
-> myarchive1.tar.gz(OK)
-> myarchive2.zip(OK)
SUCCESSFULLY SUBMITTED
You should get some sort of confirmation indicating successful submission if all went according to plan. If not, check for typos and/or location mismatches.