
Corning Community College

CSCS1730 UNIX/Linux Fundamentals


Project: ARCHIVE HANDLING (arc0)

Objective

To begin putting your skills to work accomplishing tasks and solving problems on the system.

Prerequisites

To successfully complete this project, you should first have the following skills and experience:

  • ability to read the manual pages and use the information therein
  • ability to copy, move, and list files
  • ability to navigate around the filesystem

Toolbox

It would be especially useful to review the manual pages or any documentation on the following resources:

  • cp(1)
  • mv(1)
  • ls(1)
  • mkdir(1)
  • tar(1)
  • xz(1)
  • gzip(1)
  • zip(1)
  • tac(1)
  • rev(1)
  • cat(1)

Background

When we talk about archives, there are commonly two separate actions taking place. Sometimes they are intertwined; other times they are discrete steps.

They are:

  • archiving / extracting
  • compression / decompression

Archives are merely a manifestation of a common computing concept: a container.

Containers encapsulate things; in this case, files. And the fact that UNIX tries to make everything a file makes this ability all the more useful.

Compression, on the other hand, is an action performed on a single file. Utilizing various algorithms, we accomplish a sort of “more in less”… we can take the data present and cram it into a smaller box (file)… where the aim is to take up less storage on the filesystem (which also makes copying easier).

There are many compression algorithms in existence. They commonly fall into two categories:

  • lossless - no data is lost as a part of the compression process
  • lossy - data judged less important is discarded as part of the compression process, so the original cannot be exactly recovered

Wikipedia has categories listing various lossless and lossy compression algorithms.
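To make "lossless" concrete, here is a minimal sketch of a round trip through gzip(1). The scratch directory and the file names (sample.txt, original.txt) are hypothetical, chosen only for illustration; the idea is simply that decompressing should give back exactly the bytes we started with.

```shell
# Hypothetical scratch directory and file names, for illustration only.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Build a repetitive sample file; repetition tends to compress well.
for i in $(seq 1 200); do echo "the same line, over and over"; done > sample.txt
cp sample.txt original.txt

# gzip replaces sample.txt with sample.txt.gz
gzip sample.txt
ls -l original.txt sample.txt.gz

# gunzip restores sample.txt from sample.txt.gz
gunzip sample.txt.gz

# cmp is silent and succeeds only when the two files are byte-for-byte equal.
cmp original.txt sample.txt && echo "identical: nothing was lost"
```

Comparing the sizes reported by ls(1) shows the "smaller box"; the final cmp(1) shows that nothing was thrown away to get there.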

Where confusion may set in is when a tool combines the actions of archival AND compression. But if you think about it, even in such cases, we always end up with one file, and that file is compressed (unless we have a concatenation of separately compressed files into a single file).

Archives are useful in that they let us pack items together. If something needs 100 files, making a copy of it, or copying/installing it onto another system, would be more complex if we had to deal with each of those files individually. Archives simplify the problem by providing all those files within a single file (lessening opportunities for error). So, archives make our lives easier.
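As a rough sketch of the two actions as discrete steps, the following packs a small directory with tar(1) and then compresses the resulting single file with gzip(1). The names parts/, a.txt, b.txt, and bundle.tar are made up for this illustration and have nothing to do with the project's files.

```shell
# Hypothetical scratch directory and file names, for illustration only.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir parts
echo "first"  > parts/a.txt
echo "second" > parts/b.txt

# Step 1: archive only. c = create, f = name of the archive file.
tar -cf bundle.tar parts

# Step 2: compress that one file. -c writes to STDOUT, leaving bundle.tar intact.
gzip -c bundle.tar > bundle.tar.gz

# t = list contents; z tells tar to read through gzip.
tar -tf bundle.tar
tar -tzf bundle.tar.gz
```

Both listings should show the same members: the compression step changed how the container is stored, not what is inside it.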

Procedure

In the UNIX Public Directory you will find a spring2017/unix/arc0/ subdirectory.

There you will find:

  • archive.tar.xz

You'll want to make a copy of this file to some project-specific working directory in your home directory (~/src/unix/projects/arc0/, perhaps?)

Essentially, I want you to do the following:

  1. Figure out the format of the archive, and read up on the available tools for manipulating it
  2. Extract the contents of the archive and study it (contents will extract to the current working directory, so you WILL want to be in a custom project directory)
  3. Analyze the filenames extracted from the archive. They are clues referring to single digit numbers.
    • Figure out the number and rename each file to its referenced number. The resulting names should form a sequence.
  4. Study the files' contents.
    • For 3/4 of the files, one or more levels of scrambling have taken place.
      • tac(1) and/or rev(1) may be of considerable help
    • When you successfully figure out how to unscramble the files, redirect the unscrambled output to its own unique file (preserving order).
      • To redirect output you'll want to use the shell's STDOUT I/O redirection (the > symbol)
    • Once you have unscrambled/identified all 4 files, use cat(1) to combine all four parts into one master file called result.txt.
      • Hint: the orientation should be from “back” to “front”, with the “back” being on the left-most side, and the “front” being to the right, with expected up-down orientation (top is up, bottom is down).
  5. Create a new archive, called myarchive1.tar, containing only your unscrambled pieces (not the scrambled files, and NOT result.txt)
    • do NOT store any paths in the archive, just put the files at base level
  6. Compress myarchive1.tar with gzip at the second-highest compression level (toward the best end of the spectrum, not the fastest) to create the appropriately named myarchive1.tar.gz
  7. Create (and compress) another archive, called myarchive2.zip, containing both the scrambled (but numerically named) files along with your result.txt file.
  8. Submit both myarchive1.tar.gz and myarchive2.zip using the submit tool.
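The mechanics behind these steps can be tried out on throwaway data first. What follows is only a sketch under stated assumptions: every file and directory name in it (work/, plain.txt, flipped.txt, mirrored.txt, joined.txt, pieces.tar, bundle.zip) is hypothetical, and the scrambling is something the sketch does to its own sample file, not a description of how the project files were scrambled.

```shell
# All names here are hypothetical stand-ins, not the project's files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir work

printf 'line one\nline two\nline three\n' > work/plain.txt

# Scramble copies two different ways so there is something to undo.
tac work/plain.txt > work/flipped.txt     # line order reversed
rev work/plain.txt > work/mirrored.txt    # each line reversed character by character

# tac and rev each undo themselves when applied a second time.
tac work/flipped.txt  > work/1
rev work/mirrored.txt > work/2

# cat joins files in the order given; > sends the result to a new file.
cat work/1 work/2 > work/joined.txt

# -C changes into work/ first, so members are stored without a leading path.
tar -cf pieces.tar -C work 1 2
tar -tf pieces.tar

# gzip levels run from -1 (fastest) to -9 (best); -c writes to STDOUT.
gzip -9 -c pieces.tar > pieces.tar.gz

# zip archives and compresses in one step; -j drops directory components.
zip -j bundle.zip work/flipped.txt work/mirrored.txt work/joined.txt
```

Listing an archive before submitting it (tar -tf for tar archives, unzip -l for zip archives) is a quick way to confirm that no paths were stored and that only the intended files went in.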

Orientation Hints

Should you need additional clarification on my orientation hint:

  • the orientation should be from “back” to “front”, with the “back” being on the left-most side, and the “front” being to the right, with expected up-down orientation (top is up, bottom is down).

This would be an image in accordance with the desired orientation (using the international LAIR orientation calibration image):

(YES)

Which is not to be confused with this, which would NOT be conformant with this project's specified end-product orientation:

(NO)

Reflection

Be sure to provide commentary in your journal regarding realizations had and discoveries made during your pursuit of this project.

  • Why do you suppose tar works the way it does?
  • What might be some benefits of separating archival and compression functionality?

Submission

To successfully complete this project, the following criteria must be met:

  • Submit a copy of your archives to me using the submit tool.

To submit these archives to me using the submit tool, run the following command at your lab46 prompt:

$ submit unix arc0 myarchive1.tar.gz myarchive2.zip
Submitting unix project "arc0":
    -> myarchive1.tar.gz(OK)
    -> myarchive2.zip(OK)

SUCCESSFULLY SUBMITTED

You should get some sort of confirmation indicating successful submission if all went according to plan. If not, check for typos and/or location mismatches.

haas/fall2020/unix/projects/arc0.txt · Last modified: 2017/01/17 05:43 by 127.0.0.1