
Corning Community College

CSCS1730 UNIX/Linux Fundamentals


Project: ARCHIVE HANDLING

Objective

To begin putting your skills to work accomplishing tasks and solving problems on the system.

Prerequisites

To successfully complete this project, you should already have the following skills and experience:

  • ability to read the manual pages and use the information therein
  • ability to copy, move, list, remove, and/or link files
  • ability to navigate around the filesystem

Background

When we talk about archives, there are commonly two separate actions taking place. Sometimes they are intertwined; at other times they are discrete steps.

They are:

  • archiving / extracting
  • compression / decompression

Archives are merely a manifestation of a common computing concept: a container.

Containers encapsulate things; in this case, files. And the fact that UNIX tries to treat everything as a file makes this ability especially useful.

Compression, on the other hand, is an action performed on a single file. Using various algorithms, we accomplish a sort of “more in less”: we take the data present and cram it into a smaller box (file). The aim is to take up less storage on the filesystem, which also makes copying faster.
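
A minimal sketch of that “more in less” idea, assuming gzip is installed. The file name sample.txt and the scratch directory are made up for illustration:

```shell
# Scratch directory so nothing in your home directory is touched.
work=$(mktemp -d)

# Build a repetitive 100-line text file (hypothetical name).
yes "the quick brown fox jumps over the lazy dog" | head -n 100 > "$work/sample.txt"

# -c writes the compressed data to standard output and leaves the original alone.
gzip -c "$work/sample.txt" > "$work/sample.txt.gz"

# Compare the byte counts of the two files.
before=$(wc -c < "$work/sample.txt")
after=$(wc -c < "$work/sample.txt.gz")
echo "original: $before bytes, compressed: $after bytes"
```

Repetitive text like this compresses very well; data that is already compressed or essentially random will shrink far less, if at all.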

Many compression algorithms exist, and they commonly fall into two categories:

  • lossless - no data is lost as a part of the compression process
  • lossy - data deemed unnecessary is permanently discarded as part of the compression process
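
The general-purpose tools behind the formats in this project (gzip, bzip2, zip) are lossless. A small sketch of what that means in practice, using made-up file names: compress, decompress, and compare the result byte for byte with the original.

```shell
work=$(mktemp -d)

# A small original file (hypothetical name and content).
printf 'line one\nline two\nline three\n' > "$work/original.txt"

# Round trip: compress to a copy, then decompress that copy to a third file.
gzip -c "$work/original.txt" > "$work/copy.txt.gz"
gzip -dc "$work/copy.txt.gz" > "$work/restored.txt"

# cmp -s is silent and reports only through its exit status.
if cmp -s "$work/original.txt" "$work/restored.txt"; then
    result="identical"
else
    result="different"
fi
echo "round trip: $result"
```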

Wikipedia maintains categories listing various lossless and lossy compression algorithms.

Where confusion may set in is when a tool combines the actions of archival AND compression. But if you think about it, even in such cases we always end up with one file, and that file is compressed (unless we have a concatenation of separately compressed files into a single file).
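
To make the distinction concrete, here is a hedged sketch that builds the same compressed archive two ways: as two discrete steps, and as one combined step using tar's -z option. The directory and file names are invented for the demonstration.

```shell
work=$(mktemp -d)
mkdir "$work/src"
echo "alpha" > "$work/src/a.txt"
echo "beta"  > "$work/src/b.txt"

# Two discrete steps: archive first, then compress the resulting single file.
tar -cf "$work/two-step.tar" -C "$work/src" a.txt b.txt
gzip "$work/two-step.tar"        # replaces it with two-step.tar.gz

# One combined step: -z asks tar to pass the archive through gzip itself.
tar -czf "$work/one-step.tar.gz" -C "$work/src" a.txt b.txt

# Either way we end up with one compressed file holding the same members.
two_step=$(tar -tzf "$work/two-step.tar.gz")
one_step=$(tar -tzf "$work/one-step.tar.gz")
echo "$two_step"
```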

Archives are useful in that they let us pack items together. If something needs 100 files, copying it or installing it onto another system would be more complex if we had to deal with each of those files individually. Archives simplify the problem by providing all of those files inside a single file (lessening opportunities for error). So, archives make our lives easier.
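
As an illustration of the 100-file scenario, the following sketch (all names hypothetical) packs a directory of 100 small files into a single tar archive and then counts the members without extracting anything:

```shell
work=$(mktemp -d)
mkdir "$work/pkg"

# Create 100 small files.
i=1
while [ "$i" -le 100 ]; do
    echo "contents of file $i" > "$work/pkg/file$i.txt"
    i=$((i + 1))
done

# Pack the whole directory into one file; -C keeps the scratch path out of it.
tar -cf "$work/pkg.tar" -C "$work" pkg

# -t lists the members; count only the .txt entries.
count=$(tar -tf "$work/pkg.tar" | grep -c '\.txt$')
echo "files in archive: $count"
```

Only pkg.tar now needs to be copied to the other system.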

Procedure

In the UNIX Public Directory you will find a projects/archive_handling subdirectory.

There you will find two existing archives:

  • archive1.zip
  • archive2.tar.bz2

You'll probably want to make a copy of these to some working directory in your home directory.

Essentially, I want you to do the following:

  1. Figure out the format of each archive, and read up on the available tools for manipulating them
  2. Extract the contents of the two archives and study them (make sure you keep track of what is in which archive)
  3. Analyze the archive contents and find any corrupt or empty files. They are not needed.
  4. Arrange the remaining content by the following criteria:
    • name the files smallest, small, big, biggest, and tack on the appropriate extension
    • the smallest file will be whichever file represents the smallest/lightest of the four things (judged not by file size but by what the file's content describes); assign small, big, and biggest (the largest/heaviest of the four things) in the same way
  5. Create a new archive, called myarchive.tar containing only these size-themed files.
    • do NOT store any paths in the archive, just put the files at base level
  6. Compress myarchive.tar with gzip at the second-highest compression level (counting from the best end of the spectrum, not the fastest) to create the appropriately named myarchive.tar.gz
    • also use the -n argument (it keeps the original file name and timestamp out of the gzip header) to aid in verification of your submission
  7. Submit myarchive.tar.gz using the submit tool.
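
The mechanics behind several of these steps can be sketched on stand-in data. Everything below is an assumption made for illustration: the file names (thing1.txt, demo.tar, and so on) are invented, the compression level shown is not necessarily the one the project asks for, and the file utility may not be installed everywhere. It is a starting point for your own reading of the manual pages, not a solution.

```shell
work=$(mktemp -d)
mkdir "$work/extracted"

# Stand-in files (hypothetical names; the real archives hold other content).
echo "some data" > "$work/extracted/thing1.txt"
echo "more data" > "$work/extracted/thing2.txt"
: > "$work/extracted/empty.txt"          # a zero-byte file

# Step 1: file(1) inspects content rather than trusting the extension.
if command -v file >/dev/null 2>&1; then
    file "$work/extracted/"*
fi

# Step 3: list zero-byte files so they can be set aside.
empties=$(find "$work/extracted" -type f -size 0)
echo "empty: $empties"
rm -f "$work/extracted/empty.txt"

# Step 5: -C changes into the directory first, so members are stored
# as bare names with no leading path.
tar -cf "$work/demo.tar" -C "$work/extracted" thing1.txt thing2.txt
members=$(tar -tf "$work/demo.tar")
echo "$members"

# Step 6: levels run from -1 (fastest) to -9 (best); -6 is only an example.
# With -n the header omits the original name and timestamp, so two
# differently named copies of the same data compress to identical bytes.
cp "$work/demo.tar" "$work/other.tar"
gzip -n -6 -c "$work/demo.tar"  > "$work/run1.tar.gz"
gzip -n -6 -c "$work/other.tar" > "$work/run2.tar.gz"
cmp -s "$work/run1.tar.gz" "$work/run2.tar.gz" && same="yes" || same="no"
echo "byte-identical: $same"
```

Listing an archive with tar -tf before compressing it is a quick way to confirm that no paths were stored.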

Reflection

Be sure to include commentary in your opus on any realizations and discoveries made while pursuing this project.

  • Why do you suppose tar works the way it does?
  • What might be some benefits of separating archival and compression functionality?

Submission

To successfully complete this project, the following criteria must be met:

  • Submit a copy of your archive to me using the submit tool.

To submit this project to me using the submit tool, run the following command at your lab46 prompt:

$ submit unix archives myarchive.tar.gz 
Submitting unix project "archives":
    -> myarchive.tar.gz(OK)

SUCCESSFULLY SUBMITTED

You should get some sort of confirmation indicating successful submission if all went according to plan. If not, check for typos and/or location mismatches (for example, running the command from a directory that does not contain the file).
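
One way to rule out such problems before submitting is to confirm that the file exists where you are and is intact. The sketch below defines a hypothetical helper, check_archive, which is not part of the submit tool:

```shell
# check_archive FILE: hypothetical helper, not part of the submit tool.
check_archive() {
    # Is the file actually where we think it is?
    [ -f "$1" ] || { echo "$1: not found (current directory: $(pwd))"; return 1; }
    # gzip -t tests the integrity of the compressed data.
    gzip -t "$1" || { echo "$1: gzip integrity test failed"; return 1; }
    # List the members; the names should carry no directory component.
    tar -tzf "$1"
}
```

Running check_archive myarchive.tar.gz from your working directory should print only the bare names of the files in the archive.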

haas/fall2020/unix/projects/archive_handling.txt · Last modified: 2014/02/04 14:36 by 127.0.0.1