Corning Community College

CSCS1730 UNIX/Linux Fundamentals

Project: THE PUZZLEBOX (pbx0)

Errata

Objective

To become familiar with another useful utility, and to develop some basic debugging/diagnostic abilities.

Background

The filetype of a file can be extremely important when determining what application is used to open it.

As discussed on a few occasions already, conventional UNIX filesystems give file extensions no special treatment. UNIX is therefore said to ignore them, which means the dot is just one more valid character in a filename.

On systems with ingrained support for filename extensions (say, a dot followed by 3 characters), that support actually limits our ability to name files.

In UNIX, it is not uncommon to find files named archive.tar.gz, and since UNIX ignores extensions, this is a perfectly legitimate filename… and it is also quite informative (we are clued in on potential actions taken on this file, as well as what tools to make use of).

Most of the time a file is named correctly: a file ending in .c can be assumed to be the source code of a C program, and one ending in .gz to be a gzipped file. These are not extensions; they are traditional naming conventions (just as we tend to call those petrol-powered metal boxes with 4 rubberized wheels a “car”… we CAN call it something else, but that may cause bumps in otherwise smooth communications).

The dircolors(1) utility, which sets up the colors used by ls(1), relies on the same conventions: it assumes that files ending in .mpg really are MPEG files and colors them accordingly, and does the same for .zip files, etc.

In other operating systems, a file's extension determines what application is used to open the particular file. If a file that ends in .mp3 is really a PNG image, the default MP3 player is going to have difficulties.

Files are not always named properly, whether because a web browser mangled an extension or for some other reason. When a file is more than meets the eye, we must rely on the various tools available to us to determine what it in fact is.

The file utility

In UNIX there is a nifty little utility called file that attempts to determine the actual type of a file by checking a series of properties. From the file(1) man page:

There are three sets of tests, performed in this order:

  1. filesystem tests,
  2. magic number tests,
  3. and language tests.

The first test that succeeds causes the file type to be printed.

A filesystem test checks to see if the file is non-ordinary (such as a socket, symbolic link, or other special file).
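The filesystem test can be observed on objects that are not ordinary files. The following is a minimal sketch that works in a scratch directory; the names regular.txt, link, and subdir are made up for illustration, and the exact wording of the reports may differ between versions of file:

```shell
# Work in a scratch directory so nothing in your home directory is touched.
workdir=$(mktemp -d)
printf 'plain text\n' > "$workdir/regular.txt"
ln -s regular.txt "$workdir/link"     # a symbolic link (hypothetical name)
mkdir "$workdir/subdir"               # a directory (hypothetical name)

# -h asks file(1) to describe the link itself rather than its target.
link_report=$(file -h "$workdir/link")
dir_report=$(file "$workdir/subdir")
echo "$link_report"
echo "$dir_report"
```

Neither report required looking inside the object; the answer came from the filesystem alone.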

The magic number test checks whether a file conforms to a known fixed format, typically by examining the file at the binary level. In a hex editor, for example, you will find that gzip-compressed (.gz) files always start with the same sequence of hexadecimal values.
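One way to see a magic number without a hex editor is to dump the first bytes of a freshly compressed stream. This sketch assumes gzip(1), od(1), and tr(1) are installed; the variable name magic is only illustrative:

```shell
# Compress a short string and keep only the first two bytes, printed in hex.
# od: -An drops the offset column, -tx1 prints one byte at a time, -N2 reads two bytes.
magic=$(printf 'hello\n' | gzip -c | od -An -tx1 -N2 | tr -d ' \n')
echo "$magic"
```

The same two bytes appear no matter what was compressed, which is what allows a tool to recognize the format from content rather than from the filename.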

Finally, if the file is determined to be a simple ASCII file, file will attempt to analyze whether or not it conforms to some language (e.g. C source code vs. an HTML document).

Note that file is not perfect, but in most cases it will get the job done. Try checking files in your home directory or elsewhere on the system and see the results. Should file not be able to give you a clear answer, you must still resort to your other skills and mental faculties: test and debug the situation.
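The language test can be tried on a small source file saved under a name that gives nothing away. This is only a sketch: the filename mystery is hypothetical, and because the language guess is heuristic, the description printed may vary between versions of file:

```shell
workdir=$(mktemp -d)

# A tiny C program saved under a name with no .c ending (hypothetical name).
cat > "$workdir/mystery" <<'EOF'
#include <stdio.h>

int main(void)
{
    printf("Hello, World!\n");
    return 0;
}
EOF

# -b prints only the description, without the leading filename.
guess=$(file -b "$workdir/mystery")
echo "$guess"
```

Compare the result with what file says about a plain text file of your own to see how much the contents influence the guess.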

Refer to the file(1) manual page or your references for more information.

Practice

For this project, files are located in the pbx0/ subdirectory of the UNIX Public Directory.

Do the following, and discuss the results in your Opus:
a. Copy file.txt into your home directory.
b. Using file(1), what type of file does this appear to be?
c. View the contents of this file using cat(1). Is it what it appears to be?
d. Using gzip(1), compress this file with default compression. What does file(1) say?
e. Uncompress the file, and recompress using arguments for fastest (not highest) compression. What does file(1) report now?

As in many puzzles, one's visual comprehension of the scenario plays a vital role. Where something is more than meets the eye, or is not behaving as you would expect, try reading any messages or output. Sometimes the clues are right under your nose.

Procedure

Try your hand at the following activity, where things are not necessarily as they should be:

Since the file ends in .html, and HTML is typically a plain ASCII text format, you might try opening it in a text editor (or simply using the cat(1) utility).

Does it appear to be a readable text file?

As is the case in many investigations, just observing how things behave can lead to recognition of an object's true state, or of a pattern, either of which can be used to solve the task at hand.

Submission

To successfully complete this project, you must follow the directions located in a readable file at the conclusion of this project. Until you encounter it, you are not yet finished (hint).

You should get some sort of confirmation indicating successful submission (actually, two) if all went according to plan. If not, check for typos and/or locational mismatches.