Reid Hensen's Unix Opus

Spring 2012

Introduction

Hello, my name is Reid Hensen. I am currently attending CCC for my final semester in the networking program and plan to eventually pursue a degree in network security at another school. I work as a tutor and a lab assistant in the networking computer labs. Before beginning work at CCC in my current program, I earned an undergraduate degree in Philosophy from University at Albany. However, the current difficult job market and my growing interest in working with computers eventually brought me to the decision to pursue an education in this field. This course is not required for my major, but I am taking it out of personal interest. From my understanding, it is common to find servers that run Unix so I believe that this is an important skill when working out in the real world. Although I am new to this, I am intrigued by the Unix community and the malleability that the operating system seems to allow for. I see Unix as an interesting jumping off point for truly learning how to use an operating system the way I want to instead of how a manufacturer decides they want you to.

Part 1

Entries

Entry 1: January 28, 2012

Today I completed the first set of assignments for the class. Because the class is just beginning, the lab section was mostly intended to explain how to perform the basic actions needed to work in the Unix environment. Some of the essential commands needed to start using Unix are the commands to list the files within a directory, copy a file, remove a file, and move a file. Most Unix commands can also be used with special options that provide additional functionality. The use of options brings up another important aspect of Unix commands, which is how to access the manual pages for the different utilities. A manual page can be viewed by entering the “man” command followed by the name of the command you are looking for information on.
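
As a quick illustration (the file name “notes” is made up), a session using these basic commands might look like the following:

lab46:~$ ls               # list the files in the current directory
lab46:~$ cp notes notes2  # copy the file "notes" to a new file called "notes2"
lab46:~$ mv notes2 src    # move the copy into the src directory
lab46:~$ rm notes         # remove (delete) the original file
lab46:~$ man cp           # read the manual page for the cp command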

The lab makes note of an interesting detail in Unix, which is that unlike DOS there is no rename command. If the move command is given a source file and a destination name that does not exist, Unix will simply move the file to that new name. Although renaming is not the purpose implied by the command's name, this effectively works the same as a rename command. For this reason, a separate rename command would be redundant and would provide no additional functionality. I find it interesting that Unix seems to be designed to offer small, powerful tools that can be used in creative ways to get the results the user is looking for.
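
For example (with a made-up file name), renaming a file is simply a move to a new name within the same directory:

lab46:~$ mv oldname.txt newname.txt   # the file now exists only under the new name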

Entry 2: February 3, 2012

Today’s lab focused on the topic of file and directory management within Unix. While I am new to Unix, I am happy to see that for the most part navigating through the directory tree is the same as it is in DOS. I would by no means consider myself especially proficient in DOS, but I have a good understanding of how to use the directory structure and this knowledge applies to Unix as well. The cd command is used to change to new directories and can be used with either relative or absolute pathnames.
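
For instance, using the src directory in my home directory, the same location can be reached with either kind of pathname:

lab46:~$ cd /home/rhensen/src   # absolute pathname, starting from the root of the filesystem
lab46:~/src$ cd ..              # relative pathname, moving up to the parent directory
lab46:~$ cd src                 # relative pathname, moving back down into src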

The notable difference that I have seen with Unix is that each file has a set of permissions that can be changed using octal or symbolic values. While this feature may be confusing at first if the concept is new, I felt that this was actually a fairly intuitive and powerful way of dealing with file permissions. Permissions are an important attribute of all files on a system and greatly affect how resources and the system itself are used. The chmod command is an easy way to assign permissions to each file and the -l option on the ls command is a good way to determine the source of the problem if a user is unable to access a file.
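
As a sketch (the file name is hypothetical), the same kind of change can be made with either octal or symbolic values:

lab46:~$ chmod 640 report.txt   # octal: owner read/write, group read, others nothing
lab46:~$ chmod g+w report.txt   # symbolic: add write permission for the group
lab46:~$ ls -l report.txt       # the long listing should now begin with -rw-rw----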

As per the instructions in the lab, I have made an ASCII tree that shows the location of my home directory on Lab 46 and some of the files and directories that are contained within it.

            /(root)
                |
              home
                |
             rhensen
                |
         --------------
     	  |	 |     |
      Maildir  bin   src
	  	|
	------------------
	|	  |	  |
     lab1.file   submit   unix

Entry 3: February 17, 2012

This week’s lab focused on different elements of the Unix shell, which provide insight into how to work effectively in the Unix environment. Unix makes use of hidden files, which are identified by having a dot (.) in front of the file name and can be displayed by using the -a option with the ls command. Knowing how to access hidden files can yield some useful results, as is the case with the .signature file (which holds text that will be added to the end of every email you send) and the ~/.bash_history file (which contains the history of previous commands the user has entered, even from previous sessions).
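
For example, a plain ls does not show the hidden files, but adding the -a option does (the exact list of files will of course vary):

lab46:~$ ls -a
.  ..  .bash_history  .bashrc  .signature  Maildir  bin  src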

Other useful features that were explored were variables and command aliases. Variables are used to reference data that is in memory and are accessed using the dollar sign ($) character. Aliases are a way to set a command to use certain options automatically. For example, the ls command does not classify different file types by color on its own; a special option is needed to do that, and an alias can add that option automatically every time the ls command is used.
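
A sketch of how such an alias might be set up, using the --color option supported by the GNU version of ls:

lab46:~$ alias ls='ls --color=auto'   # from now on, plain "ls" will colorize its output
lab46:~$ alias                        # with no arguments, alias lists the aliases currently set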

A case study was also given which shows how to deal with files that are named in ways that make them difficult to use in Unix. For example, files whose names contain spaces, dollar signs, or wildcard characters will have these characters interpreted by the shell as special characters rather than as part of the file name. This exercise was useful for showing different ways around these bad names.
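
For instance, a hypothetical file named “my file” (containing a space) can still be removed by quoting or escaping the name so the shell does not treat it as two separate arguments:

lab46:~$ rm 'my file'    # single quotes keep the space as part of the file name
lab46:~$ rm my\ file     # alternatively, a backslash escapes the space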

Entry 4: February 24, 2012

The lab for this week further explored some fundamental concepts of the Unix shell. The major concepts covered by this lab are wildcards, I/O redirection and piping, pagers, and different uses of quotes. Wildcards are used to match characters for a command. This is especially useful with the ls command in order to display certain types of files. Wildcards can be used to match a certain number of characters, specific characters, or to exclude specific characters. A single character is represented by a question mark (?), zero or more characters are represented by a star (*), and specific characters can be entered between brackets ([]).

I/O redirection can be used to manipulate the input and output data streams in order to send this information somewhere else. Piping is the process of using the output of one command as the input for another, which allows the user to perform complex tasks which cannot be done with just one command. Pagers are utilities that can be used to scroll through the output of commands that produce more output than can be read all at once. Pagers can be used with piping so that the pager operates on the output of another command. The two pagers explored are the more and less utilities. The more utility displays another screenful of output each time the space bar is pressed, whereas the less utility allows the output to be scrolled both forward and backward.
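
For example, a long listing of a large directory can be piped into a pager, or redirected into a file instead of the screen:

lab46:~$ ls -l /usr/bin | less    # less lets the listing be scrolled forward and backward
lab46:~$ ls -l /usr/bin > listing # > redirects the same output into a file called "listing"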

The lab also provided a further explanation of the different uses of quotes. Single quotes (') tell Unix to interpret all symbols as literal characters and to assign no special meaning to characters such as spaces or dollar signs. Double quotes (") are less strict and allow variable references and command substitutions to be interpreted as such rather than simply as text.
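
A quick illustration of the difference between the two kinds of quotes:

lab46:~$ echo '$HOME'     # single quotes: the dollar sign is taken literally
$HOME
lab46:~$ echo "$HOME"     # double quotes: the variable is expanded
/home/rhensen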

Keywords

unix Keywords

Home Directory

Definition

The home directory is the main directory for each user on the system. When a user first logs on to a Unix system, their working directory will be their home directory. Each user has their own home directory in which they are the owner of all of the files. The home directory is specified by the ~ symbol.

Demonstration

To change to the home directory at any time, a user only has to enter the cd command.

lab46:/var/public/data/fall2010$ cd
lab46:~$ 

Current Working Directory

Definition

This is the directory that you are currently in and is the directory that will be shown in the command prompt. Any commands that reference files using a relative path name will use the current working directory as the starting point.

Demonstration

You can display the absolute path name of the current working directory by using the pwd (print working directory) command.

lab46:~$ pwd
/home/rhensen

Types of Files

Definition

In order for Unix to properly make use of certain files, it must know the type of file. Most files are regular files, such as text files, and are treated by Unix as simply a container of data. The next most common type of file is the directory, which is used to contain other files. Aside from these two main types there are also different varieties of special files.

Demonstration

The type column of a listing of files is the leftmost character on each line. The top file is a regular file since it has a - in this field. The middle file is a directory since the type field displays a d. The last one is a type of special file called a link, which can be seen from the l.

-rw-r--r-- 1 rhensen lab46  30 Feb 10 23:12 lab03.txt
drwxr-xr-x 2 rhensen lab46   6 Jan 25 22:33 bin
lrwxrwxrwx 1 rhensen lab46  17 Jan 19 11:57 Maildir -> /var/mail/rhensen

Wildcards

Definition

A wildcard character is a symbol that is used to substitute for characters in a string. The question mark (?) symbol is used to represent one and only one character. The star (*) is used to represent zero or more characters. Square brackets ([]) can be used to match particular characters or a range of characters.

Demonstration

The following commands show how each of these wildcard patterns can be used to match file names on the command line:

lab46:~$ ls file*
file1 file2 file3 file32 filea
lab46:~$ ls file?
file1 file2 file3 filea
lab46:~$ ls file[3]*
file3 file32

Tab Completion

Definition

In Unix, the Tab key can be used to complete the names of commands and path names that you have begun typing. Tab completion only works when the user has typed enough that there is only one command or path name that can be filled in by hitting Tab.

Demonstration

Here, hitting Tab does nothing after t or te because many commands begin with those letters, but it completes the line with an important command once tet has been entered.

lab46:~$ t
lab46:~$ te
lab46:~$ tet
lab46:~$ tetris-bsd

$PATH

Definition

$PATH is an environment variable that is used to reference all of the directories within a user's search path. Whenever a command is entered, Unix will look through the directories specified in the $PATH variable in order to find that command. If this variable were not used, all commands would have to be entered with an absolute path name.

Demonstration

The value of $PATH can be displayed with the command “echo $PATH”.

lab46:~$ echo $PATH
/home/rhensen/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/games

Local Host

Definition

The term local host refers to the computer that a user is currently at.

Demonstration

For example, I am currently accessing Lab 46 from my home computer using PuTTY. In this case my computer is the local host.

Remote Host

Definition

The term remote host refers to a machine that a user is able to access from across a network without being physically present at that machine.

Demonstration

For example, I am currently accessing Lab 46 from my home computer using PuTTY. In this case, Lab 46 is the remote host.

login as: rhensen
rhensen@lab46.corning-cc.edu's password:

unix Objective

My reason for taking this course is to gain familiarity with the Unix environment and develop a working knowledge of how to essentially get Unix to do what I want it to.

Definition

For me, I feel that I have a good working knowledge of something when I am able to use my knowledge to solve a problem that I have not seen before. I believe that one of the most important skills to develop when working with computers is to be able to apply what you know to different situations.

Method

As I use Unix throughout the semester, either through assignments or during my own personal use, I will reflect on the ease with which I am accomplishing my desired actions.

Measurement

I will feel that I have become familiar with Unix when I am comfortable with a variety of commands, have an understanding of how the Unix shell operates, and most importantly when I am able to effectively use the manuals to better understand commands that I am not familiar with.

Analysis

This far into the course, I feel satisfied with my progress. I do not know a great many commands, but I know enough to navigate through the directories and perform some basic actions with files, such as copying. I am very comfortable with how to use path names, which is a major component of understanding any OS, especially when using a CLI. I also have a good understanding of how to use certain features of the shell, such as wildcards and piping. One area that I am still having some difficulty with is using unfamiliar commands correctly. I know how to access the manual pages, but I am still finding that reading through these is difficult and it is sometimes hard for me to pull useful information from them. I would like to become more comfortable with learning new commands and effectively using the documentation.

Experiments

Experiment 1

Question

Is it possible for Unix to match wildcard characters in a file name when using wildcards?

Resources

No outside resources have been used for this experiment.

Hypothesis

In cases of files that are given unconventional names, it is possible for filenames to exist that contain characters such as * or ?. Since these characters are also used to represent wildcard patterns, I was wondering if there are ways to make Unix recognize these symbols as characters in a filename instead of wildcard symbols. I believe that Unix allows for at least one method of being able to find filenames that contain these characters when using wildcards.

Experiment

To perform this experiment I will create two files which I will name “?uestlove” and “ringo*”. I will then use the ls command with different wildcard patterns such as “?*” and “* *” (which I do not believe will work) as well as using different patterns that make use of single and double quotes as well as brackets (which I believe may yield the desired results).

Data

After performing the ls command using different variations of wildcard patterns, I have determined that:

  • “ls ?*” and “ls * *” will display a listing of all of the files in the current working directory and display the files in all of the subdirectories.
  • “ls [?]*”, “ls ‘?’*”, and “ls “?”*” will all match the file called ?uestlove.
  • “ls *[*]”, “ls *’*’”, and “ls *”*”” will all match the file called ringo*.
  • The command “ls *[?*]*” can be used to match both files in a single listing.

Analysis

Judging from these results, it is possible to match file names that contain wildcard symbols as characters in a wildcard pattern. This is not only possible to do, but there are several methods that can be used to make Unix interpret these symbols as characters. Unix is also robust enough to allow both of these characters to be matched in a single listing using brackets. The unexpected outcome of this experiment is that both the “?*” and the “* *” listings displayed all of the files in subdirectories as well. I have briefly attempted to research why this is the case but have not been able to find much information on it. However, determining the reason for this seems to be outside the scope of this experiment.

Conclusions

The results of this test have shown that unconventional file names do not hinder the ability of Unix to match file names using wildcards. This demonstrates three methods that can be used to identify wildcard symbols as characters so that these characters can be matched. (Note: In writing up the experiment on this page “* *” is written with a space since this webpage did not seem to like when the two stars are used together without the space.)

Experiment 2

Question

Can output redirection be used with aliases in order to easily create a log file from the output of a certain command?

Resources

This article was used to fine tune the method that I used after I was unhappy with the initial results.

http://en.wikipedia.org/wiki/Tee_(command)

Hypothesis

A log file is sometimes useful to keep a record of the output generated by a command. However, this is really only useful when the output is added to the log file automatically, which is why I believe the use of an alias will be helpful. I think that setting an alias for the ls command that will redirect the output to a file will effectively keep a record of each output that command generates. This is not to say that the ls command is the most useful command to have a log file associated with, but the method can hopefully be used with other commands that better lend themselves to having a log.

Experiment

In order to set the alias for the ls command to redirect its output to a log file, the following command was used.

lab46:~$ alias ls="ls >> /home/rhensen/log" 

The redirect and append option was used to redirect the output so that new output is added to the end of the log file instead of replacing the file’s previous contents each time there is new output. An absolute pathname is defined for the log file in order to avoid creating a new log file for each working directory that the command is used in.

Data

After setting this alias and using the ls command, the test worked as planned and the output was sent to the log file (this file will be created if it does not already exist). Also issuing the command again will append the output to the file instead of replacing it. Although redirecting the output to a log file works as I had hoped, this method leaves something to be desired since the output is sent only to the log file and cannot be displayed on the screen for the user to see.

Analysis

Since I was not entirely satisfied with the initial results of the experiment, I did some research into finding a solution that would allow the output to be displayed on STDOUT as well as sent to the log file. This research led me to a command called “tee” which allows output to be redirected to STDOUT as well as one or multiple files. I decided that the following command would yield the desired results. (Note: Since tee is a separate command, a pipe is used instead of redirecting the output. The -a option tells the tee command to append the file with the new output.)

lab46:~$ alias ls="ls | tee -a /home/rhensen/log"

Conclusions

This second command achieved the results that I had in mind when I wanted to set up a log file. This appears to be a valid method of tracking the output of a certain command automatically.

Experiment 3

Question

Is it possible to remove a directory by deleting the . file that is contained within the directory?

Resources

No outside resources have been used for this experiment.

Hypothesis

Each directory contains two files, which are “.” and “..”. The . file refers to the directory itself and the .. file refers to the parent of that directory. These files are in each directory to allow the Unix shell to use relative path names. I believe that if the . file is removed, this would logically cause the current working directory to be deleted. It would seem to be a problem if Unix allowed the user to do this, which is why I believe that removing directories cannot be done in this way.

Experiment

To test this I will create a new directory called “test” and then change to that directory. From here I will issue the commands “rm .” and “rmdir .” to see whether either of them works.

Data

Performing both of these commands yields the following results.

lab46:~/test$ rm .
rm: cannot remove `.': Is a directory
lab46:~/test$ rmdir .
rmdir: failed to remove `.': Invalid argument

Analysis

It appears that the file type of the . file is a directory, which is why the rm command does not work. The rmdir command seems to be designed to not accept . as an argument for the command.

Conclusions

Although I wanted to test this theory, I believed that there was a chance this would not work since there would likely be an error caused by removing the current working directory. Testing this theory actually has helped me to understand why the rm command cannot be used on directory files. At first the rmdir command appeared to be redundant, but now I see that it is needed to prevent problems from being caused by the rm command.

Part 2

Entries

Entry 5: March 2, 2012

Today’s lab introduced the concept of shell scripting in Unix. Shell scripts are executable text files that contain a series of commands that are to be performed when the file is executed. This concept seemed to me to be very similar to batch files in DOS. Before executing these files, there are two factors to take into account. The first is that the permissions of the file must be set to allow the file to be executed. Secondly, in order to execute the file, the path to the script must be specified unless it is located in a directory that $PATH will search.
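
Putting those two factors together, a new script (the name here is made up) might be prepared and run like this:

lab46:~$ chmod u+x myscript.sh   # give the owner execute permission on the script
lab46:~$ ./myscript.sh           # ./ supplies a path, since the current directory is usually not in $PATH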

This lab introduced many concepts that are useful for writing scripts. The read and let commands can be used to set and manipulate variables. Shell scripts can also contain if statements, which allow the script to evaluate a condition and then perform different actions based on the result. The concept of iteration can also be used in scripts to allow the same action to be performed repeatedly in a loop. This can be achieved by setting a variable as the counter and incrementing the counter each time the action is performed until the limit for the counter is reached. The counter for the loop does not have to be numeric; it is also possible to perform an action repeatedly using a list of items. This lab was useful for learning concepts that are needed when creating scripts in the Unix environment.
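
As a small sketch of these ideas, the following script fragment uses a numeric counter in a loop and then iterates over a list of items instead:

#!/bin/bash
# loop using a numeric counter
count=1
while [ $count -le 5 ]; do
        echo "pass $count"
        let count=count+1
done
# loop over a list of items instead of a counter
for name in alpha beta gamma; do
        echo "item: $name"
done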

Entry 6: March 9, 2012

This week’s lab involved the concept of multitasking and how to work with different running processes on the system. The ps command can be issued to show a list of the running processes on the system, each of which corresponds to a running program. The lab also explains that the & character can be added to the end of a command line to make the program run in the background. Running in the background means that the program runs without tying up the terminal, so the user is able to do other things while it is carried out. This is useful when there are programs that take a long time to run, or in the case of programs that need to be running constantly.

Once processes are running, they can also be suspended by using the SUSPEND character (CTRL-Z). Suspending a process temporarily pauses it. When processes are suspended, they can be brought back to the foreground with the fg command. The kill command can be used to completely end a running or suspended process by sending it a signal. The combination of these concepts is useful for understanding how to work with and manipulate the different processes that are running on the system.
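
A hypothetical session tying these ideas together (the process and job numbers shown are only illustrative):

lab46:~$ sleep 300 &      # run a command in the background
[1] 12345
lab46:~$ jobs             # list the shell's current jobs
[1]+  Running                 sleep 300 &
lab46:~$ kill %1          # end the background job by referring to its job number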

The case study associated with this lab deals with the topic of scheduling tasks, which is setting certain processes to run at specific times. This is achieved by using a utility in Unix called cron. The cron utility is the scheduler and it makes use of the crontab, which is a list of scheduled processes and the times that they should be run. The at utility is similar in function to cron, but it is designed to be used for tasks that are only meant to be scheduled once.
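
As a sketch (the script path and times are made up), a crontab entry lists the minute, hour, day of month, month, and day of week, followed by the command to run, while at takes a one-time job from its standard input:

lab46:~$ crontab -l                                     # list the current user's scheduled tasks
30 2 * * * /home/rhensen/bin/backup.sh
lab46:~$ echo "/home/rhensen/bin/backup.sh" | at 23:00  # schedule the same script to run once at 11:00 PM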

Entry 7: March 16, 2012

This lab dealt with the programming environment in Unix, which covers how programs are written, compiled, and executed. The source code of a program is simply a text file that contains code written in a certain programming language, and it is therefore not executable. In order to make the program an executable binary file, it must be compiled. The first part of the lab demonstrates how to use compilers on source code files written in C, C++, and assembly. Although these languages are all different, the compilers all work in mostly the same way. Compilers are run by issuing their command, and they accept the name of the source code file and the name of the output file as arguments. Multiple source code files can be compiled into one executable file using the make utility, which reads a Makefile describing how the pieces fit together. This is useful for complex programs that require multiple components in order to work.
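
As a sketch with made-up file names, a program split across two source files can be compiled in pieces and then linked, which is exactly the sort of work a Makefile automates:

lab46:~/devel$ gcc -c main.c                 # compile each source file into an object file
lab46:~/devel$ gcc -c helper.c
lab46:~/devel$ gcc -o demo main.o helper.o   # link the object files into one executable
lab46:~/devel$ make                          # with a Makefile present, make rebuilds only what has changed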

The case study also had to do with the programming environment, specifically the data types that it uses. Data types determine how many bits are allocated to different types of data and the range of values that can be expressed. The limits of the various data types on the system are defined in a file called “limits.h,” which is a plain text header file for C. The way in which the different data types work is demonstrated by using a C program that displays the ranges of the various data types.
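
Since limits.h is plain text, the values it defines can be inspected directly; on a typical system the header can be found under /usr/include (the exact location may vary):

lab46:~$ grep INT_MAX /usr/include/limits.h   # show the line defining the largest value an int can hold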

Entry 8: March 30, 2012

The topic of this lab was regular expressions, which can be used to match patterns in various Unix utilities that support them. The pattern matching of regular expressions is similar to the use of wildcards; however, regular expressions are more complex and offer greater control over how patterns can be defined. One utility that regular expressions are useful for is grep. The grep utility can search for strings of text that match certain patterns, which are defined with regular expressions. The lab mostly consists of exercises where I must search files for text that matches a set of criteria. This was challenging at first and required some time to become familiar with how the different symbols are properly used. After some time spent with it I realized that regular expressions are very robust and allow patterns to be defined in multiple ways. The vi editor also makes use of regular expressions and has a substitution feature that allows certain patterns of text to be replaced with others, as shown below. This is a useful feature when dealing with lengthy files that contain the same patterns multiple times.
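
As a sketch (the words are made up), a substitution is entered from vi's command mode and can be applied to the whole file at once:

:%s/teh/the/g

The % applies the substitution to every line of the file, and the trailing g replaces every occurrence on each line rather than only the first.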

The case study was also related to regular expressions and how they can be used with different versions of the grep utility. The regular grep utility is limited in that it only accepts basic regular expressions. The egrep utility expands on the functionality of grep by allowing the use of extended regular expression metacharacters. The fgrep utility is designed to only search for literal strings, which makes it less powerful than the other versions, but quicker and easier in situations where pattern matching is not necessary.
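
A short sketch of the difference, using the same “pattern” file that appears in the keywords section below:

lab46:~$ egrep 'cow|cat' pattern   # extended regex: the | alternation matches either word
2. cow
4. cat
lab46:~$ fgrep 'c..' pattern       # fgrep takes the dots literally, so nothing matches here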

Keywords

unix Keywords

Source Code

Definition

A text file containing a program that is written out in a particular programming language. This file itself is not executable.

Demonstration

This is the source code of a simple program in C.

#include <stdio.h>
 
int main()
{
        puts("Hello, World!\n");
        return(0);
}

Compiler

Definition

A compiler is able to create executable code from a source code file.

Demonstration

This shows how to compile a source code file called “helloC.c” into an executable file called “helloC”.

lab46:~/devel$ gcc -o helloC helloC.c

Pattern Matching

Definition

Certain Unix utilities support regular expressions, which can be used to define a pattern of characters. Utilities that support this will then be able to find strings of characters that adhere to this pattern.

Demonstration

This shows how pattern matching can be used to find lines of text that match a defined pattern.

lab46:~$ cat pattern
1. dog
2. cow
3. horse
4. cat
lab46:~$ grep c.. pattern
2. cow
4. cat

Regular Expression

Definition

A regular expression is a means of defining a formula that can be used as a stand-in for a string of text. These are used to specify a pattern so that lines of text matching that pattern can be found.

Demonstration

This shows how lines of text can be matched against a regular expression. The regular expression matches the letter c or d followed by any two characters.

lab46:~$ cat pattern
1. dog
2. cow
3. horse
4. cat
lab46:~$ grep [c-d].. pattern
1. dog
2. cow
4. cat

Job Control

Definition

Job control allows the user to view and manage the multiple processes that are running on the system.

Demonstration

The ps command can be used to show the running processes on the system; the u option adds user-oriented details such as CPU and memory usage.

lab46:~$ ps u
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
rhensen   1236  0.0  0.0  13592     8 pts/0    SNs  Jan25   0:00 /bin/bash
rhensen   1257  0.0  0.1  42608  2332 pts/0    SN+  Jan25   9:21 irssi
rhensen  20632  0.0  0.1  13640  2012 pts/27   SNs  01:30   0:00 -bash
rhensen  21082  0.0  0.0   8588  1000 pts/27   RN+  02:04   0:00 ps u

Suspend

Definition

Suspending a process allows that process to be paused, but not stopped entirely. This is done by issuing the SUSPEND character to the program (CTRL-Z).

Demonstration

The jobs command can be used to show programs that are running. Stopped jobs have been suspended, which means they are still in memory but are not currently running.

lab46:~$ jobs
[1]+  Stopped                 man man

Text Processing

Definition

Text processing involves the creation and manipulation of the ASCII data within text files.

Demonstration

There are various programs available that can be used to create or edit text. In the example below, three text editors are shown. If the text file called “text” already exists it will be opened for manipulation in the program. If the file does not exist it will be created.

lab46:~$ vi text
lab46:~$ vim text
lab46:~$ pico text

The VI Editor

Definition

VI is a text editing utility created for Unix. It is a modal text editor, which means that there are two modes that it can operate in. The command mode interprets input as commands and the insert mode interprets input as text that should be typed into the file.

Demonstration

This is what a VI screen generally looks like. The lines of text are displayed at the top of the screen and commands that are issued are displayed at the bottom.

Hello. This is text.
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
~
"text" [New File]                                             0,0-1         All

unix Objective

To become more familiar with how the Unix shell operates.

Definition

Since I have been working with Unix in a command line environment, learning to use its features and understanding what the system is doing is less transparent than it is when working in a GUI. I want to obtain a better understanding of how Unix works and what happens when commands are issued.

Method

I will feel that I have gained ground in this area when I have a firm understanding of how the Unix system operates apart from its utilities.

Measurement

I would like to become more familiar with various aspects of the Unix environment, such as how to manage processes, how to find files, and how to make use of device files.

Analysis

The lab that involved job management was particularly useful in this area. I learned a lot about how to see what processes are running on the system and the different ways to manage them. The differences between killing, suspending, and interrupting a process, for example, are important concepts for working with the operating system. I also found it interesting that Unix allows communication between users by writing to a device file that the user is connected through. I feel that I have also become more comfortable with using utilities like the find command and grep. I believe that these concepts have enhanced my overall understanding of what makes the operating system work once it has been broken down.

Experiments

Experiment 4

Question

What is the question you'd like to pose for experimentation? State it here.

Resources

Collect information and resources (such as URLs of web resources), and comment on knowledge obtained that you think will provide useful background information to aid in performing the experiment.

Hypothesis

Based on what you've read with respect to your original posed question, what do you think will be the result of your experiment (ie an educated guess based on the facts known). This is done before actually performing the experiment.

State your rationale.

Experiment

How are you going to test your hypothesis? What is the structure of your experiment?

Data

Perform your experiment, and collect/document the results here.

Analysis

Based on the data collected:

  • Was your hypothesis correct?
  • Was your hypothesis not applicable?
  • Is there more going on than you originally thought? (shortcomings in hypothesis)
  • What shortcomings might there be in your experiment?
  • What shortcomings might there be in your data?

Conclusions

What can you ascertain based on the experiment performed and data collected? Document your findings here; make a statement as to any discoveries you've made.

Experiment 5

Question

What is the question you'd like to pose for experimentation? State it here.

Resources

Collect information and resources (such as URLs of web resources), and comment on knowledge obtained that you think will provide useful background information to aid in performing the experiment.

Hypothesis

Based on what you've read with respect to your original posed question, what do you think will be the result of your experiment (ie an educated guess based on the facts known). This is done before actually performing the experiment.

State your rationale.

Experiment

How are you going to test your hypothesis? What is the structure of your experiment?

Data

Perform your experiment, and collect/document the results here.

Analysis

Based on the data collected:

  • Was your hypothesis correct?
  • Was your hypothesis not applicable?
  • Is there more going on than you originally thought? (shortcomings in hypothesis)
  • What shortcomings might there be in your experiment?
  • What shortcomings might there be in your data?

Conclusions

What can you ascertain based on the experiment performed and data collected? Document your findings here; make a statement as to any discoveries you've made.

Retest 2

State Experiment

The experiment that I am retesting is Mason Faucett's second experiment shown here:

http://lab46.corning-cc.edu/opus/spring2012/mfaucet2/start#experiment_2

The question posed in this experiment is whether or not it is possible to remove a directory while there are still files in it.

Resources

A resource that I would like to add to this experiment is the Wikipedia articles for the rmdir and rm commands.

http://en.wikipedia.org/wiki/Rmdir

http://en.wikipedia.org/wiki/Rm_(Unix)

These articles are useful for explaining how these two commands work and how they can be used to achieve the results that the original experiment was looking for.

Hypothesis

The original experiment's hypothesis is that it is possible to remove a directory that contains files, since in a GUI it is possible to remove a folder that contains other files. I believe that the original experiment was correct in that it is only possible to remove an empty directory using the rmdir command (the command's description provided in the manual page reads “remove empty directories”). However, I have found an option for the rm command that I believe may achieve this effect.

Experiment

I was able to perform the original experiment again to show that the rmdir command cannot be used to remove a non-empty directory. What I would like to add to the experiment is attempting to remove a directory with the rm command and the -r option. The manual page for rm says that the -r option can be used to “remove directories and their contents recursively.”

Data

The results of attempting to remove a directory called “direct” with the rm -r command are shown here.

lab46:~$ rm -r direct
rm: descend into directory `direct'? y
rm: remove regular empty file `direct/file1'? y
rm: remove regular empty file `direct/file2'? y
rm: remove directory `direct'? y

Analysis

While the -r option works as described and the directory and its contents were removed, there was a prompt to confirm that each file should be removed. This seems to be a problem in the case of removing directories with a large number of files. I decided to try to streamline this process by adding the -f option as well, which will make the rm command never prompt. The results of removing the same directory called “direct” are below.

lab46:~$ rm -rf direct

By simply issuing the command, the directory and all of its contents are removed without prompting the user or displaying any messages.

Conclusions

The rm command will actually remove a directory by first removing the contents and then the directory itself, which means that once again only an empty directory can be removed. In this sense, the original experiment was correct. However, this invocation of the rm command seems to be consistent with the results that the original experiment was trying to achieve since it removes a directory and its contents with a single command.

Part 3

Entries

Entry 9: April 6, 2012

This week’s case study was focused on the topic of data manipulation. The two main utilities that I was introduced to through this activity were the data dump utility (dd) and a binary editor (bvi). The data dump command can be used to copy the contents of one file into another. This utility is especially interesting since it allows the user to specify which blocks of data to copy, allowing the user to essentially pick and choose which bytes of a file should be moved. The binary editor is similar to the vi editor, only it operates on binary data instead of text. When viewing a file through a binary editor, each byte of information is displayed as a two-digit hexadecimal value.

The most interesting aspect of this activity for me was extracting other files from a larger one. To do this, I was provided with a file that I viewed with the bvi utility. The first 3 KB of data, and in fact the majority of the file, were filled entirely with zeros. However, there were three ranges where other information was present. The data dump utility could be used to extract the information contained in these ranges. To check that I had extracted the correct data, I used the file command to confirm that each result was recognized as an actual file type. When extracted, these three ranges were revealed to be an executable file, a text file, and a gzip compressed file, all of which contained messages. I found this lab very interesting since it demonstrated how each bit of data contained in a file can be moved around or edited.
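
The general form of such a command is sketched below; the file names and offsets here are made up rather than the actual values from the lab file. Setting the block size to one byte lets skip and count be given directly in bytes:

lab46:~$ dd if=mystery.file of=extracted bs=1 skip=1024 count=512   # copy 512 bytes starting at byte 1024
lab46:~$ file extracted                                             # check whether the result is a recognizable file type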

Entry 10: April 19, 2012

This lab focuses on the use of filters. These filters were applied to a text file which contains a database of students with various pieces of information. Filters can be applied to this file through the use of pipes in order to sort through the data and display relevant information. Many of the filtering techniques used in this lab have already been explored in some capacity. The grep utility is used in order to search through the database entries based on some criteria. The sed utility is also used to edit the output to change what information is displayed.

The cut utility is introduced in this activity, and in many ways it is better suited for manipulating the output in this circumstance than sed is. The cut utility allows the user to specify a character or a string of characters that separates the fields of data and then specify which fields should be kept in the output. Another new utility is tr, which is used to translate certain characters into others and functions very similarly to sed’s substitution function. The head and tail programs are used to display only the first or last several lines of output respectively.
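
For instance, using the sample.db file shown in the keywords section below, a couple of short pipelines pull out individual fields:

lab46:~$ head -3 sample.db | cut -d: -f1      # first (name) field of the first three lines
name
Jim Smith
Adelle Wilson
lab46:~$ cut -d: -f3 sample.db | tr 'a-z' 'A-Z'   # major field, translated to upper case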

Entry 11: April 20, 2012

This case study dealt with the concept of groups and security features of Unix. This topic mostly deals with file permissions, which are used to specify what actions different groups of users are allowed to perform on a file. The different actions that a user is allowed to do (or prevented from doing) to a file are read, write, and search or execute depending on the file type. These permissions are different for the file’s owner, the security group associated with the file, and everyone else. These permissions can be expressed symbolically, as they are in a long directory listing, or as an octal value, and they can be changed with the chmod utility.

This activity also demonstrated how to determine user and group ID numbers. Each user has a unique ID number (with the root user being 0) that Unix uses to identify them, and similarly each group is also identified by a number. The concept of a umask is also introduced, which is used to specify the permissions that are given to a file when it is created. A umask is defined by three octal digits (one for each type of user) that are applied to the default permissions to specify which permissions should be withheld. This case study was useful for demonstrating how permissions can be manipulated and how they affect access to different files on a system.
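
The numeric IDs for the current user can be displayed with the id command (the numbers shown here are only illustrative):

lab46:~$ id
uid=5123(rhensen) gid=5000(lab46) groups=5000(lab46)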

Entry 12: April Day, 2012

This is a sample format for a dated entry. Please substitute the actual date for “Month Day, Year”, and duplicate the level 4 heading to make additional entries.

As an aid, feel free to use the following questions to help you generate content for your entries:

  • What action or concept of significance, as related to the course, did you experience on this date?
  • Why was this significant?
  • What concepts are you dealing with that may not make perfect sense?
  • What challenges are you facing with respect to the course?

Remember that 4 is just the minimum number of entries. Feel free to have more.

unix Keywords

Shell Scripting

Definition

A shell script is an executable text file that contains a list of commands and operations to be performed by a command line interpreter.

Demonstration

#!/bin/bash
echo "Please enter your birth year"
read birth
let year=`date +%Y`
let age=$year-$birth
echo $age

lab46:~$ ./age.sh
Please enter your birth year
1986
26

Filtering

Definition

Filtering can be applied to a set of data in order to exclude unnecessary information.

Demonstration

This example shows how a filter can be applied to display only the first 3 lines of a text file.

lab46:~$ head -3 sample.db
name:sid:major:year:favorite candy*
Jim Smith:105743:Economics:Sophomore:Lollipops*
Adelle Wilson:594893:Sociology:Junior:Ju-Ju Fish*

Regular Files

Definition

Regular file is a term used to distinguish ordinary files from the other types of files, called special files. Regular files are specified by a “-” for the file type flag in a long directory listing. Types of regular files include text files, binary data files, and executable files.

Demonstration

A long listing displaying a regular file.

lab46:~$ ls -l file
-rw-r----- 1 rhensen lab46 90 May  8 21:46 file

Directory

Definition

A directory is a file type that is able to contain other files. A directory is marked with a “d” for the file flag in a long directory listing. Directories are the most common type of special files found in Unix.

Demonstration

A long listing of a directory file.

lab46:~$ ls -ld bin
drwxr-xr-x 2 rhensen lab46 18 Mar  9 22:49 bin

Permissions

Definition

Permissions specify how users are able to access a file. Permissions include being able to read, write to, and search or execute a file depending on the type of file it is. Permissions are defined separately for the owner of the file, the security group associated with the file, and everyone else. These are displayed symbolically as a list of three sets of rwx bits.

Demonstration

A long directory listing displays the symbolic permissions for each type of user. The first set applies to the owner of the file, the second to the group, and the third to the world. The owner of the file and the group associated with it are displayed afterwards.

drwxr-xr-x 2 rhensen lab46    4096 Apr  7 02:10 courselist
-rw-r--r-- 1 rhensen lab46     666 Feb  9 21:50 courses.html
-rwxrwxrwx 1 rhensen lab46     794 May  3 13:03 cs0xd.sh
-rw-r----- 1 rhensen lab46    8186 Apr 19 16:59 data.file

umask

Definition

The umask is a value that can be specified by the user to set the permissions for new files that are created. A umask specifies which permissions should not be given when a new file is created.

Demonstration

This shows that when a umask is set to 000 (which will not change the default permissions) a regular file is created with the permission 666. When a umask of 022 is set, a new regular file that is created has a permission of 644.

lab46:~$ umask 000
lab46:~$ touch test1
lab46:~$ ls -l test1
-rw-rw-rw- 1 rhensen lab46 0 May  9 23:17 test1
lab46:~$ umask 022
lab46:~$ touch test2
lab46:~$ ls -l test2
-rw-r--r-- 1 rhensen lab46 0 May  9 23:18 test2

Data Dump

Definition

The data dump program (dd) can be used to transfer data from one file into another file. The user has a great deal of control when specifying which bytes of data are to be copied over and where to place them in the output file.

Demonstration

An example of dd being used to copy all of the information from a file called “pattern” into a file called “test1”.

lab46:~$ dd if=pattern of=test1
0+1 records in
0+1 records out
30 bytes (30 B) copied, 0.0437978 s, 0.7 kB/s

Binary Editor

Definition

This concept is similar to a text editor, only dealing with binary data instead of text. A binary editor, such as bvi, allows a user to view and edit each byte of a file.

Demonstration

This is what a screen in bvi looks like. The leftmost column shows the byte offset of each line in hexadecimal. The far right is the ASCII equivalent of the binary data (although this is typically meaningless when not dealing with text files). The middle shows the values of the bytes written in hexadecimal.

00000000  6E 61 6D 65 3A 73 69 64 3A 6D 61 6A 6F 72 3A 79 name:sid:major:y
00000010  65 61 72 3A 66 61 76 6F 72 69 74 65 20 63 61 6E ear:favorite can
00000020  64 79 2A 0A 4A 69 6D 20 53 6D 69 74 68 3A 31 30 dy*.Jim Smith:10
00000030  35 37 34 33 3A 45 63 6F 6E 6F 6D 69 63 73 3A 53 5743:Economics:S
00000040  6F 70 68 6F 6D 6F 72 65 3A 4C 6F 6C 6C 69 70 6F ophomore:Lollipo
00000050  70 73 2A 0A 41 64 65 6C 6C 65 20 57 69 6C 73 6F ps*.Adelle Wilso
00000060  6E 3A 35 39 34 38 39 33 3A 53 6F 63 69 6F 6C 6F n:594893:Sociolo
00000070  67 79 3A 4A 75 6E 69 6F 72 3A 4A 75 2D 4A 75 20 gy:Junior:Ju-Ju
00000080  46 69 73 68 2A 0A 53 61 72 61 68 20 42 69 6C 6C Fish*.Sarah Bill
00000090  69 6E 67 73 3A 39 33 38 33 38 39 3A 41 63 63 6F ings:938389:Acco
000000A0  75 6E 74 69 6E 67 3A 46 72 65 73 68 6D 61 6E 3A unting:Freshman:
000000B0  54 69 63 2D 54 61 63 73 2A 0A 45 72 69 63 20 56 Tic-Tacs*.Eric V
000000C0  69 6E 63 65 6E 74 3A 31 30 30 31 31 31 39 3A 42 incent:1001119:B
000000D0  69 6F 6C 6F 67 79 3A 46 72 65 73 68 6D 61 6E 3A iology:Freshman:
000000E0  4C 6F 6C 6C 69 70 6F 70 73 2A 0A 4C 69 6E 75 73 Lollipops*.Linus
000000F0  20 54 6F 72 76 61 6C 64 73 3A 34 34 33 32 30 30  Torvalds:443200
00000100  31 3A 43 6F 6D 70 75 74 65 72 20 53 63 69 65 6E 1:Computer Scien
00000110  63 65 3A 53 65 6E 69 6F 72 3A 53 6E 69 63 6B 65 ce:Senior:Snicke
00000120  72 73 2A 0A 41 6C 61 6E 20 43 6F 78 3A 34 30 30 rs*.Alan Cox:400
00000130  34 39 33 30 30 3A 43 6F 6D 70 75 74 65 72 20 53 49300:Computer S
00000140  63 69 65 6E 63 65 3A 53 65 6E 69 6F 72 3A 57 68 cience:Senior:Wh
00000150  6F 70 70 65 72 73 2A 0A 41 6C 61 6E 20 54 75 72 oppers*.Alan Tur
00000160  69 6E 67 3A 34 30 30 33 30 33 33 33 3A 43 6F 6D ing:40030333:Com
"sample.db" 898 bytes                          00000000  \156 0x6E 110 'n'

unix Objective

To become familiar with more complex Unix concepts.

Definition

My objective for this portion of the semester was to become more familiar with some of the more complex concepts in Unix, such as regular expressions and scripting.

Method

To determine my progress in this area I will look at my comprehension of how to use the different symbols of regular expressions. It is also useful to be able to think in terms of regular expressions and use them to obtain desired results.

Measurement

I will be able to measure my progress by looking at how I am doing at solving exercises that involve the use of regular expressions and writing scripts.

Analysis

I believe that in this portion of the semester I have become more proficient in using regular expressions effectively. When the concept was first introduced to me it reminded me of wildcards, which are also used to find matches to patterns. I felt that wildcards were fairly easy to use and straightforward. Understanding how to use them has made regular expressions easier; however, the complexity of RegEx is much greater and there is much more that can be done with them. I feel that after working through the various labs in this portion of the semester I have a better understanding of how regular expressions are used. I believe I still sometimes forget the ways that different tools handle RegEx differently, such as basic grep being unable to handle extended metacharacters. I still find regular expressions challenging, especially when implementing them in a script to perform complex functions, but I feel that I have a good understanding of the concept and can use them effectively.

Experiments

Experiment 7

Question

Is it possible to use the dd command to combine text files?

Resources

The manual page for dd was used to formulate this experiment.

Hypothesis

The hypothesis that I would like to test is that it is possible to use the dd command to extract the contents of text files into another file in order to combine them. I believe that this can be done, although I am testing it because I think it is possible that text files may contain some file header information that cannot be viewed. If such information precedes the text, I believe this test will not work. However, my understanding of text files is that they only contain text with no extraneous formatting information.

Experiment

For this experiment I will create two text files that will be extracted to the same destination file. To keep the second dd command from overwriting the first, I will use the seek option to place the second set of data after the first.

Data

Performing the experiment yielded the following results:

lab46:~$ cat small1
the answer to life, the universe, and everything
lab46:~$ cat small2
42
lab46:~$ dd if=small1 of=big
0+1 records in
0+1 records out
49 bytes (49 B) copied, 0.0419779 s, 1.2 kB/s
lab46:~$ ls -l big
-rw-r--r-- 1 rhensen lab46 49 May  9 21:53 big
lab46:~$ dd if=small2 of=big seek=49
0+1 records in
0+1 records out
3 bytes (3 B) copied, 0.0154834 s, 0.2 kB/s
lab46:~$ cat big
the answer to life, the universe, and everything
42

Analysis

The results of this experiment show that the contents of different text files extracted into one file remain readable. I was unsure of whether the contents of each file would appear on separate lines or as a single line, but these results show that each file's contents appear as a separate line.

Conclusions

This test shows that it is possible to merge text files using this method. The fact that both lines are readable also shows that there is no file header information that interferes with the lines of text being readable.

Experiment 8

Question

What is the question you'd like to pose for experimentation? State it here.

Resources

Collect information and resources (such as URLs of web resources), and comment on knowledge obtained that you think will provide useful background information to aid in performing the experiment.

Hypothesis

Based on what you've read with respect to your original posed question, what do you think will be the result of your experiment (ie an educated guess based on the facts known). This is done before actually performing the experiment.

State your rationale.

Experiment

How are you going to test your hypothesis? What is the structure of your experiment?

Data

Perform your experiment, and collect/document the results here.

Analysis

Based on the data collected:

  • Was your hypothesis correct?
  • Was your hypothesis not applicable?
  • Is there more going on than you originally thought? (shortcomings in hypothesis)
  • What shortcomings might there be in your experiment?
  • What shortcomings might there be in your data?

Conclusions

What can you ascertain based on the experiment performed and data collected? Document your findings here; make a statement as to any discoveries you've made.

Retest 3

Perform the following steps:

State Experiment

Whose existing experiment are you going to retest? Provide the URL, note the author, and restate their question.

Resources

Evaluate their resources and commentary. Answer the following questions:

  • Do you feel the given resources are adequate in providing sufficient background information?
  • Are there additional resources you've found that you can add to the resources list?
  • Does the original experimenter appear to have obtained a necessary fundamental understanding of the concepts leading up to their stated experiment?
  • If you find a deviation in opinion, state why you think this might exist.

Hypothesis

State their experiment's hypothesis. Answer the following questions:

  • Do you feel their hypothesis is adequate in capturing the essence of what they're trying to discover?
  • What improvements could you make to their hypothesis, if any?

Experiment

Follow the steps given to recreate the original experiment. Answer the following questions:

  • Are the instructions correct in successfully achieving the results?
  • Is there room for improvement in the experiment instructions/description? What suggestions would you make?
  • Would you make any alterations to the structure of the experiment to yield better results? What, and why?

Data

Publish the data you have gained from your performing of the experiment here.

Analysis

Answer the following:

  • Does the data seem in-line with the published data from the original author?
  • Can you explain any deviations?
  • How about any sources of error?
  • Is the stated hypothesis adequate?

Conclusions

Answer the following:

  • What conclusions can you make based on performing the experiment?
  • Do you feel the experiment was adequate in obtaining a further understanding of a concept?
  • Does the original author appear to have gotten some value out of performing the experiment?
  • Any suggestions or observations that could improve this particular process (in general, or specifically you, or specifically for the original author).