Part 3

Entries

Entry 9: April 6, 2012

This week’s case study focused on data manipulation. The two main utilities I was introduced to through this activity were the data dump utility (dd) and a binary editor (bvi). The dd command copies the contents of one file into another. It is especially interesting because it lets the user specify which blocks of data to copy, essentially picking and choosing which bytes of a file should be moved. The binary editor is similar to the vi editor, only it operates on binary data instead of text. When a file is viewed through a binary editor, each byte is displayed as a two-digit hexadecimal value.

The most interesting aspect of this activity for me was extracting other files from a larger one. To do this, I was provided with a file that I viewed with the bvi utility. The first 3 KB of data and the majority of the rest of the file were filled entirely with zeros. However, there were three ranges that contained other information, and dd could be used to extract the data in those ranges. To check that I had extracted the correct data, I used the file command to confirm that each result was recognized as an actual file type. When extracted, these three ranges turned out to be an executable file, a text file, and a gzip-compressed file, all of which contained messages. I found this lab very interesting since it demonstrated how every byte of data contained in a file can be moved around or edited.
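
The extraction step can be sketched roughly as follows. The lab's actual file and byte ranges are not reproduced here, so the file name container.bin, the offset 3072, and the 12-byte length are all invented for illustration.

```shell
#!/bin/sh
# Sketch: carve a byte range out of a mostly-zero container file with dd.
# container.bin, the 3072-byte offset, and the 12-byte length are made up.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a container: 3 KiB of zeros, a 12-byte message, then 1 KiB of zeros.
dd if=/dev/zero of=container.bin bs=1024 count=3 2>/dev/null
printf 'hello world\n' >> container.bin
dd if=/dev/zero bs=1024 count=1 2>/dev/null >> container.bin

# With bs=1, skip and count are measured in single bytes:
# skip the first 3072 bytes of input, then copy the next 12.
dd if=container.bin of=message.txt bs=1 skip=3072 count=12 2>/dev/null

# Identify the recovered data (if the file utility is installed), then show it.
command -v file >/dev/null 2>&1 && file message.txt
cat message.txt
```

Using bs=1 is slow on large files but keeps the arithmetic simple, since the offsets read from the binary editor can be used directly once converted from hexadecimal to decimal.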

Entry 10: April 19, 2012

This lab focuses on the use of filters. These filters were applied to a text file containing a database of students with various pieces of information about each one. Filters can be applied to this file through pipes in order to sort through the data and display only the relevant information. Many of the filtering techniques used in this lab have already been explored in some capacity. The grep utility is used to search the database entries based on some criteria. The sed utility is also used to edit the output and change what information is displayed.

The cut utility is introduced in this activity, and in many ways it is better suited than sed for manipulating the output in this circumstance. The cut utility allows the user to specify the character that separates the fields of data and then select which fields should be kept in the output. Another new utility is tr, which translates a set of characters into another set, one character at a time; for single characters it behaves much like sed’s substitution function. The head and tail programs are used to display only the first or last several lines of output, respectively.
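
As a rough illustration of how these filters chain together, the sketch below rebuilds a few records of sample.db (taken from the demonstrations elsewhere on this page) in a temporary directory and runs two pipelines over them. The particular pipelines are my own examples, not the ones from the lab.

```shell
#!/bin/sh
# Sketch: chain grep, cut, tr, head, and tail on a colon-delimited file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > sample.db <<'EOF'
name:sid:major:year:favorite candy*
Jim Smith:105743:Economics:Sophomore:Lollipops*
Adelle Wilson:594893:Sociology:Junior:Ju-Ju Fish*
Sarah Billings:938389:Accounting:Freshman:Tic-Tacs*
Eric Vincent:1001119:Biology:Freshman:Lollipops*
EOF

# grep keeps the matching records, cut keeps fields 1 and 5 (split on ':'),
# tr -d deletes the trailing '*', and the last tr turns ':' into a tab.
grep 'Freshman' sample.db | cut -d: -f1,5 | tr -d '*' | tr ':' '\t'

# tail -n +2 drops the header line, head -n 2 keeps the next two records,
# and cut prints only the name field.
tail -n +2 sample.db | head -n 2 | cut -d: -f1
```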

Entry 11: April 20, 2012

This case study dealt with groups and the security features of Unix. The topic mostly concerns file permissions, which specify what actions different classes of users are allowed to perform on a file. The actions that a user can be allowed to perform (or prevented from performing) are read, write, and either execute or search, depending on the file type: execute for regular files, search for directories. These permissions are set separately for the file’s owner, the group associated with the file, and everyone else. Permissions can be displayed symbolically, as they are in a long directory listing, or as an octal value, and they can be changed with the chmod utility.

This activity also demonstrated how to determine user and group ID numbers. Each user has a unique ID number (with the root user being 0) that Unix uses to identify them, and similarly each group is identified by a number. The concept of a umask is also introduced, which controls the permissions that are given to a file when it is created. A umask is written as three octal digits (one for each class of user) that are applied to the default permissions to specify which permissions should be removed. This case study was useful for demonstrating how permissions can be manipulated and how they affect access to different files on a system.
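
One way to look up these numbers is the id utility. This is a minimal sketch; the values printed depend entirely on the account running it, so no output is shown.

```shell
#!/bin/sh
# Sketch: look up the numeric IDs the system uses for the current user.
uid=$(id -u)    # numeric user ID (0 for root)
gid=$(id -g)    # numeric ID of the user's primary group

echo "UID is $uid, primary GID is $gid"

# Numeric IDs of every group the user belongs to.
id -G
```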

unix Keywords

Shell Scripting
Definition

A shell script is an executable text file that contains a list of commands and operations to be performed by a command line interpreter.

Demonstration
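#!/bin/bash
# Ask for a birth year and print the age reached in the current year.
echo "Please enter your birth year"
read -r birth
year=$(date +%Y)         # current four-digit year
age=$((year - birth))    # integer arithmetic
echo "$age"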
lab46:~$ ./age.sh
Please enter your birth year
1986
26
Filtering
Definition

Filtering can be applied to a set of data in order to exclude unnecessary information.

Demonstration

This example shows how a filter can be applied to display only the first 3 lines of a text file.

lab46:~$ head -3 sample.db
name:sid:major:year:favorite candy*
Jim Smith:105743:Economics:Sophomore:Lollipops*
Adelle Wilson:594893:Sociology:Junior:Ju-Ju Fish*
Regular Files
Definition

The term regular file distinguishes ordinary files from the other types of files, which are called special files. Regular files are marked with a “-” for the file flag in a long directory listing. Types of regular files include text files, binary data files, and executable files.

Demonstration

A long listing displaying a regular file.

lab46:~$ ls -l file
-rw-r----- 1 rhensen lab46 90 May  8 21:46 file
Directory
Definition

A directory is a file type that is able to contain other files. A directory is marked with a “d” for the file flag in a long directory listing. Directories are the most common type of special files found in Unix.

Demonstration

A long listing of a directory file.

lab46:~$ ls -ld bin
drwxr-xr-x 2 rhensen lab46 18 Mar  9 22:49 bin
Permissions
Definition

Permissions specify how users are able to access a file. Permissions include being able to read, write to, and search or execute a file depending on the type of file it is. Permissions are defined separately for the owner of the file, the security group associated with the file, and everyone else. These are displayed symbolically as a list of three sets of rwx bits.

Demonstration

A long directory listing displays the symbolic permissions for each type of user. The first set applies to the owner of the file, the second to the group, and the third to the world. The owner of the file and the group associated with it are displayed afterwards.

drwxr-xr-x 2 rhensen lab46    4096 Apr  7 02:10 courselist
-rw-r--r-- 1 rhensen lab46     666 Feb  9 21:50 courses.html
-rwxrwxrwx 1 rhensen lab46     794 May  3 13:03 cs0xd.sh
-rw-r----- 1 rhensen lab46    8186 Apr 19 16:59 data.file
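
The same permissions can be set either in octal or symbolically with chmod. The following is a small sketch on a scratch file; the file name data.file is reused from the listing above only as an example, and the file is created in a temporary directory.

```shell
#!/bin/sh
# Sketch: set permissions in octal, then adjust them symbolically.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch data.file

# Octal: 6 = rw- for the owner, 4 = r-- for the group, 0 = --- for others.
chmod 640 data.file
ls -l data.file | cut -c1-10     # -rw-r-----

# Symbolic: add write for the group and read for everyone else.
chmod g+w,o+r data.file
ls -l data.file | cut -c1-10     # -rw-rw-r--
```

Each octal digit is the sum of 4 (read), 2 (write), and 1 (execute or search), so 640 and the symbolic string rw-r----- describe the same permissions.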
umask
Definition

The umask is a value that can be specified by the user to set the permissions for new files that are created. A umask specifies which permissions should not be given when a new file is created.

Demonstration

This shows that when the umask is set to 000 (which leaves the default permissions unchanged), a regular file is created with the permissions 666. When a umask of 022 is set, a new regular file is created with the permissions 644.

lab46:~$ umask 000
lab46:~$ touch test1
lab46:~$ ls -l test1
-rw-rw-rw- 1 rhensen lab46 0 May  9 23:17 test1
lab46:~$ umask 022
lab46:~$ touch test2
lab46:~$ ls -l test2
-rw-r--r-- 1 rhensen lab46 0 May  9 23:18 test2
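
The arithmetic behind these results can be sketched in the shell itself: the permissions of a new regular file are the default 666 with every bit that is set in the umask cleared. The function name new_file_mode is made up for this example.

```shell
#!/bin/sh
# Sketch: compute the mode a new regular file receives under a given umask.
# new_file_mode is a hypothetical helper, not a standard command.
new_file_mode() {
    # The leading 0 makes the shell read each number as octal;
    # "& ~mask" clears every permission bit that the umask names.
    printf '%03o\n' $(( 0666 & ~0$1 ))
}

new_file_mode 000    # 666
new_file_mode 022    # 644
new_file_mode 027    # 640
```

Directories start from a default of 777 instead of 666, so the same umask of 022 would give a new directory the permissions 755.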
Data Dump
Definition

The data dump program (dd) can be used to transfer data from one file into another file. The user has a great deal of control when specifying which bytes of data are to be copied over and where to place them in the output file.

Demonstration

An example of dd being used to extract all the information of a file called “pattern” into a file called “test1”.

lab46:~$ dd if=pattern of=test1
0+1 records in
0+1 records out
30 bytes (30 B) copied, 0.0437978 s, 0.7 kB/s 
Binary Editor
Definition

A binary editor is similar to a text editor, only it deals with binary data instead of text. A binary editor, such as bvi, allows a user to view and edit each byte of a file.

Demonstration

This is what a screen in bvi looks like. The leftmost column is the offset of the first byte on each line, written in hexadecimal. The middle shows the values of the bytes written in hexadecimal. The far right is the ASCII equivalent of the binary data (although this is typically meaningless when not dealing with text files).

00000000  6E 61 6D 65 3A 73 69 64 3A 6D 61 6A 6F 72 3A 79 name:sid:major:y
00000010  65 61 72 3A 66 61 76 6F 72 69 74 65 20 63 61 6E ear:favorite can
00000020  64 79 2A 0A 4A 69 6D 20 53 6D 69 74 68 3A 31 30 dy*.Jim Smith:10
00000030  35 37 34 33 3A 45 63 6F 6E 6F 6D 69 63 73 3A 53 5743:Economics:S
00000040  6F 70 68 6F 6D 6F 72 65 3A 4C 6F 6C 6C 69 70 6F ophomore:Lollipo
00000050  70 73 2A 0A 41 64 65 6C 6C 65 20 57 69 6C 73 6F ps*.Adelle Wilso
00000060  6E 3A 35 39 34 38 39 33 3A 53 6F 63 69 6F 6C 6F n:594893:Sociolo
00000070  67 79 3A 4A 75 6E 69 6F 72 3A 4A 75 2D 4A 75 20 gy:Junior:Ju-Ju
00000080  46 69 73 68 2A 0A 53 61 72 61 68 20 42 69 6C 6C Fish*.Sarah Bill
00000090  69 6E 67 73 3A 39 33 38 33 38 39 3A 41 63 63 6F ings:938389:Acco
000000A0  75 6E 74 69 6E 67 3A 46 72 65 73 68 6D 61 6E 3A unting:Freshman:
000000B0  54 69 63 2D 54 61 63 73 2A 0A 45 72 69 63 20 56 Tic-Tacs*.Eric V
000000C0  69 6E 63 65 6E 74 3A 31 30 30 31 31 31 39 3A 42 incent:1001119:B
000000D0  69 6F 6C 6F 67 79 3A 46 72 65 73 68 6D 61 6E 3A iology:Freshman:
000000E0  4C 6F 6C 6C 69 70 6F 70 73 2A 0A 4C 69 6E 75 73 Lollipops*.Linus
000000F0  20 54 6F 72 76 61 6C 64 73 3A 34 34 33 32 30 30  Torvalds:443200
00000100  31 3A 43 6F 6D 70 75 74 65 72 20 53 63 69 65 6E 1:Computer Scien
00000110  63 65 3A 53 65 6E 69 6F 72 3A 53 6E 69 63 6B 65 ce:Senior:Snicke
00000120  72 73 2A 0A 41 6C 61 6E 20 43 6F 78 3A 34 30 30 rs*.Alan Cox:400
00000130  34 39 33 30 30 3A 43 6F 6D 70 75 74 65 72 20 53 49300:Computer S
00000140  63 69 65 6E 63 65 3A 53 65 6E 69 6F 72 3A 57 68 cience:Senior:Wh
00000150  6F 70 70 65 72 73 2A 0A 41 6C 61 6E 20 54 75 72 oppers*.Alan Tur
00000160  69 6E 67 3A 34 30 30 33 30 33 33 33 3A 43 6F 6D ing:40030333:Com
"sample.db" 898 bytes                          00000000  \156 0x6E 110 'n'

unix Objective

To become familiar with more complex Unix concepts.

Definition

My objective for this portion of the semester was to become more familiar with some of the more complex concepts in Unix, such as regular expressions and scripting.

Method

To determine my progress in this area I will look at my comprehension of how to use the different symbols of regular expressions. It is also useful to be able to think in terms of regular expressions and use them to obtain desired results.

Measurement

I will be able to measure my progress by looking at how I am doing at solving exercises that involve the use of regular expressions and writing scripts.

Analysis

I believe that in this portion of the semester I have become more proficient in using regular expressions effectively. When the concept was first introduced to me it reminded me of wildcards, which are also used to match patterns. I felt that wildcards were fairly easy to use and straightforward. Understanding how to use them has made regular expressions easier; however, regular expressions are much more complex and there is much more that can be done with them. After working through the various labs in this portion of the semester, I feel I have a better understanding of how regular expressions are used. I still sometimes forget that different tools handle regular expressions differently; for example, grep does not treat extended metacharacters such as | and + as special by default, and needs egrep or the -E option for that. I still find regular expressions challenging, especially when using them in a script to perform complex functions, but I feel that I have a good understanding of the concept and can use them effectively.
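
The difference between basic and extended syntax in grep can be seen with a small test. This is only a sketch: the file years.txt and its contents are invented for the example.

```shell
#!/bin/sh
# Sketch: the same pattern under basic (grep) and extended (grep -E) syntax.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'Freshman' 'Sophomore' 'Junior' 'Senior' > years.txt

# In a basic regular expression '|' is an ordinary character, so this
# searches for the literal text "Junior|Senior" and counts 0 lines.
# grep exits non-zero when nothing matches, hence the "|| true".
grep -c 'Junior|Senior' years.txt || true

# With -E, '|' means alternation, so both lines are counted: 2.
grep -E -c 'Junior|Senior' years.txt
```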

Experiments

Experiment 7

Question

Is it possible to use the dd command to combine text files?

Resources

The manual page for dd was used to formulate this experiment.

Hypothesis

The hypothesis that I would like to test is that the dd command can be used to copy the contents of several text files into another file in order to combine them. I believe that this can be done, but I am testing it because it is possible that text files contain some file header information that cannot be viewed. If such information precedes the text in a file, I believe this test will not work. However, my understanding of text files is that they contain only text, with no extraneous formatting information.

Experiment

For this experiment I will create two text files whose contents will be copied to the same destination file. To prevent the second dd command from overwriting the output of the first, I will use the seek option to place the second set of data after the first.

Data

Performing the experiment yielded the following results:

lab46:~$ cat small1
the answer to life, the universe, and everything
lab46:~$ cat small2
42
lab46:~$ dd if=small1 of=big
0+1 records in
0+1 records out
49 bytes (49 B) copied, 0.0419779 s, 1.2 kB/s
lab46:~$ ls -l big
-rw-r--r-- 1 rhensen lab46 49 May  9 21:53 big
lab46:~$ dd if=small2 of=big seek=49
0+1 records in
0+1 records out
3 bytes (3 B) copied, 0.0154834 s, 0.2 kB/s
lab46:~$ cat big
the answer to life, the universe, and everything
42

Analysis

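The results of this experiment show that the contents of different text files remain readable after being copied into one file. I was unsure whether the contents of each file would appear on separate lines or as a single line; these results show that each file's contents appear on a separate line, because each source file ends with a newline character. One caveat is that dd's seek option counts output blocks (512 bytes by default) rather than bytes, so seek=49 places the second file at byte offset 49 × 512 = 25088 instead of directly after the first 49 bytes. The gap is filled with zero bytes, which cat does not visibly display on a terminal; a second ls -l on big would likely have shown a size much larger than 52 bytes.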

Conclusions

This test shows that it is possible to merge text files using this method. The fact that both lines are readable also shows that there is no file header information that interferes with the lines of text being readable.
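
A variant of the experiment that places the second file at an exact byte offset might look like the sketch below. It recreates small1 and small2 in a temporary directory; the bs=1 and conv=notrunc options are my additions and were not part of the original run.

```shell
#!/bin/sh
# Sketch: append small2 to big at an exact byte offset with dd.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'the answer to life, the universe, and everything\n' > small1
printf '42\n' > small2

dd if=small1 of=big 2>/dev/null
size=$(($(wc -c < big)))     # 49 bytes, including the newline

# seek counts output blocks, so bs=1 turns it into a byte offset.
# conv=notrunc tells dd not to truncate the existing output file.
dd if=small2 of=big bs=1 seek="$size" conv=notrunc 2>/dev/null

wc -c < big                  # 52 bytes: 49 + 3, with no padding in between
cat big
```

For plain concatenation, cat small1 small2 > big gives the same result with less effort; dd is mainly useful when the data has to land at a specific offset.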

opus/spring2012/rhensen/part3.txt · Last modified: 2012/05/09 22:03 by rhensen