Corning Community College


UNIX/Linux Fundamentals



Lab 0x8: The UNIX Programming Environment

Objective

Exploration of the development environment tools available in UNIX operating systems.

Reference

Background

C and UNIX

Soon after UNIX was created in its first assembly language implementation on that “little-used” PDP-7, a need arose for it to be portable so that it could run on other machines (including the PDP-11). An attempt was made in the language called B, but it did not prove adequate. Soon thereafter C was implemented, and UNIX was rewritten in that language.

Since then, C and UNIX have been tightly related to one another: UNIX facilities either appear very “C-like” or are easily accessible from C. In fact, the typical UNIX kernel is written almost entirely in C, aside from the architecture-specific assembly language needed to interface with basic hardware operations.

Because of this, UNIX is a very portable operating system, and as over thirty years of existence have shown, a port of some form or another can be found on nearly every computer architecture.

Terminology

  • Source Code: plain ASCII text written using the programming language's structures and grammar, along with comments.
  • Assembly Code: plain ASCII text containing the assembly mnemonics equivalent to your source code. (There exists a one-to-many relationship between a programming language command and the corresponding assembly functionality.)
  • Object Code: binary file containing the machine code equivalent of the assembly mnemonics. (There is a direct one-to-one relationship between an assembly mnemonic and the corresponding machine instruction.) This code by itself is not executable, as it still relies on functionality from the various system libraries.
  • Machine Code: the final product of the compilation process. It is in machine language, the form that the computer natively understands.
  • Architecture: implementation of machine code and supporting processor. Each processor family has its own instructions, and therefore machine language is not portable between different architectures/processors. Examples of architectures include: SPARC, AXP (Alpha), MIPS, x86 (Intel compatible), PowerPC/G3/G4/G5, m68k, and many others.
  • Platform: implementation of a computer architecture. The machine language for platforms of an identical architecture is the same, but the systems may be constructed differently enough to make most functionality incompatible. The term is also a bit fuzzy, as it sometimes refers to hardware and other times to software.

Examples of hardware platforms: Sun's SPARCstations, Silicon Graphics' Indigo2 workstations, m68k Macintosh vs. m68k Amiga.

Stretching the term a bit, examples of software platforms: Microsoft Windows 9x, Microsoft Windows NT, Linux, OpenBSD, Mac OS X, etc.

This is where the issue of binary compatibility comes into play: an executable built for a Sun SPARC system is unable to run natively on an Amiga, and vice versa. The same goes for trying to natively run an x86 Linux binary on an x86 Windows system. The system simply does not know how to execute code that was built for a different environment.

The Programming Model

The basic programming model of going from source code (what you write) to machine code (what the machine understands) will typically be as follows:

  1. Source Code - higher-level English-like statements you write
  2. Run the Compiler
    • Compiling Phase Begins …
  3. Lexical Analyzer - breaks your code into pieces, or “tokens”
  4. Syntax Analyzer - checks that the tokens form grammatically correct statements
  5. Semantic Analyzer - checks for detectable problems in meaning, such as type mismatches
  6. Intermediate Code Generator - converts the high-level code into a lower-level form
  7. Optimization - checks for patterns that can be simplified
  8. Code Generation - creates assembly code
    • … Compiling Phase Ends.
  9. Assembler - translates assembly code into system-specific Object Code
  10. Linker - links the Object Code with other Object Code from System Libraries
    • Finally: Machine Executable Code

For most purposes, the source code is portable among architectures. If the language is high-level enough, it abstracts away the specific details of the underlying hardware, so programs need not depend on them.

For this lab we will become familiar with the UNIX Programming Environment. The system is very diverse, and as such is host to a number of different programming languages and tools. Some of the common languages you will encounter on a UNIX system are C, C++, Assembly, and Java. For the purposes of this lab, we will focus almost exclusively on C, with some attention paid to C++ and Assembly Language. Sample programs will be created, and development tools such as the make utility will be explored.

The GNU C compiler

To compile a single source file, you would do the following:

lab46:~$ gcc -o binary source.c

Where binary is the name of the desired executable, and source.c is the name of the text file containing the program source.

The -o argument to gcc indicates the name of the output file. Since we have not asked gcc to stop at an intermediate stage, the compiler automatically performs the assembling and linking steps for us.

The GNU C++ compiler

To compile a single source file using C++, you would do the following:

lab46:~$ g++ -o binary source.cc

Where binary is the name of the desired executable, and source.cc is the name of the text file containing the program source.

g++ is basically a wrapper to the main GNU compiler, so options like -o are identical between the C and C++ compilers.

The GNU assembler

Unlike higher-level source code like C or C++, assembly language corresponds to the low-level implementation of a particular machine. Because of this, assembly language (like machine language) is machine dependent. This means that an assembly language program written for an x86 machine will not work on an Alpha machine. Not only will the binaries be incompatible (as described above), but the source code will be incompatible as well.

Assembly language affords you an extra level of flexibility and control over your programs that may be required in extreme circumstances. In any event, being familiar with assembly language will improve your knowledge of the computer and help you write better high-level language programs.

To assemble a single assembly file, you would do the following:

lab46:~$ as -o object.o assembly.s

Where object.o is the name of the desired object file, and assembly.s is the name of the source text file containing the assembly code.

This will actually leave your code at an intermediate step. In order to run it on a system, you will need to use the system's linker to resolve symbols from the appropriate system libraries. To do that, you would do at least the following:

lab46:~$ ld -o binary object.o

Procedure

Create a new directory in your home directory called devel/ and do the work for the next several steps in this directory.

In the devel/ subdirectory of the UNIX Public Directory, you will find three files: helloC.c, helloCPP.cc, and helloASM.S (or helloASM2.asm). Copy these to your local directory to work on in this lab.

Compile

1. Do the following:
a. Compile each of the source and assembly files into binary executables.
b. Show me how you accomplished this.
c. Verify that they all indeed work.

Make sure you compile each file to a uniquely named executable, so you can compare the three of them.

NOTE: If you encounter any warnings or errors during the compile, chances are there is a typo in your source code. Please include the exact message you receive (and source code) if reporting problems on the class mailing list or IRC.

Execute

2. Do the following:
a. At the lab46:~$ prompt, type: helloC
b. Do you get an error? If so, what is it?
c. At the lab46:~$ prompt, type: ./helloC
d. Does this work? Explain why this makes a difference compared to the prior entry.

Putting it together, piece by piece

Now that you've compiled and run the program, let's look at what we take for granted: each individual step.

We'll be picking apart the procedure for compiling and linking helloC.c, but similar methods are employed when dealing with C++ and other programming languages.

3. Focus on helloC.c and do the following:
a. What type of file does helloC.c appear to be?
b. What does file(1) say about it?
c. How about the other source files?

So, let's start taking a tour of all these various steps involved in compilation. First up, we need to convert the source to assembly.

4. Do the following:
a. Do a directory listing to take a quick inventory of files present.
b. At the lab46:~/devel$ prompt type: gcc -S helloC.c
c. Do a directory listing. Anything new?
d. What type of file is it? (See what file(1) says about it.)
e. At the lab46:~/devel$ prompt use cat(1) to display the contents of the new file.
f. What is it that you are looking at? How does it relate to helloC.c?

So, now that we've got it converted to assembly, let's assemble it.

5. Do the following:
a. Do a directory listing to take a quick inventory of files present.
b. At the lab46:~/devel$ prompt type: as -o hello.o helloC.s
c. Do a directory listing. Anything new?
d. What type of file is it? (See what file(1) says about it.)
e. What is it that you are looking at? How does it relate to helloC.s?

So, we're almost there. We've got the program in object form and just need to link it with the system libraries. Usually we'd just feed it to “ld”, the linker, but since this code makes use of the C library, we've got to include a couple of libraries. To save you lots of typing, I've thrown together a little script that has all this information typed out for you.

To complete this next task, please make sure you copy the link.sh script from the devel/ subdirectory of the UNIX Public Directory into your devel/ directory.

6. Do the following:
a. Copy the “link.sh” script from the devel/ subdirectory of the UNIX Public Directory.
b. View this script.
c. See that long “ld” line? That's what the compiler is doing for you.
d. Go ahead and run this script, following the instructions in it, to link together your final executable.

Now, you should hopefully have an executable.

7. Do the following:
a. At the “lab46:~/devel$” prompt type: file hello
b. What type of file is it?
c. At the “lab46:~/devel$” prompt type: file helloC
d. Does the output match that of the previous?
e. Go ahead and execute your new binary. Does it run? Show me what you typed and what happens.

Makefiles

A very popular tool used in program development is make. This tool comes in very handy when dealing with multiple source files that need compiling (and determining whether or not you need to recompile a particular object file).

It works by allowing the programmer to set up dependencies between the source and object files with a series of rules pertinent to the particular project. These rules are usually placed in a file called Makefile.

Every account on Lab46 is equipped with a customized Makefile in the src/ subdirectory of the home directory.

8. Do the following:
a. Copy helloC.c into your src/ subdirectory. How did you do it?
b. Do a directory listing. Do you see at least Makefile and helloC.c?
c. View the contents of Makefile.
d. Compile helloC.c with the Makefile. How did you do this?

9. Do the following:
a. In /var/public/unix/devel there is a subdirectory called “multifile/”. Copy this directory (and its contents) to your own devel/ in your home directory. How did you do this?
b. View the various files in this directory, and try to trace the flow of logic between them.
c. Read through the Makefile, and determine how to build this code.
d. How did you do this?

Code Efficiency: comparing file sizes

An interesting benchmark that can be conducted is to create programs that perform identical operations, and to compare the resulting file sizes and execution times of the executables.

10. Looking back on the original helloC, helloCPP, and helloASM binaries, do the following:
a. What size is each of the executables?
b. What observations can you make regarding differences in file size or execution speed?
c. View/compile helloJAVA.java and run the result (java helloJAVA). What is its size?
d. With Java being a higher level language (as C++ is, when compared to C and assembly), what do you think about the resulting compiled file? Is there perhaps more here than meets the eye?

Conclusions

All questions in this assignment require an action or response. Please organize your responses into an easily readable format and submit the final results to your instructor.

Your assignment is expected to be performed and submitted in a clear and organized fashion; messy or unorganized assignments may have points deducted. Be sure to adhere to the submission policy.

The successful results of the following actions will be considered for evaluation:

  • your responses to questions submitted at the following form:

http://lab46.corning-cc.edu/haas/content/unix/submit.php?lab8

  • the response from the form (received via e-mail) saved as lab8.txt to your ~/src/unix/ directory
  • addition/commit of ~/src/unix/lab8.txt into your repository (CS 0x0 sets you up to do this).

As always, the class mailing list and class IRC channel are available for assistance, but not answers.

haas/spring2011/unix/labs/lab8.txt · Last modified: 2011/03/26 18:24 by 127.0.0.1