Table of Contents

Part 1

Entries

September 5th, 2011

Wow! I'm a little late with getting started with my opus but I have started now and I will soon get all caught up. I'm actually entering this addition to my opus on September 19th but I did this work earlier this month. The first thing I did for this class was join the class irc. I first had to start a screen session

lab46:~$ screen 

After that I had to connect to the server

lab46:~$ irssi
[(status)] /server irc

I then had to join the Data Structures channel

[(status)]/join #data

Bam! Success… I can now happily discuss a numerous amount of topics ranging from linked lists to zombies with my fellow classmates. Can life get better? I submit that it cannot!

September 18th, 2011

After checking my class the class irc I discovered that I completely forgot to sign up to the class mailing list. I promptly fixed this by sending a request to join the class mailing list herehttp://lab46.corning-cc.edu/mailman/listinfo/data. After receiving the email I replied with the confirmation code to complete this task.

September 27th, 2011

Another day of programming, another day of fun. Today and yesterday I've been busting out keywords and the course objective. I've come across a lot of topics that have been incredibly informative. In my use of linked lists for my project I've needed an understanding of everything from pointers to memory allocation to structs. I currently have a successful programs that implements different functions that manipulate linked lists. The only function that I do not currently have working is the copy() function that is supposed to make an exact copy of a given linked list. I'm working on it now and will hopefully have a solution by the end of the month. I am also working on error handling in my functions.

September 30th, 2011

Down to the wire! Today in data structs we started to learn about stacks. Wikipedia says this about stacks “In computer science, a stack is a last in, first out (LIFO) abstract data type and data structure. A stack can have any abstract data type as an element, but is characterized by only three fundamental operations: push, pop and stack top. The push operation adds a new item to the top of the stack, or initializes the stack if it is empty. If the stack is full and does not contain enough space to accept the given item, the stack is then considered to be in an overflow state. The pop operation removes an item from the top of the stack. A pop either reveals previously concealed items, or results in an empty stack, but if the stack is empty then it goes into underflow state (It means no items are present in stack to be removed). The stack top operation gets the data from the top-most position and returns it to the user without deleting it. The same underflow state can also occur in stack top operation if stack is empty.”

To use stacks, you need to have a good understanding of linked lists. We are gonna be writing our own push and pop functions for our own stack implemtation which I am in the process of writing. As we get more into stacks I'm really looking forward to some of the projects I will be able to explore.

DATA Topics

Version Control

Version control allows you to take a snapshot of an instance of a file. This system allows you to record incremental revisions of files over a period of time. This can become very useful when working on a project and being able to review your progress at different stages.

The repository is where all of these file snapshots are stored. The difference between the repository and a typical file server is that the repository remembers each snapshot of the files. When you want to look up a file, you specify the file and the date and you can view that copy of the file.

By continually saving you work to the repository, you can save a lot of heartache when you accidentally delete/overwrite your file. If you have been faithful to updating your repository you can quickly pull up your most recent copy with minimal data loss.

The use of the repository on lab46 is quite simple. You can set up a folder for version control herehttp://lab46.corning-cc.edu/haas/spring 2011/unix/cs/cs0. After you have that completely working you can use it like this.

lab46:~/src/data$ svn add example.c
A         example.c
lab46:~/src/data$ svn commit -m "Submitted example.c"
Adding         data/example.c
Transmitting file data .....
Committed revision 40.
lab46:~/src/data$

The use of this resource is incredibly valuable!

References: http://svnbook.red-bean.com/nightly/en/svn.basic.version-control-basics.html

Pointers

A pointer references(or points to) a location in memory where value is held. When you access that data value it is called dereferencing the pointer. The following code gives an example of a simple integer pointer.

  1 #include<stdio.h>
  2
  3 int main()
  4 {
  5    int number = 10;//int variable number
  6    int *pointer = &number;//int pointer pointing to the address of number
  7    printf("number is: %d\n",number);
  8    printf("number is stored at 0x%x\n", &number);
  9    printf("pointer points to 0x%x\n", pointer);
 10    printf("The value of pointer is: %d\n", *pointer);//dereferencing a pointer
 11
 12    return 0;
 13 }

Execution of code

lab46:~/src/data$ ./pointer
number is: 10
number is stored at 0x594780d4
pointer points to 0x594780d4
The value of pointer is: 10
lab46:~/src/data$

arrays, pointer arithmetic

Pointers can not only point to single variables, but they can also point to elements of an array. The following code shows an array of size 10 and displays the memory address of each elements. The next part shows a declaration of an integer pointer and then allocates space for ten integer values. It then displays the memory addresses by using pointer arithmetic. The arithmetic is simply manipulating the memory address by adding or subtracting. In this case, we added 1 to the pointer to access the next elements location.

#include<stdio.h>
#include<stdlib.h>
int main()
{
   int array1[10];//static allocation of an array of 10 elements
   int *array2;//declaration of an integer pointer
   array2 = (int*)  malloc(sizeof(int)*10);//dynamically allocating memory for 10 integers
   int i,j = 0;
 
   for( i = 0; i < 10; i++)//This loop displays the address of the first element of the array
   {                       //and then increments i by one to move to the next location
      printf("Element %d is stored at 0x%x\n", i , &array1[i]);
   }
   printf("\n");
   for( i = 0; i < 10; i++)//This next loop uses pointer arithmetic to display the memory addresses
   {
      printf("Element %d is stored at 0x%x\n", i , array2);
      array2++;
   }
 
   return 0;
}

The following is the execution of the above code.

lab46:~/src/data$ ./example
Element 0 is stored at 0x2b5e9a0
Element 1 is stored at 0x2b5e9a4
Element 2 is stored at 0x2b5e9a8
Element 3 is stored at 0x2b5e9ac
Element 4 is stored at 0x2b5e9b0
Element 5 is stored at 0x2b5e9b4
Element 6 is stored at 0x2b5e9b8
Element 7 is stored at 0x2b5e9bc
Element 8 is stored at 0x2b5e9c0
Element 9 is stored at 0x2b5e9c4

Element 0 is stored at 0xfa4010
Element 1 is stored at 0xfa4014
Element 2 is stored at 0xfa4018
Element 3 is stored at 0xfa401c
Element 4 is stored at 0xfa4020
Element 5 is stored at 0xfa4024
Element 6 is stored at 0xfa4028
Element 7 is stored at 0xfa402c
Element 8 is stored at 0xfa4030
Element 9 is stored at 0xfa4034
lab46:~/src/data$

References: The C Programming Language , Brian Kernighan

pointers to pointers

Since a pointer is just a variable, it is possible to have a pointer pointing to another pointer. This can be useful when trying to return a pointer from a function along with many other things.

The following demonstrates how the the pointer and the pointer to the pointer point to the same point :) confused yet? just watch the execution

#include<iostream>
using namespace std;
int main()
{
   int **num1;//declaration of a double pointer
   int *num2;//declaration of a single pointer
   int num3 = 0;//setting value of num3 to 0
   num2 = &num3;//assigning num3's address to num2
   num1 = &num2;//assigning the address of the single pointer to the double pointer
 
   //now **num1, *num2 and num3 all = 0
   cout << "**num1 = " << **num1 << endl;
   cout << "*num2 = " << *num2 << endl;
   cout << "num3 = " << num3 << endl;
   return 0;
}

References: The C Programming Language , Brian Kernighan

Null pointer

“A null pointer has a value reserved for indicating that the pointer does not refer to a valid object. Null pointers are routinely used to represent conditions such as the end of a list of unknown length or the failure to perform some action; this use of null pointers can be compared to nullable types, null values in relational databases and to the Nothing value in an option type.” (Wikipedia)

Wikipedia does a much better job at explaining what a null pointer is so I will let you read about it there.

References: http://en.wikipedia.org/wiki/Pointer_%28computing%29#Null_pointer

Void pointers

Unlike integer pointers or character pointers, void pointer are generic. This means that they can store a memory address of any data type. In C the void pointer can be changed to any other pointer type upon assignment, but must be explicitly cast if dereferenced inline. Wikipedia has a nice example of this in the code here.

int x = 4;
void* q = &x;
int* p = q;  /* void* implicitly converted to int*: valid C, but not C++ */
int i = *p;
int j = *(int*)q; /* when dereferencing inline, there is no implicit conversion */

NOTE: C++ does not allow the implicit conversion from void* to other pointer types.

References: http://en.wikipedia.org/wiki/Pointer_%28computing%29#Support_in_various_programming_languages

Function pointers

Function pointers are pointers that point to function. I really have never used them before so my knowledge on them is limited, but from what I've gathered they do have some usefulness.

A normal function declaration execpting two int variables and returning an int,

int exampleFunc(int x, int y)

A pointer to function that returns an integer,

int (*exampleFunc)(int x, int y)

I also have read that functions can not be declared inside of a struct. But if you use function pointers, you can still have the same functionality by dereferencing the function pointer.

Static allocation vs. Dynamic allocation

Static memory allocation is the process of allocating memory at compile time. This is the first type of memory allocation you are taught. Dynamic memory allocation on the other hand is allocated at runtime. This memory exists until it is released manually. Dynamic Memory allocation has the ability to be a lot more efficient than static allocation.

Memory allocation (malloc(), new)

As stated previously,memory allocation can generally be done two ways, statically or dynamically. The first way is seen here.

int number;//allocates 4 bytes for the int variable number
char ch;//allocates two bytes for char variable ch
double db;//allocates 8 bytes for double variable db

Alternatively, the following code uses dynamic memory allocation to do the same thing. The C way.

int *number = NULL;
char *ch = NULL;
double *db =NULL;
 
number = malloc(sizeof(int));
ch = malloc(sizeof(char));
db = malloc(sizeof(double));

The C++ way

int *number;
char *ch;
double *db;
 
number = new int;
ch = new char;
db = new double;

The memory for these variables aren't allocated until runtime. Make sure you deallcate your memory though otherwise the FBI might show up at your front door wanting to ask you some questions.

Memory De-allocation (free(), delete)

When you dynamically allocate memory in a program you must remember to deallocate it once you are done. Using the code examples above deallocating the memory goes as follows.

The C way

free(number);
free(ch);
free(db);

The C++ way

delete number;
delete ch;
delete db;

Failure to delete memory can result in memory leaks. Lets just say you don't want this to happen. Don't dynamically allocate and forget to free memory. Only you can prevent memory leaks.

structures(structs/records)

A struct is a collectiong/grouping of variables accessed by a name. The difference between a struct and a class is that a struct can hold variables of different types. The following is a struct definition and declarationthat I used for a C++ program I wrote a while ago.

struct studentType
{
    string studentFName;
    string studentLName;
    int testScore;
    char grade;
};
studentType student;//struct variable declaration

When you declare a struct variable, enough memory is allocated to hold information for the 4 variables on the inside.

If you want to access a specific member of a struct you can do as follows.

//StructVariableName.membername
//in our example. to print the students last name you write this line
cout << "The students last name is " << student.studentLName << endl;

Another incredibly useful function of structs is in linked lists. Check out the linked list keyword for more on this awesome function.

Structure pointers

If you think of struct as just another data type(a very complex data type) then it would make sense to be able to have pointers to them. To help explain this a little better I will use a function that I have written for my project.

#include "linkedlist.h"
#include <stdlib.h>
 
struct node {
   char value;
   struct node *next;
   struct node *prev;
};
typedef struct node Node;
 
void append(Node** end, char input)
{
      Node* tmp;
      tmp = *end;
 
      tmp = (Node *) malloc (sizeof(Node));
      (*end) -> next = tmp; // tack new node onto end of list
      tmp -> prev =(*end); // new node points to current end
      (*end) = (*end) -> next; // advance end to new end
      (*end) -> value = input; // put input in node
      (*end) -> next = NULL; // nothing beyond end
}

In this function we have a struct definition that contains two struct pointers( *next and *prev). These two pointer allow us to point to the next struct/node in our linked list. We then see in the function append the declaration of a struct pointer in *tmp. Function pointers can be used for many things. In upcoming keywords you will see their use in more detail.

DATA Objectives

Write a program that uses linked lists

This course objective is to successfully write a program implementing the use of linked lists.

Method

To do this successfully, I will have to understand the basic concepts of a linked list. This requires a knowledge of pointers, structs, struct pointers, memory allocation and memory deallocation. Success will be measured by the completion of a program that implements these elements efficiently.

Measurement

The keywords that I have entered at the top of this page show that I have successfully researched the concepts required to make a working linked list. Besides having a knowledge of the concepts, the only other measure of success would be to have a program implementing these concepts. You can find such program here → Project: Doubly Linked List Library Implementation

Analysis

Upon completion of this exercise, I believe I have successfully completed the course objective of creating a program that implements a linked list.

  1. I believe could improve my functions so that they could be more flexible. For example, I currently only allow for each nodes value to be a character. I could change a few things around and allow for each node to hold more information.
  2. Since the measurement process is based upon having a knowledge of the concepts required to build a linked list and then to implement them in a program, I don't see to many ways to improve that using the current measurement system. You either have a working program or not.
  3. If you altered the course objective( Write a program that uses linked lists) to be used in a specific application such as storing employee information for a company or something like that, I can see where that could be a little more applicable. But all that aside, if you have an understanding of linked lists you can use that knowledge anywhere whatever the application.

Experiments

Experiment 1

Question

We have been using structs to build our linked lists. The question that I am posing is how does the system allocate memory for the structs we create?

Resources

Squirrel

Matt Haas

http://en.wikipedia.org/wiki/Pointer_%28computing%29

Hypothesis

I believe that the size of the struct will be the sum of its components. For example, if the struct contains a char(1 Byte) and an int(4 Bytes), the system would allocate 5 bytes of memory for that struct.

Experiment

To test this hypothesis, I will write a struct definition and a program to that uses sizeof() to output the size of the struct definition.

Data

Original

struct node1 {
   char value;//1 byte
 
};
struct node2 {
   char value;//1 byte
   int number;//4 bytes
};
typedef struct node1 Node1;
typedef struct node2 Node2;
#include<stdio.h>
int main()
{
 
   char v1 = 0;
   int v2 = 0;
 
   printf("char is %d bytes\n", sizeof(v1));
   printf("int is %d bytes\n", sizeof(v2));
   printf("node1 is %d bytes\n", sizeof(Node1));
   printf("node2 is %d bytes\n", sizeof(Node2));
 
   return 0;                                                                                                                                                  
}

Execution of the code

lab46:~/src/data$ ./sizelab46:~/src/data$ ./size
char is 1 bytes
int is 4 bytes
node1 is 1 bytes
node2 is 8 bytes
lab46:~/src/data$

The first struct contains a single character and takes up one byte in memory. The second node contains a char and an int(5 bytes) yet it takes up 8 bytes in memory.

Let's try this again but with some struct pointers.

struct node1 {
   char value;
   struct node1 *tmp1;
};
struct node2 {
   char value;
   int number;
   struct node2 *tmp1;
   struct node2 *tmp2;
   struct node2 *tmp3;
};
typedef struct node1 Node1;
typedef struct node2 Node2;

The code above now declares a struct pointer in the first struct and two struct pointers in the second. With some simple math you can surmise that the first structs components add up to 9 bytes and the second structs components add up to 29 bytes. Let's see how the system will allocate memory.

lab46:~/src/data$ ./size
char is 1 bytes
int is 4 bytes
node1 is 16 bytes
node2 is 32 bytes
lab46:~/src/data$

Although the first structs components add up to 9 bytes the system allocates 16 bytes of memory for the struct. Also, the second structs components add up to 29 bytes, the system allocates 32 bytes.

Analysis

Based on the data collected:

Conclusions

After performing this experiment, I can conclude that when the system allocates memory for a struct it will pad the memory to the nearest multiple of 8 bytes. According to my sources, this has to do with system performance and optimization. I believe this experiment performed on a different system may not always provide the exact same result.

Experiment 2

Question

This is a gonna be a short experiment but since we've been dealing with linked lists so much I thought this might be helpful. How much more memory does a linked list of integers take up than an array.

Resources

The C Programming Language , Brian Kernighan

Hypothesis

I think the linked list will take up substantially more memory than the array.

Experiment

To test his I will write a short program creating an array of 10000 integers. I will also create a struct that contains a int variable. I will find the size of that and then multiply the value by 10000 and then compare the two results.

Data

This is a short program that will examine the memory allocating of two different data structures.

#include<stdio.h>
struct node1 {
   int value;
   struct node1 *tmp1;
};
typedef struct node1 Node1;
int main()
{
   int array[10000];
   int i;
   for (i = 0; i < 10000; i++)
   {
      array[i] = i;
   }
 
   printf("node1 is %d bytes\n", (sizeof(Node1)*10000));
   printf("array is %d bytes\n", (sizeof(array[0])*10000));
 
   return 0;
}

Execution of code.

lab46:~/src/data$ ./size
node1 is 160000 bytes
array is 40000 bytes
lab46:~/src/data$

Analysis

Based on the data collected:

Conclusions

The purpose of this experiment was to show that when going for efficiency, using a array could be more efficient in some cases. The linked list took up four times the memory than the array. The downside is that an array must be allocated statically where as a linked list can be allocated dynamically.

Experiment 3

Question

How much memory is a user on lab46 allowed to use.

Resources

http://en.wikipedia.org/wiki/Malloc

Hypothesis

Since dynamic memory is allocated at runtime, the program should compile fine. But once the program is run, the system will eventually run out of memory to allocate. I really haven't the slightest idea how much memory each user can have.

Experiment

I will create an infinite loop with 10 malloc statements in it and have a counter in the loop. The body of the function will allocate memory of size int(4 bytes) ten times per iteration. I will then multiply the final result by 4 to see the total memory allowance per user.

Data

Test Code.

#include<stdio.h>
#include<stdlib.h>
 
int main()
{
   int x = 0;
   int a,b,c,d,e,f,g,h,i,j;
   int iterations = 0;
   while (x == 0)
   {
 
      a = (int)malloc(sizeof(int));
      b = (int)malloc(sizeof(int));
      c = (int)malloc(sizeof(int));
      d = (int)malloc(sizeof(int));
      e = (int)malloc(sizeof(int));
      f = (int)malloc(sizeof(int));
      g = (int)malloc(sizeof(int));
      h = (int)malloc(sizeof(int));
      i = (int)malloc(sizeof(int));
      j = (int)malloc(sizeof(int));
      iterations = iterations + 10;
      printf("iterations: %d\n",iterations);
   }
   return 0;
}

Execution

lab46:~$./malloc
iterations: 1
.
.
.
.
.
.
iterations: 38220610
iterations: 38220620
iterations: 38220630
iterations: 38220640
iterations: 38220650
iterations: 38220660
iterations: 38220670
Killed
lab46:~/src/data$

I ran the command ps during the runtime of this program and got this under memory usage for .malloc

lab46:~$ ps
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
dschoeff  1152  0.0  0.0  13660    12 pts/7    SNs  Sep16   0:00 /bin/bash
dschoeff  1155  0.0  0.1  42544  1428 pts/7    SN+  Sep16   1:59 irssi
dschoeff  3547  0.0  0.0  13632   864 pts/47   SNs  21:42   0:00 -bash
dschoeff 19715  0.0  0.1  13624  1488 pts/45   SNs  23:43   0:00 -bash
dschoeff 20561  9.0 80.0 1054040 836352 pts/47 SN+  23:49   0:32 ./malloc
dschoeff 21328  0.0  0.0   8584  1028 pts/45   RN+  23:55   0:00 ps u
lab46:~$ 

Notice the 80% memory usage

Analysis

Based on the data collected:

Conclusions

Based upon the final number of iterations, I can conclude that each user is allowed around 150 megabytes of memory( being that 1 megabyte is approx 1,000,000 bytes). It will be interesting to find out how close this number is to the actuall number or if I am completely off :)