Table of Contents
Part 1
Entries
September 5th, 2011
Wow! I'm a little late with getting started with my opus but I have started now and I will soon get all caught up. I'm actually entering this addition to my opus on September 19th but I did this work earlier this month. The first thing I did for this class was join the class irc. I first had to start a screen session
lab46:~$ screen
After that I had to connect to the server
lab46:~$ irssi [(status)] /server irc
I then had to join the Data Structures channel
[(status)]/join #data
Bam! Success… I can now happily discuss a numerous amount of topics ranging from linked lists to zombies with my fellow classmates. Can life get better? I submit that it cannot!
September 18th, 2011
After checking my class the class irc I discovered that I completely forgot to sign up to the class mailing list. I promptly fixed this by sending a request to join the class mailing list herehttp://lab46.corning-cc.edu/mailman/listinfo/data. After receiving the email I replied with the confirmation code to complete this task.
September 27th, 2011
Another day of programming, another day of fun. Today and yesterday I've been busting out keywords and the course objective. I've come across a lot of topics that have been incredibly informative. In my use of linked lists for my project I've needed an understanding of everything from pointers to memory allocation to structs. I currently have a successful programs that implements different functions that manipulate linked lists. The only function that I do not currently have working is the copy() function that is supposed to make an exact copy of a given linked list. I'm working on it now and will hopefully have a solution by the end of the month. I am also working on error handling in my functions.
September 30th, 2011
Down to the wire! Today in data structs we started to learn about stacks. Wikipedia says this about stacks “In computer science, a stack is a last in, first out (LIFO) abstract data type and data structure. A stack can have any abstract data type as an element, but is characterized by only three fundamental operations: push, pop and stack top. The push operation adds a new item to the top of the stack, or initializes the stack if it is empty. If the stack is full and does not contain enough space to accept the given item, the stack is then considered to be in an overflow state. The pop operation removes an item from the top of the stack. A pop either reveals previously concealed items, or results in an empty stack, but if the stack is empty then it goes into underflow state (It means no items are present in stack to be removed). The stack top operation gets the data from the top-most position and returns it to the user without deleting it. The same underflow state can also occur in stack top operation if stack is empty.”
To use stacks, you need to have a good understanding of linked lists. We are gonna be writing our own push and pop functions for our own stack implemtation which I am in the process of writing. As we get more into stacks I'm really looking forward to some of the projects I will be able to explore.
DATA Topics
Version Control
Version control allows you to take a snapshot of an instance of a file. This system allows you to record incremental revisions of files over a period of time. This can become very useful when working on a project and being able to review your progress at different stages.
The repository is where all of these file snapshots are stored. The difference between the repository and a typical file server is that the repository remembers each snapshot of the files. When you want to look up a file, you specify the file and the date and you can view that copy of the file.
By continually saving you work to the repository, you can save a lot of heartache when you accidentally delete/overwrite your file. If you have been faithful to updating your repository you can quickly pull up your most recent copy with minimal data loss.
The use of the repository on lab46 is quite simple. You can set up a folder for version control herehttp://lab46.corning-cc.edu/haas/spring 2011/unix/cs/cs0. After you have that completely working you can use it like this.
lab46:~/src/data$ svn add example.c A example.c lab46:~/src/data$ svn commit -m "Submitted example.c" Adding data/example.c Transmitting file data ..... Committed revision 40. lab46:~/src/data$
The use of this resource is incredibly valuable!
References: http://svnbook.red-bean.com/nightly/en/svn.basic.version-control-basics.html
Pointers
A pointer references(or points to) a location in memory where value is held. When you access that data value it is called dereferencing the pointer. The following code gives an example of a simple integer pointer.
1 #include<stdio.h> 2 3 int main() 4 { 5 int number = 10;//int variable number 6 int *pointer = &number;//int pointer pointing to the address of number 7 printf("number is: %d\n",number); 8 printf("number is stored at 0x%x\n", &number); 9 printf("pointer points to 0x%x\n", pointer); 10 printf("The value of pointer is: %d\n", *pointer);//dereferencing a pointer 11 12 return 0; 13 }
Execution of code
lab46:~/src/data$ ./pointer number is: 10 number is stored at 0x594780d4 pointer points to 0x594780d4 The value of pointer is: 10 lab46:~/src/data$
arrays, pointer arithmetic
Pointers can not only point to single variables, but they can also point to elements of an array. The following code shows an array of size 10 and displays the memory address of each elements. The next part shows a declaration of an integer pointer and then allocates space for ten integer values. It then displays the memory addresses by using pointer arithmetic. The arithmetic is simply manipulating the memory address by adding or subtracting. In this case, we added 1 to the pointer to access the next elements location.
#include<stdio.h> #include<stdlib.h> int main() { int array1[10];//static allocation of an array of 10 elements int *array2;//declaration of an integer pointer array2 = (int*) malloc(sizeof(int)*10);//dynamically allocating memory for 10 integers int i,j = 0; for( i = 0; i < 10; i++)//This loop displays the address of the first element of the array { //and then increments i by one to move to the next location printf("Element %d is stored at 0x%x\n", i , &array1[i]); } printf("\n"); for( i = 0; i < 10; i++)//This next loop uses pointer arithmetic to display the memory addresses { printf("Element %d is stored at 0x%x\n", i , array2); array2++; } return 0; }
The following is the execution of the above code.
lab46:~/src/data$ ./example Element 0 is stored at 0x2b5e9a0 Element 1 is stored at 0x2b5e9a4 Element 2 is stored at 0x2b5e9a8 Element 3 is stored at 0x2b5e9ac Element 4 is stored at 0x2b5e9b0 Element 5 is stored at 0x2b5e9b4 Element 6 is stored at 0x2b5e9b8 Element 7 is stored at 0x2b5e9bc Element 8 is stored at 0x2b5e9c0 Element 9 is stored at 0x2b5e9c4 Element 0 is stored at 0xfa4010 Element 1 is stored at 0xfa4014 Element 2 is stored at 0xfa4018 Element 3 is stored at 0xfa401c Element 4 is stored at 0xfa4020 Element 5 is stored at 0xfa4024 Element 6 is stored at 0xfa4028 Element 7 is stored at 0xfa402c Element 8 is stored at 0xfa4030 Element 9 is stored at 0xfa4034 lab46:~/src/data$
References: The C Programming Language , Brian Kernighan
pointers to pointers
Since a pointer is just a variable, it is possible to have a pointer pointing to another pointer. This can be useful when trying to return a pointer from a function along with many other things.
The following demonstrates how the the pointer and the pointer to the pointer point to the same point :) confused yet? just watch the execution
#include<iostream> using namespace std; int main() { int **num1;//declaration of a double pointer int *num2;//declaration of a single pointer int num3 = 0;//setting value of num3 to 0 num2 = &num3;//assigning num3's address to num2 num1 = &num2;//assigning the address of the single pointer to the double pointer //now **num1, *num2 and num3 all = 0 cout << "**num1 = " << **num1 << endl; cout << "*num2 = " << *num2 << endl; cout << "num3 = " << num3 << endl; return 0; }
References: The C Programming Language , Brian Kernighan
Null pointer
“A null pointer has a value reserved for indicating that the pointer does not refer to a valid object. Null pointers are routinely used to represent conditions such as the end of a list of unknown length or the failure to perform some action; this use of null pointers can be compared to nullable types, null values in relational databases and to the Nothing value in an option type.” (Wikipedia)
Wikipedia does a much better job at explaining what a null pointer is so I will let you read about it there.
References: http://en.wikipedia.org/wiki/Pointer_%28computing%29#Null_pointer
Void pointers
Unlike integer pointers or character pointers, void pointer are generic. This means that they can store a memory address of any data type. In C the void pointer can be changed to any other pointer type upon assignment, but must be explicitly cast if dereferenced inline. Wikipedia has a nice example of this in the code here.
int x = 4; void* q = &x; int* p = q; /* void* implicitly converted to int*: valid C, but not C++ */ int i = *p; int j = *(int*)q; /* when dereferencing inline, there is no implicit conversion */
NOTE: C++ does not allow the implicit conversion from void* to other pointer types.
References: http://en.wikipedia.org/wiki/Pointer_%28computing%29#Support_in_various_programming_languages
Function pointers
Function pointers are pointers that point to function. I really have never used them before so my knowledge on them is limited, but from what I've gathered they do have some usefulness.
A normal function declaration execpting two int variables and returning an int,
int exampleFunc(int x, int y)
A pointer to function that returns an integer,
int (*exampleFunc)(int x, int y)
I also have read that functions can not be declared inside of a struct. But if you use function pointers, you can still have the same functionality by dereferencing the function pointer.
Static allocation vs. Dynamic allocation
Static memory allocation is the process of allocating memory at compile time. This is the first type of memory allocation you are taught. Dynamic memory allocation on the other hand is allocated at runtime. This memory exists until it is released manually. Dynamic Memory allocation has the ability to be a lot more efficient than static allocation.
Memory allocation (malloc(), new)
As stated previously,memory allocation can generally be done two ways, statically or dynamically. The first way is seen here.
int number;//allocates 4 bytes for the int variable number char ch;//allocates two bytes for char variable ch double db;//allocates 8 bytes for double variable db
Alternatively, the following code uses dynamic memory allocation to do the same thing. The C way.
int *number = NULL; char *ch = NULL; double *db =NULL; number = malloc(sizeof(int)); ch = malloc(sizeof(char)); db = malloc(sizeof(double));
The C++ way
int *number; char *ch; double *db; number = new int; ch = new char; db = new double;
The memory for these variables aren't allocated until runtime. Make sure you deallcate your memory though otherwise the FBI might show up at your front door wanting to ask you some questions.
Memory De-allocation (free(), delete)
When you dynamically allocate memory in a program you must remember to deallocate it once you are done. Using the code examples above deallocating the memory goes as follows.
The C way
free(number); free(ch); free(db);
The C++ way
delete number; delete ch; delete db;
Failure to delete memory can result in memory leaks. Lets just say you don't want this to happen. Don't dynamically allocate and forget to free memory. Only you can prevent memory leaks.
structures(structs/records)
A struct is a collectiong/grouping of variables accessed by a name. The difference between a struct and a class is that a struct can hold variables of different types. The following is a struct definition and declarationthat I used for a C++ program I wrote a while ago.
struct studentType { string studentFName; string studentLName; int testScore; char grade; }; studentType student;//struct variable declaration
When you declare a struct variable, enough memory is allocated to hold information for the 4 variables on the inside.
If you want to access a specific member of a struct you can do as follows.
//StructVariableName.membername //in our example. to print the students last name you write this line cout << "The students last name is " << student.studentLName << endl;
Another incredibly useful function of structs is in linked lists. Check out the linked list keyword for more on this awesome function.
Structure pointers
If you think of struct as just another data type(a very complex data type) then it would make sense to be able to have pointers to them. To help explain this a little better I will use a function that I have written for my project.
#include "linkedlist.h" #include <stdlib.h> struct node { char value; struct node *next; struct node *prev; }; typedef struct node Node; void append(Node** end, char input) { Node* tmp; tmp = *end; tmp = (Node *) malloc (sizeof(Node)); (*end) -> next = tmp; // tack new node onto end of list tmp -> prev =(*end); // new node points to current end (*end) = (*end) -> next; // advance end to new end (*end) -> value = input; // put input in node (*end) -> next = NULL; // nothing beyond end }
In this function we have a struct definition that contains two struct pointers( *next and *prev). These two pointer allow us to point to the next struct/node in our linked list. We then see in the function append the declaration of a struct pointer in *tmp. Function pointers can be used for many things. In upcoming keywords you will see their use in more detail.
DATA Objectives
Write a program that uses linked lists
This course objective is to successfully write a program implementing the use of linked lists.
Method
To do this successfully, I will have to understand the basic concepts of a linked list. This requires a knowledge of pointers, structs, struct pointers, memory allocation and memory deallocation. Success will be measured by the completion of a program that implements these elements efficiently.
Measurement
The keywords that I have entered at the top of this page show that I have successfully researched the concepts required to make a working linked list. Besides having a knowledge of the concepts, the only other measure of success would be to have a program implementing these concepts. You can find such program here → Project: Doubly Linked List Library Implementation
Analysis
Upon completion of this exercise, I believe I have successfully completed the course objective of creating a program that implements a linked list.
- I believe could improve my functions so that they could be more flexible. For example, I currently only allow for each nodes value to be a character. I could change a few things around and allow for each node to hold more information.
- Since the measurement process is based upon having a knowledge of the concepts required to build a linked list and then to implement them in a program, I don't see to many ways to improve that using the current measurement system. You either have a working program or not.
- If you altered the course objective( Write a program that uses linked lists) to be used in a specific application such as storing employee information for a company or something like that, I can see where that could be a little more applicable. But all that aside, if you have an understanding of linked lists you can use that knowledge anywhere whatever the application.
Experiments
Experiment 1
Question
We have been using structs to build our linked lists. The question that I am posing is how does the system allocate memory for the structs we create?
Resources
Hypothesis
I believe that the size of the struct will be the sum of its components. For example, if the struct contains a char(1 Byte) and an int(4 Bytes), the system would allocate 5 bytes of memory for that struct.
Experiment
To test this hypothesis, I will write a struct definition and a program to that uses sizeof() to output the size of the struct definition.
Data
Original
struct node1 { char value;//1 byte }; struct node2 { char value;//1 byte int number;//4 bytes }; typedef struct node1 Node1; typedef struct node2 Node2;
#include<stdio.h> int main() { char v1 = 0; int v2 = 0; printf("char is %d bytes\n", sizeof(v1)); printf("int is %d bytes\n", sizeof(v2)); printf("node1 is %d bytes\n", sizeof(Node1)); printf("node2 is %d bytes\n", sizeof(Node2)); return 0; }
Execution of the code
lab46:~/src/data$ ./sizelab46:~/src/data$ ./size char is 1 bytes int is 4 bytes node1 is 1 bytes node2 is 8 bytes lab46:~/src/data$
The first struct contains a single character and takes up one byte in memory. The second node contains a char and an int(5 bytes) yet it takes up 8 bytes in memory.
Let's try this again but with some struct pointers.
struct node1 { char value; struct node1 *tmp1; }; struct node2 { char value; int number; struct node2 *tmp1; struct node2 *tmp2; struct node2 *tmp3; }; typedef struct node1 Node1; typedef struct node2 Node2;
The code above now declares a struct pointer in the first struct and two struct pointers in the second. With some simple math you can surmise that the first structs components add up to 9 bytes and the second structs components add up to 29 bytes. Let's see how the system will allocate memory.
lab46:~/src/data$ ./size char is 1 bytes int is 4 bytes node1 is 16 bytes node2 is 32 bytes lab46:~/src/data$
Although the first structs components add up to 9 bytes the system allocates 16 bytes of memory for the struct. Also, the second structs components add up to 29 bytes, the system allocates 32 bytes.
Analysis
Based on the data collected:
- My original hypothesis stated that the size of the struct would be the sum of its components. This was not completely accurate.
- The hypothesis was correct in assuming that the size of struct was related to the sum of its components but the memory allocated was padded to the next multiple of 8 bytes.
- Since I performed this experiment on lab46(64 bit), this experiment might have different results if run a different system(32 bit possibly). This all depends on how it would allocate the memory.
Conclusions
After performing this experiment, I can conclude that when the system allocates memory for a struct it will pad the memory to the nearest multiple of 8 bytes. According to my sources, this has to do with system performance and optimization. I believe this experiment performed on a different system may not always provide the exact same result.
Experiment 2
Question
This is a gonna be a short experiment but since we've been dealing with linked lists so much I thought this might be helpful. How much more memory does a linked list of integers take up than an array.
Resources
The C Programming Language , Brian Kernighan
Hypothesis
I think the linked list will take up substantially more memory than the array.
Experiment
To test his I will write a short program creating an array of 10000 integers. I will also create a struct that contains a int variable. I will find the size of that and then multiply the value by 10000 and then compare the two results.
Data
This is a short program that will examine the memory allocating of two different data structures.
#include<stdio.h> struct node1 { int value; struct node1 *tmp1; }; typedef struct node1 Node1; int main() { int array[10000]; int i; for (i = 0; i < 10000; i++) { array[i] = i; } printf("node1 is %d bytes\n", (sizeof(Node1)*10000)); printf("array is %d bytes\n", (sizeof(array[0])*10000)); return 0; }
Execution of code.
lab46:~/src/data$ ./size node1 is 160000 bytes array is 40000 bytes lab46:~/src/data$
Analysis
Based on the data collected:
- My hypothesis was correct. The Linked list took up substantially more memory then the array.
- The experiment was pretty straight forward and to the point.
Conclusions
The purpose of this experiment was to show that when going for efficiency, using a array could be more efficient in some cases. The linked list took up four times the memory than the array. The downside is that an array must be allocated statically where as a linked list can be allocated dynamically.
Experiment 3
Question
How much memory is a user on lab46 allowed to use.
Resources
Hypothesis
Since dynamic memory is allocated at runtime, the program should compile fine. But once the program is run, the system will eventually run out of memory to allocate. I really haven't the slightest idea how much memory each user can have.
Experiment
I will create an infinite loop with 10 malloc statements in it and have a counter in the loop. The body of the function will allocate memory of size int(4 bytes) ten times per iteration. I will then multiply the final result by 4 to see the total memory allowance per user.
Data
Test Code.
#include<stdio.h> #include<stdlib.h> int main() { int x = 0; int a,b,c,d,e,f,g,h,i,j; int iterations = 0; while (x == 0) { a = (int)malloc(sizeof(int)); b = (int)malloc(sizeof(int)); c = (int)malloc(sizeof(int)); d = (int)malloc(sizeof(int)); e = (int)malloc(sizeof(int)); f = (int)malloc(sizeof(int)); g = (int)malloc(sizeof(int)); h = (int)malloc(sizeof(int)); i = (int)malloc(sizeof(int)); j = (int)malloc(sizeof(int)); iterations = iterations + 10; printf("iterations: %d\n",iterations); } return 0; }
Execution
lab46:~$./malloc iterations: 1 . . . . . . iterations: 38220610 iterations: 38220620 iterations: 38220630 iterations: 38220640 iterations: 38220650 iterations: 38220660 iterations: 38220670 Killed lab46:~/src/data$
I ran the command ps during the runtime of this program and got this under memory usage for .malloc
lab46:~$ ps USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND dschoeff 1152 0.0 0.0 13660 12 pts/7 SNs Sep16 0:00 /bin/bash dschoeff 1155 0.0 0.1 42544 1428 pts/7 SN+ Sep16 1:59 irssi dschoeff 3547 0.0 0.0 13632 864 pts/47 SNs 21:42 0:00 -bash dschoeff 19715 0.0 0.1 13624 1488 pts/45 SNs 23:43 0:00 -bash dschoeff 20561 9.0 80.0 1054040 836352 pts/47 SN+ 23:49 0:32 ./malloc dschoeff 21328 0.0 0.0 8584 1028 pts/45 RN+ 23:55 0:00 ps u lab46:~$
Notice the 80% memory usage
Analysis
Based on the data collected:
- My hypothesis was correct. The system finally ran out of memory.
- is there more going on than you originally thought? When I was running the program and running the ps command in another terminal, I noticed at first a steady rise in memory usage. After a while though. I noticed memory usage go down a bit before it began to rise again. This seemed a little stange.
- what shortcomings might there be in your experiment? I'm sure there may have been an easier way to find out how much memory each user is allowed( maybe ask Matt) but this seemed so much more entertaining.
Conclusions
Based upon the final number of iterations, I can conclude that each user is allowed around 150 megabytes of memory( being that 1 megabyte is approx 1,000,000 bytes). It will be interesting to find out how close this number is to the actuall number or if I am completely off :)