Dave Schoeffler's Fall 2011 Opus
:)
Hey all. My name is Dave Schoeffler but most people call me by my nickname Sheffy. I am a Computer Science major at Corning Community College. This is my 5th semester at CCC and my 4th as a full-time student. This is my second to last semester here at Corning Community College, and upon completion I will be transferring to a yet to be determined 4 year school. My plan is to get a Bachelor's in Computer Science with a minor in Math and possibly go on to get a Master's in Computer Science, but we will see how long I can put up with school :).
I've enjoyed my experience at Corning so far, mainly because of my awesome teachers. Last semester I took UNIX/Linux with Matt and really enjoyed learning a new operating system. I look forward to using that knowledge to be a better programmer. I currently have my own lawn mowing business to pay the bills, but I can't wait to be able to get a computer-based job. I am not sure what job field I would like to get into yet, but I know I want to work with computers.
What else can I tell you about myself? Well, I play guitar and like most any sport (except for soccer, because it's not really a sport). I'll be turning 19 pretty soon and am really looking forward to this class. I'm looking forward to expanding my current knowledge of C/C++ to be able to write some wicked cool programs.
Wow! I'm a little late getting started with my opus, but I have started now and I will soon get all caught up. I'm actually entering this addition to my opus on September 19th, but I did this work earlier this month. The first thing I did for this class was join the class IRC channel. I first had to start a screen session
lab46:~$ screen
After that I had to connect to the server
lab46:~$ irssi
[(status)] /server irc
I then had to join the Data Structures channel
[(status)] /join #data
Bam! Success… I can now happily discuss numerous topics ranging from linked lists to zombies with my fellow classmates. Can life get better? I submit that it cannot!
After checking out the class IRC channel, I discovered that I completely forgot to sign up for the class mailing list. I promptly fixed this by sending a request to join the class mailing list here: http://lab46.corning-cc.edu/mailman/listinfo/data. After receiving the email I replied with the confirmation code to complete this task.
Another day of programming, another day of fun. Today and yesterday I've been busting out keywords and the course objective. I've come across a lot of topics that have been incredibly informative. In my use of linked lists for my project I've needed an understanding of everything from pointers to memory allocation to structs. I currently have a successful program that implements different functions that manipulate linked lists. The only function that I do not currently have working is the copy() function that is supposed to make an exact copy of a given linked list. I'm working on it now and will hopefully have a solution by the end of the month. I am also working on error handling in my functions.
Down to the wire! Today in data structs we started to learn about stacks. Wikipedia says this about stacks: “In computer science, a stack is a last in, first out (LIFO) abstract data type and data structure. A stack can have any abstract data type as an element, but is characterized by only three fundamental operations: push, pop and stack top. The push operation adds a new item to the top of the stack, or initializes the stack if it is empty. If the stack is full and does not contain enough space to accept the given item, the stack is then considered to be in an overflow state. The pop operation removes an item from the top of the stack. A pop either reveals previously concealed items, or results in an empty stack, but if the stack is empty then it goes into underflow state (It means no items are present in stack to be removed). The stack top operation gets the data from the top-most position and returns it to the user without deleting it. The same underflow state can also occur in stack top operation if stack is empty.”
To use stacks, you need to have a good understanding of linked lists. We are going to be writing our own push and pop functions for a stack implementation, which I am in the process of writing. As we get more into stacks I'm really looking forward to some of the projects I will be able to explore.
Version control allows you to take a snapshot of an instance of a file. This system allows you to record incremental revisions of files over a period of time. This can become very useful when working on a project and being able to review your progress at different stages.
The repository is where all of these file snapshots are stored. The difference between the repository and a typical file server is that the repository remembers each snapshot of the files. When you want to look up a file, you specify the file and the date and you can view that copy of the file.
By continually saving your work to the repository, you can save a lot of heartache when you accidentally delete or overwrite your file. If you have been faithful about updating your repository, you can quickly pull up your most recent copy with minimal data loss.
The use of the repository on lab46 is quite simple. You can set up a folder for version control here: http://lab46.corning-cc.edu/haas/spring 2011/unix/cs/cs0. After you have that completely working you can use it like this.
lab46:~/src/data$ svn add example.c
A example.c
lab46:~/src/data$ svn commit -m "Submitted example.c"
Adding data/example.c
Transmitting file data .....
Committed revision 40.
lab46:~/src/data$
The use of this resource is incredibly valuable!
References: http://svnbook.red-bean.com/nightly/en/svn.basic.version-control-basics.html
A pointer references (or points to) a location in memory where a value is held. When you access that data value it is called dereferencing the pointer. The following code gives an example of a simple integer pointer.
#include<stdio.h>

int main()
{
    int number = 10;            //int variable number
    int *pointer = &number;     //int pointer pointing to the address of number
    printf("number is: %d\n", number);
    printf("number is stored at 0x%x\n", &number);
    printf("pointer points to 0x%x\n", pointer);
    printf("The value of pointer is: %d\n", *pointer);  //dereferencing a pointer

    return 0;
}
Execution of code
lab46:~/src/data$ ./pointer
number is: 10
number is stored at 0x594780d4
pointer points to 0x594780d4
The value of pointer is: 10
lab46:~/src/data$
Pointers can not only point to single variables, but they can also point to elements of an array. The following code shows an array of size 10 and displays the memory address of each element. The next part shows the declaration of an integer pointer and then allocates space for ten integer values. It then displays the memory addresses by using pointer arithmetic. The arithmetic is simply manipulating the memory address by adding or subtracting. In this case, we add 1 to the pointer to access the next element's location.
#include<stdio.h>
#include<stdlib.h>

int main()
{
    int array1[10];     //static allocation of an array of 10 elements
    int *array2;        //declaration of an integer pointer
    array2 = (int*) malloc(sizeof(int)*10);    //dynamically allocating memory for 10 integers
    int i;

    for( i = 0; i < 10; i++)    //This loop displays the address of the first element of the array
    {                           //and then increments i by one to move to the next location
        printf("Element %d is stored at 0x%x\n", i, &array1[i]);
    }
    printf("\n");
    for( i = 0; i < 10; i++)    //This next loop uses pointer arithmetic to display the memory addresses
    {
        printf("Element %d is stored at 0x%x\n", i, array2);
        array2++;
    }
    return 0;
}
The following is the execution of the above code.
lab46:~/src/data$ ./example
Element 0 is stored at 0x2b5e9a0
Element 1 is stored at 0x2b5e9a4
Element 2 is stored at 0x2b5e9a8
Element 3 is stored at 0x2b5e9ac
Element 4 is stored at 0x2b5e9b0
Element 5 is stored at 0x2b5e9b4
Element 6 is stored at 0x2b5e9b8
Element 7 is stored at 0x2b5e9bc
Element 8 is stored at 0x2b5e9c0
Element 9 is stored at 0x2b5e9c4

Element 0 is stored at 0xfa4010
Element 1 is stored at 0xfa4014
Element 2 is stored at 0xfa4018
Element 3 is stored at 0xfa401c
Element 4 is stored at 0xfa4020
Element 5 is stored at 0xfa4024
Element 6 is stored at 0xfa4028
Element 7 is stored at 0xfa402c
Element 8 is stored at 0xfa4030
Element 9 is stored at 0xfa4034
lab46:~/src/data$
References: The C Programming Language, Brian Kernighan and Dennis Ritchie
Since a pointer is just a variable, it is possible to have a pointer pointing to another pointer. This can be useful when trying to return a pointer from a function along with many other things.
The following demonstrates how the pointer and the pointer to the pointer both end up referring to the same value :) Confused yet? Just walk through the code.
#include<iostream>
using namespace std;

int main()
{
    int **num1;     //declaration of a double pointer
    int *num2;      //declaration of a single pointer
    int num3 = 0;   //setting value of num3 to 0
    num2 = &num3;   //assigning num3's address to num2
    num1 = &num2;   //assigning the address of the single pointer to the double pointer
    //now **num1, *num2 and num3 all = 0
    cout << "**num1 = " << **num1 << endl;
    cout << "*num2 = " << *num2 << endl;
    cout << "num3 = " << num3 << endl;

    return 0;
}
References: The C Programming Language, Brian Kernighan and Dennis Ritchie
“A null pointer has a value reserved for indicating that the pointer does not refer to a valid object. Null pointers are routinely used to represent conditions such as the end of a list of unknown length or the failure to perform some action; this use of null pointers can be compared to nullable types, null values in relational databases and to the Nothing value in an option type.” (Wikipedia)
Wikipedia does a much better job at explaining what a null pointer is so I will let you read about it there.
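That said, here is a quick sketch of my own (not from Wikipedia or my project code) showing the most common use I run into: a null pointer marking the end of a linked list, so a traversal loop knows when to stop.

#include <stdio.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

int main()
{
    /* build a two-node list by hand */
    struct node *second = malloc(sizeof(struct node));
    second -> value = 2;
    second -> next = NULL;          /* NULL marks the end of the list */

    struct node *first = malloc(sizeof(struct node));
    first -> value = 1;
    first -> next = second;

    /* walk the list until we hit the null pointer */
    struct node *tmp = first;
    while (tmp != NULL)
    {
        printf("value = %d\n", tmp -> value);
        tmp = tmp -> next;
    }

    free(second);
    free(first);
    return 0;
}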
References: http://en.wikipedia.org/wiki/Pointer_%28computing%29#Null_pointer
Unlike integer pointers or character pointers, void pointers are generic. This means that they can store a memory address of any data type. In C the void pointer can be changed to any other pointer type upon assignment, but must be explicitly cast if dereferenced inline. Wikipedia has a nice example of this in the code below.
int x = 4;
void* q = &x;
int* p = q;         /* void* implicitly converted to int*: valid C, but not C++ */
int i = *p;
int j = *(int*)q;   /* when dereferencing inline, there is no implicit conversion */
NOTE: C++ does not allow the implicit conversion from void* to other pointer types.
References: http://en.wikipedia.org/wiki/Pointer_%28computing%29#Support_in_various_programming_languages
Function pointers are pointers that point to functions. I really have never used them before, so my knowledge of them is limited, but from what I've gathered they do have some usefulness.
A normal function declaration accepting two int parameters and returning an int:
int exampleFunc(int x, int y)
A pointer to a function that takes two ints and returns an int:
int (*exampleFunc)(int x, int y)
I also have read that functions cannot be declared inside of a struct. But if you use function pointers, you can still get the same functionality by dereferencing the function pointer, as shown in the sketch below.
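Here is a small sketch I put together (my own example, not from any of my projects) showing a function pointer being assigned and called, including one stored inside a struct:

#include <stdio.h>

int add(int x, int y)
{
    return x + y;
}

/* a function can't live inside a struct, but a function pointer can */
struct calculator {
    int (*operation)(int x, int y);
};

int main()
{
    int (*exampleFunc)(int x, int y) = add;     /* point at the function */

    printf("direct call: %d\n", add(2, 3));
    printf("through the pointer: %d\n", exampleFunc(2, 3));

    struct calculator calc;
    calc.operation = add;                       /* store the function pointer in the struct */
    printf("through the struct member: %d\n", calc.operation(4, 5));

    return 0;
}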
Static memory allocation is the process of allocating memory at compile time. This is the first type of memory allocation you are taught. Dynamic memory allocation, on the other hand, happens at runtime. This memory exists until it is released manually. Dynamic memory allocation can be a lot more efficient than static allocation because you only ask for what you need, when you need it.
As stated previously, memory allocation can generally be done two ways, statically or dynamically. The first way is seen here.
int number;     //allocates 4 bytes for the int variable number
char ch;        //allocates 1 byte for the char variable ch
double db;      //allocates 8 bytes for the double variable db
Alternatively, the following code uses dynamic memory allocation to do the same thing. The C way.
int *number = NULL;
char *ch = NULL;
double *db = NULL;

number = malloc(sizeof(int));
ch = malloc(sizeof(char));
db = malloc(sizeof(double));
The C++ way
int *number;
char *ch;
double *db;

number = new int;
ch = new char;
db = new double;
The memory for these variables isn't allocated until runtime. Make sure you deallocate your memory though, otherwise the FBI might show up at your front door wanting to ask you some questions.
When you dynamically allocate memory in a program you must remember to deallocate it once you are done. Using the code examples above deallocating the memory goes as follows.
The C way
free(number);
free(ch);
free(db);
The C++ way
delete number;
delete ch;
delete db;
Failure to free memory can result in memory leaks. Let's just say you don't want this to happen. Don't dynamically allocate and then forget to free memory. Only you can prevent memory leaks.
A struct is a collection/grouping of variables accessed by a single name. Unlike an array, a struct can hold variables of different types. The following is a struct definition and declaration that I used for a C++ program I wrote a while ago.
struct studentType
{
    string studentFName;
    string studentLName;
    int testScore;
    char grade;
};

studentType student;    //struct variable declaration
When you declare a struct variable, enough memory is allocated to hold information for the four member variables inside it.
If you want to access a specific member of a struct you can do as follows.
//structVariableName.memberName
//in our example, to print the student's last name you write this line
cout << "The students last name is " << student.studentLName << endl;
Another incredibly useful application of structs is in linked lists. Check out the linked list keyword for more on this awesome use.
If you think of a struct as just another data type (a very complex data type), then it makes sense to be able to have pointers to them. To help explain this a little better I will use a function that I have written for my project.
#include "linkedlist.h" #include <stdlib.h> struct node { char value; struct node *next; struct node *prev; }; typedef struct node Node; void append(Node** end, char input) { Node* tmp; tmp = *end; tmp = (Node *) malloc (sizeof(Node)); (*end) -> next = tmp; // tack new node onto end of list tmp -> prev =(*end); // new node points to current end (*end) = (*end) -> next; // advance end to new end (*end) -> value = input; // put input in node (*end) -> next = NULL; // nothing beyond end }
In this function we have a struct definition that contains two struct pointers (*next and *prev). These two pointers allow us to point to the next and previous struct/node in our linked list. We then see in the function append() the declaration of a struct pointer, *tmp. Struct pointers can be used for many things. In upcoming keywords you will see their use in more detail.
This course objective is to successfully write a program implementing the use of linked lists.
To do this successfully, I will have to understand the basic concepts of a linked list. This requires a knowledge of pointers, structs, struct pointers, memory allocation and memory deallocation. Success will be measured by the completion of a program that implements these elements efficiently.
The keywords that I have entered at the top of this page show that I have successfully researched the concepts required to make a working linked list. Besides having a knowledge of the concepts, the only other measure of success would be to have a program implementing these concepts. You can find such program here → Project: Doubly Linked List Library Implementation
Upon completion of this exercise, I believe I have successfully completed the course objective of creating a program that implements a linked list.
We have been using structs to build our linked lists. The question that I am posing is how does the system allocate memory for the structs we create?
I believe that the size of the struct will be the sum of its components. For example, if the struct contains a char(1 Byte) and an int(4 Bytes), the system would allocate 5 bytes of memory for that struct.
To test this hypothesis, I will write a struct definition and a program that uses sizeof() to output the size of each struct.
Original
struct node1 {
    char value;     //1 byte
};

struct node2 {
    char value;     //1 byte
    int number;     //4 bytes
};

typedef struct node1 Node1;
typedef struct node2 Node2;
#include<stdio.h>

int main()
{
    char v1 = 0;
    int v2 = 0;
    printf("char is %d bytes\n", sizeof(v1));
    printf("int is %d bytes\n", sizeof(v2));
    printf("node1 is %d bytes\n", sizeof(Node1));
    printf("node2 is %d bytes\n", sizeof(Node2));
    return 0;
}
Execution of the code
lab46:~/src/data$ ./size
char is 1 bytes
int is 4 bytes
node1 is 1 bytes
node2 is 8 bytes
lab46:~/src/data$
The first struct contains a single character and takes up one byte in memory. The second struct contains a char and an int (5 bytes total), yet it takes up 8 bytes in memory.
Let's try this again but with some struct pointers.
struct node1 {
    char value;
    struct node1 *tmp1;
};

struct node2 {
    char value;
    int number;
    struct node2 *tmp1;
    struct node2 *tmp2;
    struct node2 *tmp3;
};

typedef struct node1 Node1;
typedef struct node2 Node2;
The code above now declares a struct pointer in the first struct and three struct pointers in the second. With some simple math you can surmise that the first struct's components add up to 9 bytes and the second struct's components add up to 29 bytes. Let's see how the system will allocate memory.
lab46:~/src/data$ ./size
char is 1 bytes
int is 4 bytes
node1 is 16 bytes
node2 is 32 bytes
lab46:~/src/data$
Although the first struct's components add up to 9 bytes, the system allocates 16 bytes of memory for the struct. Likewise, the second struct's components add up to 29 bytes, yet the system allocates 32 bytes.
Based on the data collected:
After performing this experiment, I can conclude that when the system allocates memory for a struct it will pad the memory to the nearest multiple of 8 bytes. According to my sources, this has to do with system performance and optimization. I believe this experiment performed on a different system may not always provide the exact same result.
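As a follow-up sketch to the experiment above, you can actually see where the padding ends up by using offsetof() from stddef.h. This is my own example (the exact offsets could differ on another system), using the second struct from the experiment:

#include <stdio.h>
#include <stddef.h>

struct node2 {
    char value;
    int number;
    struct node2 *tmp1;
};

int main()
{
    /* offsetof() reports how far into the struct each member starts */
    printf("value  starts at offset %zu\n", offsetof(struct node2, value));
    printf("number starts at offset %zu\n", offsetof(struct node2, number));
    printf("tmp1   starts at offset %zu\n", offsetof(struct node2, tmp1));
    printf("total size is %zu bytes\n", sizeof(struct node2));
    return 0;
}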
This is gonna be a short experiment, but since we've been dealing with linked lists so much I thought this might be helpful. How much more memory does a linked list of integers take up than an array?
The C Programming Language, Brian Kernighan and Dennis Ritchie
I think the linked list will take up substantially more memory than the array.
To test this I will write a short program creating an array of 10000 integers. I will also create a struct that contains an int variable and a struct pointer. I will find the size of that struct, multiply the value by 10000 and then compare the two results.
This is a short program that will examine the memory allocation of two different data structures.
#include<stdio.h>

struct node1 {
    int value;
    struct node1 *tmp1;
};
typedef struct node1 Node1;

int main()
{
    int array[10000];
    int i;
    for (i = 0; i < 10000; i++)
    {
        array[i] = i;
    }
    printf("node1 is %d bytes\n", (sizeof(Node1)*10000));
    printf("array is %d bytes\n", (sizeof(array[0])*10000));
    return 0;
}
Execution of code.
lab46:~/src/data$ ./size
node1 is 160000 bytes
array is 40000 bytes
lab46:~/src/data$
Based on the data collected:
The purpose of this experiment was to show that when going for efficiency, using an array can be better in some cases. The linked list took up four times the memory of the array. The downside is that a plain array has a fixed size once it is allocated, whereas a linked list can grow and shrink dynamically.
How much memory is a user on lab46 allowed to use?
Since dynamic memory is allocated at runtime, the program should compile fine. But once the program is run, the system will eventually run out of memory to allocate. I really haven't the slightest idea how much memory each user can have.
I will create an infinite loop with 10 malloc statements in it and a counter in the loop. The body of the loop will allocate memory for an int (4 bytes) ten times per iteration. I will then multiply the final count by 4 to estimate the total memory allowance per user.
Test Code.
#include<stdio.h>
#include<stdlib.h>

int main()
{
    int x = 0;
    int *a, *b, *c, *d, *e, *f, *g, *h, *i, *j;
    int iterations = 0;

    while (x == 0)
    {
        a = malloc(sizeof(int));
        b = malloc(sizeof(int));
        c = malloc(sizeof(int));
        d = malloc(sizeof(int));
        e = malloc(sizeof(int));
        f = malloc(sizeof(int));
        g = malloc(sizeof(int));
        h = malloc(sizeof(int));
        i = malloc(sizeof(int));
        j = malloc(sizeof(int));
        iterations = iterations + 10;
        printf("iterations: %d\n", iterations);
    }
    return 0;
}
Execution
lab46:~$ ./malloc
iterations: 1
...
iterations: 38220610
iterations: 38220620
iterations: 38220630
iterations: 38220640
iterations: 38220650
iterations: 38220660
iterations: 38220670
Killed
lab46:~/src/data$
I ran the command ps during the runtime of this program and got this under memory usage for ./malloc
lab46:~$ ps
USER       PID %CPU %MEM     VSZ    RSS TTY    STAT START  TIME COMMAND
dschoeff  1152  0.0  0.0   13660     12 pts/7  SNs  Sep16  0:00 /bin/bash
dschoeff  1155  0.0  0.1   42544   1428 pts/7  SN+  Sep16  1:59 irssi
dschoeff  3547  0.0  0.0   13632    864 pts/47 SNs  21:42  0:00 -bash
dschoeff 19715  0.0  0.1   13624   1488 pts/45 SNs  23:43  0:00 -bash
dschoeff 20561  9.0 80.0 1054040 836352 pts/47 SN+  23:49  0:32 ./malloc
dschoeff 21328  0.0  0.0    8584   1028 pts/45 RN+  23:55  0:00 ps u
lab46:~$
Notice the 80% memory usage
Based on the data collected:
Based upon the final number of iterations, I can conclude that each user is allowed around 150 megabytes of memory (given that 1 megabyte is approximately 1,000,000 bytes). It will be interesting to find out how close this number is to the actual limit, or if I am completely off :)
So I'm a little late making my first entry for this month (okay, a lot late), but I have been working on stuff for data structures. What I have done so far this month is complete my doubly linked list library implementation project. I had a few hiccups with that, but I was able to work through them and move on. Since then I have started two more projects. One of them deals with a stack implementation and the other deals with a queue implementation. The stack project has given me a few small problems, but I am currently working through them and will hopefully get that working soon. I've really been enjoying the stuff that I've been learning and hope to get an even better understanding as the weeks go by.
I'm backdating this entry because I forgot to add it on Monday. Today I worked on my stack project. I was able to work through the problems I was having and got two out of the three main functions working. It ended up being a stupid compiling mistake that I made, and now I have them working. I have also been studying up on the queue library implementation and hope to have that nearly complete by the end of the month. I'm really enjoying these projects and finding it fun to be able to do a lot of the exploring on my own.
SNOW!!!! It snowed today up here on the hill and that was majorly depressing. Nothing like a full day of homework and then snow to brighten up your day. Despite being depressed, I was able to make really good progress on my keywords today. I was able to define stacks and all the operations that go along with them. I find that having to write down what you know about the keywords really makes you think about how they actually function. I feel like I have a better understanding of them now that I have written them down. I also found out that the project proposal that I had for my stack library didn't save and all the work I did was for nothing. Oh well, I guess I'll have to write it up again.
So far today I have finally written up my project proposal for Matt and got that sent to him. I did it a couple days ago but for some reason it didn't save and I lost all my work. I am also currently working on writing up my queue library implementation and will send that to Matt soon as well.
And I learned to always have one of Karl's nuts in your hand at all times!
Wikipedia defines a linked list as this: “a linked list is a data structure consisting of a group of nodes which together represent a sequence. Under the simplest form, each node is composed of a datum and a reference (in other words, a link) to the next node in the sequence; more complex variants add additional links. This structure allows for efficient insertion or removal of elements from any position in the sequence.” Each “node” of a linked list is a struct containing some data value along with a struct pointer to the next node.
You may ask yourself, “why would I use a linked list to store data when I could just use an array?”. This is a very good question. The advantages of using a linked list to store data are as follows.
The insertion of data into a specific location in an array would require a lot of work. Deletion is easy but causes vacant spots in memory and hinders optimal performance. Also, when wanting to add data to an array, you would need to reallocate memory, which is an expensive operation. You do not have any of these problems with a linked list. All you have to do is manipulate a few pointers and bam! You are set.
Below is an example of the definition of a node structure, the creation of a singly linked list and the displaying of the list's contents. The credit for this code goes to Matthew Haas.
#include<stdio.h>
#include<stdlib.h>

struct node {
    int value;
    struct node *next;
};
typedef struct node Node;

int main()
{
    Node *start, *tmp;
    int input, i = 0;   // i must start at 0 for the display loop below

    start = tmp = NULL;
    printf("Enter a value -1 to exit: ");
    scanf("%d", &input);
    while( input != -1)
    {
        if(start == NULL)
        {
            start = (Node *) malloc(sizeof(Node));
            start -> next = NULL;
            tmp = start;
            start -> value = input;
        }
        else
        {
            tmp -> next = (Node *) malloc(sizeof(Node));
            tmp = tmp -> next;
            tmp -> next = NULL;
            tmp -> value = input;
        }
        printf("Enter a value -1 to exit: ");
        scanf("%d", &input);
    }

    printf("The list contains the following elements\n");
    tmp = start;
    while (tmp != NULL)
    {
        printf("[%d]value = %d\n", i, tmp -> value);
        tmp = tmp -> next;
        i++;
    }
    return 0;
}
lab46:~/src/data$ ./linkedlist
Enter a value -1 to exit: 1
Enter a value -1 to exit: 2
Enter a value -1 to exit: 3
Enter a value -1 to exit: 4
Enter a value -1 to exit: 5
Enter a value -1 to exit: 6
Enter a value -1 to exit: -1
The list contains the following elements
[0]value = 1
[1]value = 2
[2]value = 3
[3]value = 4
[4]value = 5
[5]value = 6
lab46:~/src/data$
References http://en.wikipedia.org/wiki/Linked_list#Linked_lists_vs._dynamic_arrays
The only difference between a singly linked list and a doubly linked list is that in a doubly linked list, the node structure also has a pointer to the previous node. This slight difference, though, gives a lot more functionality to the list. Now you are able to traverse the list in either direction, whereas in a singly linked list you had to start at the beginning. This can be useful in many applications.
The code below shows the declaration of a node that has two struct pointers
struct node {
    int value;
    struct node *next;
    struct node *prev;
};
typedef struct node Node;
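To show off that extra functionality, here is a quick sketch of my own (not code from my project) that builds a small doubly linked list and then walks it backward from the end using the prev pointers:

#include <stdio.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
    struct node *prev;
};
typedef struct node Node;

int main()
{
    Node *start = NULL, *end = NULL, *tmp;
    int i;

    /* build a list holding 1 through 5 */
    for (i = 1; i <= 5; i++)
    {
        tmp = (Node *) malloc(sizeof(Node));
        tmp -> value = i;
        tmp -> next = NULL;
        tmp -> prev = end;
        if (start == NULL)
            start = tmp;            /* first node in the list */
        else
            end -> next = tmp;      /* attach to the old end */
        end = tmp;
    }

    /* walk backward from the end using the prev pointers */
    for (tmp = end; tmp != NULL; tmp = tmp -> prev)
        printf("%d\n", tmp -> value);

    /* clean up */
    while (start != NULL)
    {
        tmp = start -> next;
        free(start);
        start = tmp;
    }
    return 0;
}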
A stack is a last in, first out data structure. The type of stacks we covered in class were linked list based. There are also array based stacks, but they have their own set of limitations. A stack is simply a linked list that can only be accessed from the top. This is known as a last in, first out (LIFO) data structure (more on that later).
The following struct “stack” contains the data needed to implement a stack. Each Stack has node pointers pointing to the top and bottom of the stack. It also contains a node pointer tmp and an int variable count, which can be used for various other needs.
struct stack {
    int count;
    struct node *top;
    struct node *data;
    struct node *tmp;
};
typedef struct stack Stack;
The creation of a stack is also very similar to that of a linked list. The creation of a stack is typically done by the push() function, which is explained below.
Pushing is the operation that is used to add a node to a stack. The function is typically called push() and it allows you to push a node on top of the stack. There is no other way to add a value to a stack but by using this function. The push() function should also be able to initialize a stack if there is not a stack to add the value to. Finally, the push() function should also check for stack overflow. More about that later.
The following code demonstrates a push function.
/***********************************************************
 * Function push() will create a stack if no stack currently
 * exists. If a stack does exist, push() will add a node to
 * the top of the stack. It will then return the updated
 * stack.
 *
 * Compile with: gcc -c push.c
 ***********************************************************/
#include "stack.h"
#include<stdio.h>
#include<stdlib.h>

Stack *push(Stack *start, char input)
{
    Node *tmp;
    if(start == NULL)   // if no stack, create one
    {
        start = (Stack *) malloc (sizeof(Stack));
        start -> data = malloc(sizeof(Node));
        start -> top = malloc(sizeof(Node));
        start -> tmp = malloc(sizeof(Node));
        start -> top -> value = input;
        start -> top -> next = NULL;
        start -> data = start -> top;
    }
    else                // if there is a stack, add node to top of it
    {
        tmp = (Node *) malloc (sizeof(Node));
        tmp -> value = input;
        tmp -> next = start -> top;
        start -> top = tmp;
    }
    return (start);
}
Popping is the opposite of pushing. Popping allows you to remove a node from a stack. It is important to realize that in a stack, you only have access to the topmost node. If you would like to remove a node that is not on the top, you have to remove all the nodes on top of the one you want to remove. That may seem stupid, but it is the nature of a stack. Typically the function used to pop nodes is called pop(). A good pop function will also check for stack underflow (I'll explain more about that later).
The following code demonstrates a pop() function.
/******************************************************************
 * Function pop() accepts as an argument a stack. It will then pop
 * the node that is currently on the top and make the node under
 * that the new top. It will then return the value from the
 * popped node as a character. If the stack that is passed is NULL
 * (meaning it does not exist), the function cannot pop a node that
 * doesn't exist. Therefore, the function will return the null
 * terminating character.
 *
 * Compile with: gcc -c pop.c
 ******************************************************************/
#include "stack.h"
#include <stdlib.h>

char pop(Stack *start)
{
    char chartmp;
    start -> tmp = start -> top;
    if(start -> tmp != start -> data)
    {
        start -> top = start -> top -> next;
        start -> tmp -> next = NULL;
        chartmp = start -> tmp -> value;
    }
    else if(start -> data == NULL)
    {
        chartmp = '\n';
    }
    else
    {
        chartmp = start -> top -> value;
        start -> top = start -> tmp = start -> data = NULL;
    }
    return(chartmp);
}
The top operation is a very simple one. The top() (or peek()) function allows you to view the contents of the node that is currently on top of the stack. That is it. It just returns whatever value is on top of the stack. It doesn't manipulate the nodes or change them in any way.
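My project doesn't break the top operation out into its own file, but a minimal peek() using the same Stack struct from above could look something like this (just a sketch, assuming the same stack.h header and that the stack was built by push()):

#include "stack.h"
#include <stdio.h>

/* peek() returns the value on top of the stack without removing it.
 * If the stack doesn't exist or is empty, it returns the null character. */
char peek(Stack *start)
{
    if (start == NULL || start -> top == NULL)
        return '\0';
    return (start -> top -> value);
}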
The following shows the execution of my stack implementation.
lab46:~/src/data/stackProject$ ./stack
Enter - 1 - to pop a Node off of the stack
Enter - 2 - to push a Node on the stack
Enter - 3 - to peek at the value on top of the stack
Enter - q - to quit
2
Enter value for new node: a
Enter - 1 - to pop a Node off of the stack
Enter - 2 - to push a Node on the stack
Enter - 3 - to peek at the value on top of the stack
Enter - q - to quit
3
The top node contains the value: a
Enter - 1 - to pop a Node off of the stack
Enter - 2 - to push a Node on the stack
Enter - 3 - to peek at the value on top of the stack
Enter - q - to quit
2
Enter value for new node: b
Enter - 1 - to pop a Node off of the stack
Enter - 2 - to push a Node on the stack
Enter - 3 - to peek at the value on top of the stack
Enter - q - to quit
3
The top node contains the value: b
Enter - 1 - to pop a Node off of the stack
Enter - 2 - to push a Node on the stack
Enter - 3 - to peek at the value on top of the stack
Enter - q - to quit
q
lab46:~/src/data/stackProject$
Notice that the most recent node pushed onto the stack becomes the top. If the top node is popped, then the node before that now becomes the top.
Stack overflow occurs when too much memory is used on the stack. Wikipedia states: “The size of the call stack depends on many factors, including the programming language, machine architecture, multi-threading, and amount of available memory. When a program attempts to use more space than is available on the call stack (that is, when it attempts to access memory beyond the call stack's bounds, which is essentially a buffer overflow), the stack is said to overflow, typically resulting in a program crash” (Wikipedia). Basically, stack overflow is not a good thing, and your push() function should be able to detect and handle that condition if it comes up. When using an array based stack, the maximum size of your stack is the size of the array. Let's say you have an array of 100 elements.
Example:
int testArray[100];

for(int i = 0; i < 200; i++)
{
    testArray[i] = i;
}
This loop tries to load 200 values into the 100-element array.
Execution of this for loop would result in an overflow because you would be writing to memory outside of the space set aside for the array. When you have a linked list based stack, overflow occurs when you run out of memory.
When you use the pop() function on a stack but the stack is empty, you go into what is called a state of underflow. This means that there is no value to pop. If this happens, zombies will come out of the computer monitor and try to eat your soul…. NOT GOOD! Seriously, a stack underflow can result in a runtime error in your program. Your pop() function should check to see if the stack is empty before trying to pop a node.
The following shows what a program should do when trying to pop a node off of a stack when there is no node to pop.
lab46:~/src/data/stackProject$ ./stack
Enter - 1 - to pop a Node off of the stack
Enter - 2 - to push a Node on the stack
Enter - 3 - to peek at the value on top of the stack
Enter - q - to quit
2
Enter value for new node: a
Enter - 1 - to pop a Node off of the stack
Enter - 2 - to push a Node on the stack
Enter - 3 - to peek at the value on top of the stack
Enter - q - to quit
1
The value that was popped is: a
Enter - 1 - to pop a Node off of the stack
Enter - 2 - to push a Node on the stack
Enter - 3 - to peek at the value on top of the stack
Enter - q - to quit
1
The value that was popped is:
Enter - 1 - to pop a Node off of the stack
Enter - 2 - to push a Node on the stack
Enter - 3 - to peek at the value on top of the stack
Enter - q - to quit
q
lab46:~/src/data/stackProject$
The value 'a' was pushed onto the stack and then popped off. This left the stack with no values. I then tried to run the pop() function again. Instead of getting a segmentation fault, the pop() function returned a null value indicating that there was no value to pop.
LIFO stands for “last in, first out”. This describes what we have just been talking about in stacks.
Here is a picture representing a stack.
The red sheet here represents the top of the stack. The red sheet is the last sheet added to the stack and will be the first one out if the pop() function were to be run. I hope this helps your understanding of a LIFO stack.
A queue (pronounced like the letter 'Q') is a FIFO data structure (more about FIFO later). It is basically a linked list that you can only join at the beginning and leave from the end. Think of a line at the Verizon store. You go into the store and reserve your spot in line. Doing this places you into a queue. You will not be able to leave the queue until all the people in front of you have been waited on. In the meantime, more people have been added to the queue behind you as well. Several hours later you will finally be able to leave the queue and be waited on. Like a linked list, queues are commonly made up of linked nodes. They are very similar to linked lists and have a number of useful applications.
Please refer to this link for the basic structure of a queue → http://lab46.corning-cc.edu/opus/fall2011/dschoeff/start?&#linked_lists
Like in a stack, only certain operations can be performed on a queue. The first one I will mention is enqueue. enqueue() is a function that allows you to add a node to the beginning of the queue. This is very similar to a prepend() function for a linked list. This operation takes a node and adds it to the beginning of a queue, therefore making that new node the first node in the queue. That node cannot be removed until it becomes the last node in the queue.
The following code demonstrates an enqueue operation.
//compile with: gcc -c enqueue.c
//
//Function enqueue() enqueues a node to the queue. If there is no queue in existence
//it will create one. This function is almost exactly like the insert function in my
//doubly linked library.
#include "queue.h"
#include <stdlib.h>

void enqueue(Node** start, Node** end, int index, char ch)
{
    int i;
    Node* tmp;
    Node* tmp2;
    tmp = *start;

    if (*start == NULL)     // if our queue is empty
    {
        (*start) = (Node *) malloc (sizeof(Node));
        (*end) = tmp = *(start);
        (*start) -> prev = NULL;
        (*end) -> prev = NULL;
        (*start) -> value = ch;
    }
    else                    // we are enqueuing a node
    {
        tmp2 = (Node *) malloc (sizeof(Node));
        tmp2 -> next = (*start);        // point new node's next at (*start)
        (*start) -> prev = tmp2;        // point (*start)'s prev at tmp2
        (*start) = (*start) -> prev;    // move (*start) to new node
        (*start) -> value = ch;         // put value in node
    }
}
Queues would be pretty useless if you could only add nodes to the beginning of the queue. This is where dequeuing comes in. dequeue() is a function that allows you to remove the node at the very end of the queue. This is the only way that you can remove a node from a queue. The dequeue() function is very similar to pop() for stacks and removeNode() for linked lists. They operate basically the same way.
The following demonstrates the dequeuing operation.
//compile with: gcc -c dequeue.c
//
// Function dequeue() removes a node from a given queue and frees the
// memory for that node
#include "linkedlist.h"
#include <stdlib.h>

char dequeue(Node** start, Node** end, int index)
{
    int i;
    char ch;
    Node* tmp;
    tmp = *start;

    if((*start) != NULL)
    {
        // Move tmp to the exact node we wish to delete
        for( i = 0; i < index; i++)
        {
            tmp = tmp -> next;
        }
        ch = tmp -> value;

        if ((tmp != *start) && (tmp != *end))
        {
            tmp -> next -> prev = tmp -> prev;  // The node after tmp points to the node before tmp
            tmp -> prev -> next = tmp -> next;  // Now, the node before points to the node after tmp
            tmp -> prev = NULL;
            tmp -> next = NULL;                 // tmp is now disconnected from the queue
            free (tmp);                         // deallocate the node tmp is pointing to
        }
        else if (*start == *end)                // queue is only a single node
        {
            free(tmp);
            (*start) = (*end) = NULL;
        }
        else if (tmp == *start)
        {
            tmp -> next -> prev = NULL;         // The node following tmp is no longer pointing to us
            (*start) = (*start) -> next;
            tmp -> next = NULL;                 // Disconnect tmp from the queue
            free(tmp);                          // deallocate the node tmp is pointing to
        }
        else                                    // tmp is equal to end
        {
            tmp -> prev -> next = NULL;         // The node preceding tmp is no longer pointing to us
            (*end) = (*end) -> prev;            // Adjust end to point to the new end of the queue
            tmp -> prev = NULL;                 // Disconnect tmp from the queue
            free(tmp);                          // deallocate the node tmp is pointing to
        }
    }
    else
    {
        ch = '\n';
    }
    return(ch);
}
This course objective is to successfully write a program implementing the use of stacks.
To do this successfully, I will have to understand the basic concepts of a stack. This requires a knowledge of pointers, structs, struct pointers, memory allocation and memory deallocation. It also requires a knowledge of linked lists. Success will be measured by the completion of a program that implements these elements efficiently.
The keywords that I have entered at the top of this page show that I have successfully researched the concepts required to make a working stack. Besides having a knowledge of the concepts, the only other measure of success would be to have a program implementing these concepts. You can find such program here → http://lab46.corning-cc.edu/user/dschoeff/portfolio/libstack
Upon completion of this exercise, I believe I have successfully completed the course objective of creating a program that implements a stack.
How does the system handle unexpected input, such as a character when an integer is expected?
I believe that when you enter a character when an integer is expected the program will output an error and terminate.
This will happen because the system does not know what to do with the character value.
I will test my hypothesis by creating a program that accepts an int value as input. I will then input a character value and see how the program runs.
Here is the code for my experiment.
#include<iostream>
using namespace std;

int main()
{
    int num;
    cout << "Enter integer: ";
    cin >> num;
    cout << " Int is " << num << endl;
    return 0;
}
Here is the execution
lab46:~/src/data$ ./inttest
Enter integer: 1
 Int is 1
lab46:~/src/data$ ./inttest
Enter integer: q
 Int is 0
lab46:~/src/data$
Based on the data collected:
I can conclude from this experiment that when using cin to get an integer value, and then a character value is entered, the program will still run without error. Though the program will run without error, the output you get will not be correct. I found that when entering a character value when an int value was expected, the value stored in the variable is 0. BE CAREFUL WITH YOUR INPUT!
How much time does the system take to count up to a billion, adding 1 at a time?
Matthew Haas; the man page for clock()
I think that it will take 1 minute to add up to a billion.
The computer is very powerful and can perform calculations at a crazy fast speed.
To test this, I will write a program that adds one to a variable until it reaches a billion. I will also include the time.h header file so I can measure how long it takes for the program to run.
The following program will allow me to test the runtime of the program.
#include<time.h>
#include<stdio.h>
#include<stdlib.h>

int main()
{
    unsigned long int i = 0;
    unsigned long int start, elapsed;

    start = clock();    // clock ticks when the loop starts
    while(i < 1000000000)
    {
        i++;
    }
    elapsed = (clock() - start) / CLOCKS_PER_SEC;
    printf("time in seconds = %lu\n", elapsed);
    return (0);
}
The execution of the program.
lab46:~/src/data$ ./clock
time in seconds = 4
lab46:~/src/data$
Based on the data collected:
After completing the experiment, I feel I have a better understanding of how fast the processor can handle tasks. This experiment also made me become familiar with using the time.h header file with its built-in function clock().
This semester I am taking Java, a higher level programming language than C, and have learned that with Java there is more “overhead”. The question I am posing is this. Does more overhead equate to slower running programs?
I believe that since Java has more overhead and is a higher level language than C, it will take longer to do the same task.
I will test my hypothesis by writing a program in both C and Java that does the same basic task. I will then compile and run these programs, recording how long it takes each one to run to completion. I will then compare the results with my original hypothesis.
I have written two programs that increment a double variable by .025 until it reaches one billion.
In C
int main()
{
    double i = 0;
    while(i < 1000000000)
    {
        i = i + .025;
    }
    return (0);
}
In Java
public class Cvsjava
{
    public static void main(String[] args)
    {
        double i = 0;
        while(i < 1000000000)
        {
            i = i + .025;
        }
    }
}
Execution of C code
lab46:~/src/data$ time ./clock

real    4m13.026s
user    4m12.956s
sys     0m0.036s
lab46:~/src/data$
Notice the total time is 4 minutes and 13 seconds
Execution of Java code
lab46:~/src/data$ time java Cvsjava

real    3m47.261s
user    3m47.230s
sys     0m0.076s
lab46:~/src/data$
Notice the total time is 3 minutes and 47 seconds.
Based on the data collected:
According to my findings, I think I need to do a lot more research on this topic. I find it very interesting that the Java program (written in a higher level language) seemed to run faster than the one written in C. I would also like to find out if other factors are affecting how the programs run (like the JVM or something like that). I would like to investigate this more and get more opinions as to what is going on behind the scenes.
As I continue to program in both languages I find pros and cons to both. I look forward to learning more about these two popular programming languages.
Happy November! This being my first official dated entry of the new month brings along great news! I have officially completed my stack library implementation and it is awesome. You should definitely check it out here → Project: Stack Library Implementation. I've learned so much doing this stack implementation. I ran into a lot of problems doing this partly because I didn't have a good understanding of stacks going into the project. This led me to research about stacks more and increase my knowledge of how they worked. I don't think I could've done this project without the use of the debugger gdb. It came in so handy when trying to find out why I was getting a segmentation fault.
I still want to increase my knowledge of struct pointers because having a better understanding of them will help lead to better code in the future. I see the power of linked list type data structures and can't wait to explore more of them.
Well today has been fairly productive. I successfully completed my queue library implementation. I also submitted all my files to the subversion repository. You can view my queue project here → http://lab46.corning-cc.edu/user/dschoeff/portfolio/libqueue. I was able to work through a few problems I was having with my dequeue() function, but I stopped by the lair today and Matt helped me out. My next thing I want to work on is my keywords and experiments. I hope to get these finished up soon.
Today I decided I would make some corrections to my opus. I had completed my opus on time and everything, but I didn't do that good of a job demonstrating some of the keywords I defined. So I went back and tried to better demonstrate them with code snippets and programs. I also included some command line execution of programs I had written. I think the keywords can be understood a lot better now with the new demonstrations.
So I'm backdating this entry because I forgot to do it on Friday.
Today was pretty cool. In Data Structures we started talking about various sorting algorithms. What we did was basically look on Wikipedia at the different sorting algorithms out there. Each type of sorting algorithm has its pros and cons, but they all basically do the same thing, which is to sort data. When choosing a sorting algorithm to use, you need to take into account your own specific implementation and what it will be used for.
After learning about these things in class Matt let us loose and had us write our own simple algorithm. I chose to write a bubble sort algorithm which is not the most efficient algorithm but is simple to understand. I plan on writing a few more sorting algorithms implementing different methods.
After writing this algorithm for a project, I decided to do some experiments on how efficient the algorithms really were. It's nice to see for yourself how efficient the algorithms are at sorting data rather than just taking someone else's word for it.
When referring to queues, an overrun condition occurs when the queue exceeds the space available to store data. In an array based queue, this would happen when trying to add a value to an array that is already full. Since I haven't dealt much with array based queue implementations, I will stick mainly to explaining linked list based implementations. With a linked list based queue, it seems to me that an overrun condition would be much harder to hit. When you think about linked lists, they do not have a fixed size. The only size limitation that the list has (unless the program designer were to arbitrarily set a size limit for the list) is how much memory is available to the system. Although the chance of overrun is somewhat unlikely, a function that enqueues a node to a queue should check for this condition and be flexible enough to handle that situation. I have done an experiment dealing with system memory management here → http://lab46.corning-cc.edu/opus/fall2011/dschoeff/start#experiment_3. This experiment explores dynamic memory allocation and how lab46 deals with a lot of memory usage.
The following is a snippet of my enqueue function. If the system that you were running your program on ran out of memory, a problem would occur at line 26 when you try to allocate memory for a new node when there is no more memory available.
26    tmp2 = (Node *) malloc (sizeof(Node));
27    tmp2 -> next = (*start);        // point new node's next at (*start)
28    (*start) -> prev = tmp2;        // point (*start)'s prev at tmp2
29    (*start) = (*start) -> prev;    // move (*start) to new node
30    (*start) -> value = ch;         // put value in node
Queue underrun occurs when you try to dequeue a node from a queue when the queue is already empty. When dealing with linked list based queues, underrun is much more likely to happen than overrun. In your dequeue function, you must check to see whether the queue is empty before trying to dequeue a node. If you do not you will likely get a segmentation fault.
char dequeue(Node** start, Node** end, int index)
{
    int i;
    char ch;
    Node* tmp;
    tmp = *start;

    if((*start) != NULL)
    {
        //STUFF
    }
    else
    {
        ch = '\n';
    }
    return(ch);
}
The first conditional statement makes sure the queue has a value before trying to dequeue a node. Instead of getting a segmentation fault the function returns the null terminating character indicating there is no node to dequeue. If this is in place you should not have a problem with queue underrun.
The term FIFO stands for “First In First Out”. This type of data structure refers to how data is added/removed from the data structure. A queue is a FIFO data structure because the first node added to a queue will be the first node removed.
The following picture represents how the FIFO queue works.
For more explanation of a queue refer to the queue keyword definition → http://lab46.corning-cc.edu/opus/fall2011/dschoeff/start#queue
Wikipedia states ,“Computational complexity theory is a branch of the theory of computation in theoretical computer science and mathematics that focuses on classifying computational problems according to their inherent difficulty, and relating those classes to each other. In this context, a computational problem is understood to be a task that is in principle amenable to being solved by a computer (which basically means that the problem can be stated by a set of mathematical instructions). Informally, a computational problem consists of problem instances and solutions to these problem instances. For example, primality testing is the problem of determining whether a given number is prime or not. The instances of this problem are natural numbers, and the solution to an instance is yes or no based on whether the number is prime or not.”.
From what I understand, computational complexity is the classification of a problem according to how difficult it is to solve. This difficulty is based upon the amount of resources needed to solve the problem. For example, the mathematical problem 2 + 2 is a lot less difficult to find a solution for than (13^2/78) * 10000 and would require fewer resources to solve.
References: http://en.wikipedia.org/wiki/Computational_complexity_theory
Big O notation is used to classify algorithms by how they respond to bigger and bigger input sizes. This notation is used in computer science when referring to the efficiency of algorithms. This is helpful when trying to analyze how an algorithm handles data in what are called the best case, worst case and average case.
When referring to a function using big O notation only the upper bound is given. This is true because when input sizes become larger, the other terms become less influential.
When you see Big O notation it will look something like this: O(n), O(n^2), O(nlogn), O(2^n)
Wikipedia states: “Big O notation is useful when analyzing algorithms for efficiency. For example, the time (or the number of steps) it takes to complete a problem of size n might be found to be T(n) = 4n^2 − 2n + 2. As n grows large, the n^2 term will come to dominate, so that all other terms can be neglected — for instance when n = 500, the term 4n^2 is 1000 times as large as the 2n term. Ignoring the latter would have negligible effect on the expression's value for most purposes.
Further, the coefficients become irrelevant if we compare to any other order of expression, such as an expression containing a term n^3 or n^4. Even if T(n) = 1,000,000n^2, if U(n) = n^3, the latter will always exceed the former once n grows larger than 1,000,000 (T(1,000,000) = 1,000,000^3 = U(1,000,000)). Additionally, the number of steps depends on the details of the machine model on which the algorithm runs, but different types of machines typically vary by only a constant factor in the number of steps needed to execute an algorithm.”
Big Theta notation is very similar to Big O notation. The only difference between these two notations is that when you use big theta notation to represent a function, the upper and lower bounds are of the same order.
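A small worked example of my own to show the difference: take T(n) = 3n^2 + 2n + 7. This function is Θ(n^2), because for large n it is squeezed between bounds of the same order — once n ≥ 4 we have 3n^2 ≤ 3n^2 + 2n + 7 ≤ 4n^2, so both the lower bound (3n^2) and the upper bound (4n^2) are of order n^2. Saying the same function is O(n^2) only claims the upper bound; in fact T(n) is also O(n^3), which is a true but much weaker statement.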
References: http://en.wikipedia.org/wiki/Big_O_notation
A sorting algorithm is an algorithm that sorts a given set of data. The most common types of sorting algorithms sort data in numerical or alphabetical order (also commonly known as lexicographical order). Sorting algorithms are very useful because having data in a particular order makes many other operations, like searching, much easier.
But all sorting algorithms were not created equal; there are many commonly used sorting algorithms out there, each having its own particular twist on how it sorts data.
A few common types of sorting algorithms are defined below. The big thing when choosing which algorithm to use is how it performs on data with varying degrees of randomness. Algorithms are typically described using Big O notation for the best case, worst case and average case scenarios. The best case is how the algorithm responds to data that is already in the desired order. For example, the Bubble Sort algorithm, which will be described later, has a best case performance of O(n), meaning it only needs a single pass (on the order of n comparisons) to finish. Its worst case performance (the data is completely out of order) is O(n^2), meaning it needs on the order of n^2 operations. The average case performance is also O(n^2).
Different algorithms perform differently to different types of data so it is best to know what you are trying to do before you choose a sorting algorithm.
The Selection Sort is a simple sorting algorithm. It works by taking a data structure (let's say an array) and traversing the entire list to find the minimum value (if you are sorting from smallest to largest). Once it has that, it swaps the elements at the first position and the position of the smallest element. At this point the smallest element is in the desired position. It then repeats this process starting from the second element, and so on until it gets to the last element.
The selection sort has a worst case performance O(n^2), a best case performance O(n^2) and an average case performance O(n^2).
Here is an example from Wikipedia showing the progression of this sort.
64 | 25 | 12 | 22 | 11 |
11 | 25 | 12 | 22 | 64 |
11 | 12 | 25 | 22 | 64 |
11 | 12 | 22 | 25 | 64 |
11 | 12 | 22 | 25 | 64 |
This is a particularly inefficient way to sort data. If you are not concerned about efficiency or do not have a large amount of data to sort, this would be good to use because of its simplicity.
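Here is a short selection sort I sketched out in C to go with the table above (my own example, sorting the same five values from the Wikipedia progression):

#include <stdio.h>

int main()
{
    int array[5] = {64, 25, 12, 22, 11};
    int i, j, min, tmp;

    for (i = 0; i < 5; i++)
    {
        /* find the smallest element in the unsorted part of the array */
        min = i;
        for (j = i + 1; j < 5; j++)
        {
            if (array[j] < array[min])
                min = j;
        }
        /* swap it into position i */
        tmp = array[i];
        array[i] = array[min];
        array[min] = tmp;
    }

    for (i = 0; i < 5; i++)
        printf("%d ", array[i]);
    printf("\n");

    return 0;
}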
References: http://en.wikipedia.org/wiki/Selection_sort
The Bubble Sort is a sorting algorithm that steps through a list of elements, comparing adjacent elements and swapping them if they are out of order. The logic behind this algorithm is quite simple.
The bubble sort has a worst case performance O(n^2), a best case performance O(n) and an average case performance O(n^2).
The following code shows an implementation of a bubble sort that allows the user to enter 10 integers and then sorts them in ascending order.
#include <stdio.h>
#include <stdlib.h>

int main()
{
    int *array;
    array = (int*) malloc(sizeof(int)*10);
    int i = 0, input = 0;
    int swapped = 1, tmp;

    //loads ten values from the user into the array
    for(i = 0; i < 10; i++)
    {
        printf("enter 10 different integers: ");
        scanf("%d", &input);
        *(array+i) = input;
    }

    //meat of the bubble sort
    while( swapped == 1)
    {
        swapped = 0;
        for(i = 1; i < 10; i++)
        {
            if(*(array+(i-1)) > *(array+i))     //this statement checks to see
            {                                   //if the elements are out of order
                tmp = *(array+i);
                *(array+i) = *(array+(i-1));
                *(array+(i-1)) = tmp;
                swapped = 1;
            }
        }
    }

    //displays the array
    for(i = 0; i < 10; i++)
    {
        printf("[%d]: %d\n", i, *(array+i));
    }
    return (0);
}
Execution of this code.
lab46:~/src/data/sorts$ ./bubble
enter 10 different integers: 5
enter 10 different integers: 2
enter 10 different integers: 8
enter 10 different integers: 6
enter 10 different integers: 1
enter 10 different integers: 0
enter 10 different integers: 7
enter 10 different integers: 3
enter 10 different integers: 6
enter 10 different integers: 9
[0]: 0
[1]: 1
[2]: 2
[3]: 3
[4]: 5
[5]: 6
[6]: 6
[7]: 7
[8]: 8
[9]: 9
lab46:~/src/data/sorts$
I have done an experiment on how the bubble sort handles differing sets of data. You should check it out here → http://lab46.corning-cc.edu/opus/fall2011/dschoeff/start#experiment_12
Overall, the bubble sort is inefficient in its average case performance and you would be much better off choosing another algorithm to sort your data.
The Insertion Sort is a simple sorting algorithm that is good to use when you do not have a large amount of data to sort.
The Insertion Sort has a worst case performance of O(n^2), a best case performance of O(n) and an average case performance of O(n^2).
The easiest way to understand this sort is to see an example. The following example was found on Wikipedia.
Example: The following table shows the steps for sorting the sequence {3, 7, 4, 9, 5, 2, 6, 1}. In each step the item under consideration is underlined, and the item moved (or held in place if it was the biggest considered so far) in the previous step is shown in bold.
3 7 4 9 5 2 6 1
3 7 4 9 5 2 6 1
3 7 4 9 5 2 6 1
3 4 7 9 5 2 6 1
3 4 7 9 5 2 6 1
3 4 5 7 9 2 6 1
2 3 4 5 7 9 6 1
2 3 4 5 6 7 9 1
1 2 3 4 5 6 7 9
What happens in an insertion sort is that every repetition guarantees that one more element is put in the right order. Looking at the list from left to right, the sort takes the first element and, since it is the first element, leaves it where it is. The first element is now sorted. It then looks at the next element and removes it from the list. If it is greater than the first element it is put back in its place; if it is less than the first element, the first element is moved to the right and the second element is placed in the first position. There are now two sorted elements. This process is repeated until all the items are in their proper placement.
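To go along with that walkthrough, here is a small sketch of an insertion sort in C. Again, this is just an illustration of the idea rather than code from one of my projects, using the same sequence as the example above.
#include <stdio.h>

// insertion sort: take each element and slide it left until it is
// in the correct spot among the already-sorted elements before it
void insertion_sort(int *array, int size)
{
    int i, j, value;
    for(i = 1; i < size; i++)
    {
        value = array[i];                   // element under consideration
        j = i - 1;
        while(j >= 0 && array[j] > value)   // shift larger elements one spot to the right
        {
            array[j + 1] = array[j];
            j--;
        }
        array[j + 1] = value;               // drop the element into its place
    }
}

int main()
{
    int data[] = { 3, 7, 4, 9, 5, 2, 6, 1 };
    int i;
    insertion_sort(data, 8);
    for(i = 0; i < 8; i++)
        printf("%d ", data[i]);
    printf("\n");
    return 0;
}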
References: http://en.wikipedia.org/wiki/Insertion_sort
The Quick Sort algorithm is more advanced than the previous algorithms described above. Wikipedia does a good job of explaining how the algorithm functions.
“Quicksort is a divide and conquer algorithm. Quicksort first divides a large list into two smaller sub-lists: the low elements and the high elements. Quicksort can then recursively sort the sub-lists.
The steps are:
1. Pick an element, called a pivot, from the list.
2. Reorder the list so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.
3. Recursively apply the above steps to the sub-list of elements with smaller values and separately the sub-list of elements with greater values.
The base case of the recursion are lists of size zero or one, which never need to be sorted.”(Wikipedia)
The Quick Sort has a worst case performance of O(n^2), a best case performance of O(nlogn) and an average case performance of O(nlogn).
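Here is a rough sketch of the quick sort idea in C. I'm using the last element as the pivot, which is just one common choice for the partition step, not the only way to do it.
#include <stdio.h>

// quick sort: pick a pivot, partition the array around it,
// then recursively sort the two sub-lists
void quick_sort(int *array, int low, int high)
{
    int pivot, i, j, tmp;
    if(low >= high)            // base case: zero or one element, already sorted
        return;
    pivot = array[high];       // use the last element as the pivot
    i = low - 1;
    for(j = low; j < high; j++)
    {
        if(array[j] < pivot)   // move elements smaller than the pivot to the left side
        {
            i++;
            tmp = array[i];
            array[i] = array[j];
            array[j] = tmp;
        }
    }
    tmp = array[i + 1];        // put the pivot between the two sub-lists
    array[i + 1] = array[high];
    array[high] = tmp;
    quick_sort(array, low, i);         // sort the low side
    quick_sort(array, i + 2, high);    // sort the high side
}

int main()
{
    int data[] = { 5, 2, 8, 6, 1, 0, 7, 3, 6, 9 };
    int i;
    quick_sort(data, 0, 9);
    for(i = 0; i < 10; i++)
        printf("%d ", data[i]);
    printf("\n");
    return 0;
}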
The Merge Sort algorithm is what you would call a reliable one. The merge sort has a worst case performance of O(nlogn), a best case performance of O(nlogn) and an average case performance of O(nlogn). It is known as a divide and conquer algorithm because it divides the list of elements down into single elements and then merges them back together in sorted order.
The process goes as follows: “Divide the unsorted list into n sublists, each containing 1 element (a list of 1 element is considered sorted). Repeatedly merge sublists to produce new sorted sublists until there is only 1 sublist remaining. This will be the sorted list.”(Wikipedia)
Overall, the merge sort is very reliable, meaning that even its worst case performance is only O(nlogn), which is much better than some of the algorithms listed above.
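Here is a small sketch of a merge sort in C to go with that description. It uses a scratch array for the merging step, which is one way (not the only way) to handle the extra space the merge needs; this is an illustration, not code from one of my projects.
#include <stdio.h>

// merge sort: split the array in half, sort each half,
// then merge the two sorted halves back together
void merge_sort(int *array, int *scratch, int low, int high)
{
    int mid, i, left, right;
    if(high - low < 1)                 // base case: zero or one element
        return;
    mid = (low + high) / 2;
    merge_sort(array, scratch, low, mid);       // sort the left half
    merge_sort(array, scratch, mid + 1, high);  // sort the right half
    left = low;
    right = mid + 1;
    for(i = low; i <= high; i++)       // merge: take the smaller front element each time
    {
        if(right > high || (left <= mid && array[left] <= array[right]))
            scratch[i] = array[left++];
        else
            scratch[i] = array[right++];
    }
    for(i = low; i <= high; i++)       // copy the merged result back into the array
        array[i] = scratch[i];
}

int main()
{
    int data[] = { 38, 27, 43, 3, 9, 82, 10 };
    int scratch[7];
    int i;
    merge_sort(data, scratch, 0, 6);
    for(i = 0; i < 7; i++)
        printf("%d ", data[i]);
    printf("\n");
    return 0;
}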
References: http://en.wikipedia.org/wiki/Merge_sort
A Binary Search is an algorithm used to search for a value in a sorted list. It starts with a value to find and compares that value with the middle element of the array. Since the list is sorted, you know whether to continue searching in the left or the right half. It then repeats the process on that half until it finds the value or the remaining section of the array is reduced to zero.
Given the following array:
array[] = |1|13|22|28|30|45|58|98|100|
Let's say the value being searched for is 13.
It starts by locating the middle element. In this case it is 30. Since 13 is less than 30, 13 must be in the first half of the array if it is present at all.
Therefore, the remaining list to be searched is |1|13|22|28|
It then starts with the middle element of that section of the array, which is 22. Since 13 is less than 22, the value has to be either the first or the second element in that subsection of the array. It will then find 13 as the second element of that subsection.
When this logic is put into an algorithm it will use recursion. Pseudocode for an implementation is here.
binary_search(A[0..N-1], value, low, high):
    if (high < low):
        return -1                                       // not found
    mid = (low + high) / 2
    if (A[mid] > value):
        return binary_search(A, value, low, mid-1)      // search the left half
    else if (A[mid] < value):
        return binary_search(A, value, mid+1, high)     // search the right half
    else:
        return mid                                      // found
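And here is what that pseudocode might look like as a small C program. The array is the one from the example above; nothing here comes from a library, it's just the pseudocode translated directly.
#include <stdio.h>

// recursive binary search: returns the index of value, or -1 if not found
int binary_search(int *array, int value, int low, int high)
{
    int mid;
    if(high < low)
        return -1;                                           // not found
    mid = (low + high) / 2;
    if(array[mid] > value)
        return binary_search(array, value, low, mid - 1);    // search the left half
    else if(array[mid] < value)
        return binary_search(array, value, mid + 1, high);   // search the right half
    else
        return mid;                                          // found
}

int main()
{
    int data[] = { 1, 13, 22, 28, 30, 45, 58, 98, 100 };
    printf("13 found at index %d\n", binary_search(data, 13, 0, 8));
    return 0;
}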
This course objective is to successfully write a program implementing the use of queues.
To do this successfully, I will have to understand the basic concepts of a queue. This requires a knowledge of pointers, structs, struct pointers, memory allocation, memory deallocation and an understanding of linked lists. Success will be measured by the completion of a program that implements these elements efficiently.
The keywords that I have entered at the top of this page show that I have successfully researched the concepts required to make a working queue. Besides having a knowledge of the concepts, the only other measure of success would be to have a program implementing these concepts. You can find such a program here → http://lab46.corning-cc.edu/user/dschoeff/portfolio/libqueue
Upon completion of this exercise, I believe I have successfully completed the course objective of creating a program that implements a queue.
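For reference, this is roughly the kind of thing the objective is about: a node struct plus enqueue and dequeue operations on a singly linked list with front and back pointers. This is just a bare-bones sketch of the idea, not the actual code from my libqueue project.
#include <stdio.h>
#include <stdlib.h>

// a queue node and the two basic operations: enqueue at the back, dequeue from the front
struct node {
    int value;
    struct node *next;
};

struct queue {
    struct node *front;
    struct node *back;
};

void enqueue(struct queue *q, int value)
{
    struct node *tmp = (struct node *) malloc(sizeof(struct node));
    tmp->value = value;
    tmp->next = NULL;
    if(q->back == NULL)           // empty queue: new node becomes the front
        q->front = tmp;
    else
        q->back->next = tmp;      // otherwise hook it on after the current back
    q->back = tmp;
}

int dequeue(struct queue *q)      // assumes the queue is not empty
{
    struct node *tmp = q->front;
    int value = tmp->value;
    q->front = tmp->next;         // move the front pointer up one node
    if(q->front == NULL)          // queue is now empty
        q->back = NULL;
    free(tmp);
    return value;
}

int main()
{
    struct queue q = { NULL, NULL };
    enqueue(&q, 1);
    enqueue(&q, 2);
    enqueue(&q, 3);
    printf("%d\n", dequeue(&q));
    printf("%d\n", dequeue(&q));
    printf("%d\n", dequeue(&q));
    return 0;
}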
Today I learned about various different sorting algorithms. Many of them have their advantages and disadvantages. One of the things we saw when exploring the different methods is what is known as their computational complexity. The computational complexity is basically how the algorithm handles data that is: completely out of order, loaded with random values, or already in the correct order.
What I want to know is how long it will take the bubble sort algorithm to sort each of the 3 different scenarios listed above.
My knowledge of the bubble sort algorithm has been obtained from Matthew Haas and this webpage → http://en.wikipedia.org/wiki/Bubble_sort
I also have done a project implementing a simple bubble sort algorithm. I think the combined knowledge of the above things will help me in this experiment.
I believe that the array with values that are completely out of order will take the longest to sort. After that, the array loaded with random values will take the next longest. Finally, the array with values that are already in order will take the least amount of time.
I will test my hypothesis by keeping track of the run time of 3 different programs. Each of these programs will sort an array of integers. The first array will have its elements completely out of order. The second array I will load with completely random values. The final array I will load with values that are already in the right order. I will then compare the run times of each of these programs and ponder the results.
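I'm not pasting the whole bubbleTest program here, but the setup looks roughly like this: the same bubble sort as in the project above, just with the array loaded automatically instead of from the keyboard. (This is a reconstruction of the setup rather than the exact file; the fill loop shown is for the worst case run, and the other runs would only change how the array gets filled.)
#include <stdio.h>
#include <stdlib.h>

#define SIZE 100000

int main()
{
    int *array = (int *) malloc(sizeof(int) * SIZE);
    int i, tmp, swapped = 1;

    // worst case test: load the array with descending values from 1000000 down to 900001.
    // the random test would fill with rand() % 1000000 instead,
    // and the best case test would fill with values already in ascending order.
    for(i = 0; i < SIZE; i++)
        *(array + i) = 1000000 - i;

    // same bubble sort as before
    while(swapped == 1)
    {
        swapped = 0;
        for(i = 1; i < SIZE; i++)
        {
            if(*(array + (i - 1)) > *(array + i))
            {
                tmp = *(array + i);
                *(array + i) = *(array + (i - 1));
                *(array + (i - 1)) = tmp;
                swapped = 1;
            }
        }
    }

    // output gets redirected to a file with > when timing the run
    for(i = 0; i < SIZE; i++)
        printf("[%d]: %d\n", i, *(array + i));

    free(array);
    return 0;
}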
Execution of the code that I wrote that used a bubble sort algorithm to sort a list of numbers that were completely out of order. I loaded a 100000 element array with descending numbers starting from 1000000.
lab46:~/src/data/sorts$ time ./bubbleTest > test.txt

real    1m44.096s
user    1m43.898s
sys     0m0.004s
lab46:~/src/data/sorts$
I then ran that code and outputted the content of the sorted array to a file. Notice that it took about 1 minute and 44 seconds to complete this task.
[0]: 900001
[1]: 900002
.
.
.
[99992]: 999993
[99993]: 999994
[99994]: 999995
[99995]: 999996
[99996]: 999997
[99997]: 999998
[99998]: 999999
[99999]: 1000000
Now I will test to see if there is any difference between an array that is completely out of order and an array that is only partially out of order. I will load an array of 100000 elements with random numbers from 0 to 1000000. I will then run that array through the sorting algorithm and observe the results.
lab46:~/src/data/sorts$ time ./bubbleTest > random.txt

real    1m49.129s
user    1m48.951s
sys     0m0.012s
lab46:~/src/data/sorts$
Notice that it took about the same time to run as the list that was completely out of order.
Finally we will test how long it will take to sort through a list that is already in the right order using the bubble sort algorithm.
lab46:~/src/data/sorts$ time ./bubbleTest > test.txt

real    0m0.300s
user    0m0.072s
sys     0m0.012s
lab46:~/src/data/sorts$
Notice that the program completed in a fraction of a second. This was to be expected because the algorithm didn't really have to do anything besides make one pass to confirm the data was already in order.
Based on the data collected: the array that was completely out of order took about 1 minute 44 seconds to sort, the array filled with random values took about 1 minute 49 seconds, and the array that was already in order took only a fraction of a second.
After completing this experiment, I believe I can state some observations I made. The bubble sort took roughly the same (long) amount of time on the reversed data and the random data, which fits its O(n^2) worst and average case performance, while the already sorted data finished almost instantly, which fits its O(n) best case. Interestingly, the random data actually took slightly longer than the completely reversed data, so my prediction that the reversed array would be the slowest wasn't quite right.
Over the past couple weeks I have been learning various different sorting algorithms. Many of them have their advantages and disadvantages. One of the things I have seen when exploring the different methods of sorting data is known as their computational complexity. The computational complexity is basically how the algorithm handles data that is: completely out of order, loaded with random values, or already in the correct order.
What I want to know is how long it will take the selection sort algorithm to sort each of the 3 different scenarios listed above.
My knowledge of the selection sort algorithm has been obtained from C++ Programming by D.S. Malik and this webpage →http://en.wikipedia.org/wiki/Selection_sort
I also have done a project implementing a simple selection sort algorithm. I think the combined knowledge of the above things will help me in this experiment.
I believe that all the arrays will take about the same time to sort. I believe this is true because the algorithm has no way to check whether the array is already sorted, and therefore must perform the same amount of operations no matter how the data is already ordered.
I will test my hypothesis by keeping track of the run time of 3 different programs. Each of these programs will sort an array of integers. The first array will have its elements completely out of order. The second array I will load with completely random values. The final array I will load with values that are already in the right order. I will then compare the run times of each of these programs and ponder the results.
Execution of the code that I wrote that used a selection sort algorithm to sort a list of numbers that were completely out of order. I loaded a 100000 element array with descending numbers starting from 1000000.
lab46:~/src/data/sorts$ time ./selectionTest > test.txt

real    0m33.734s
user    0m33.494s
sys     0m0.016s
lab46:~/src/data/sorts$
I then ran that code and outputted the content of the sorted array to a file. Notice that it took about 34 seconds to complete this task.
[0]: 900001
[1]: 900002
.
.
.
[99992]: 999993
[99993]: 999994
[99994]: 999995
[99995]: 999996
[99996]: 999997
[99997]: 999998
[99998]: 999999
[99999]: 1000000
Now I will test to see if there is any difference between an array that is completely out of order and an array that is only partially out of order. I will load an array of 100000 elements with random numbers from 0 to 1000000. I will then run that array through the sorting algorithm and observe the results.
lab46:~/src/data/sorts$ time ./selectionTest > random.txt

real    0m32.299s
user    0m32.082s
sys     0m0.004s
lab46:~/src/data/sorts$
Notice that it took about the same time to run as the list that was completely out of order.
Finally we will test how long it will take to sort through a list that is already in the right order using the selection sort algorithm.
lab46:~/src/data/sorts$ time ./selectionTest > test.txt

real    0m32.272s
user    0m32.082s
sys     0m0.012s
lab46:~/src/data/sorts$
Notice that the program ran in about the same time as the others. This was to be expected given how the algorithm functions.
Based on the data collected: the array that was completely out of order took about 34 seconds to sort, the array filled with random values took about 32 seconds, and the array that was already in order also took about 32 seconds.
After completing this experiment, I believe I can state some observations I made. All three runs took roughly the same amount of time, which supports my hypothesis. Since the selection sort always scans the entire unsorted portion of the array looking for the minimum, it does the same amount of work no matter how the data starts out, which is why its best, worst and average cases are all O(n^2).
I have been studying various different sorting algorithms lately and have wondered how two different ones would match up in a head to head race.
My question is this. Which sorting algorithm sorts data faster? The selection sort or the bubble sort?
My knowledge of these sorting algorithms has been obtained from the projects I have done on them. You can find them here → http://lab46.corning-cc.edu/user/dschoeff/portfolio
I believe that the bubble sort will perform better than the selection sort on data that is already in order, whereas the selection sort will perform better on data that is not in order.
I will test my hypothesis by running both sorting algorithms on 3 different sets of data. The first set of integers will be completely out of order. The next set will be randomly filled with integers. The final set will be filled with all the integers in the correct order.
Execution of the code that I wrote that used a bubble sort algorithm to sort a list of numbers that were completely out of order. I loaded a 100000 element array with descending numbers starting from 1000000.
lab46:~/src/data/sorts$ time ./bubbleTest > test.txt

real    1m44.096s
user    1m43.898s
sys     0m0.004s
lab46:~/src/data/sorts$
I then ran that code and outputted the content of the sorted array to a file. Notice that it took about 1 minute and 44 seconds to complete this task.
Execution of the code that I wrote that used a selection sort algorithm to sort a list of numbers that were completely out of order. I loaded a 100000 element array with descending numbers starting from 1000000.
lab46:~/src/data/sorts$ time ./selectionTest > test.txt

real    0m33.734s
user    0m33.494s
sys     0m0.016s
lab46:~/src/data/sorts$
I then ran that code and outputted the content of the sorted array to a file. Notice that it took about 34 seconds to complete this task. Round 1 goes to the selection sort.
Now I will test the two different sorts on an array of 100000 elements with random numbers from 0 to 1000000. I will then run that array through the sorting algorithms and observe the results. First the bubble sort.
lab46:~/src/data/sorts$ time ./bubbleTest > random.txt

real    1m49.129s
user    1m48.951s
sys     0m0.012s
lab46:~/src/data/sorts$
Notice that it took about 1 minute 50 seconds to complete.
Next, let's run the selection sort on the same data.
lab46:~/src/data/sorts$ time ./selectionTest > random.txt

real    0m32.299s
user    0m32.082s
sys     0m0.004s
lab46:~/src/data/sorts$
Notice that it took about 32 seconds to run. Round 2 goes to the selection sort.
Finally we will test how long it will take to sort through a list that is already in the right order using the bubble sort algorithm.
lab46:~/src/data/sorts$ time ./bubbleTest > test.txt

real    0m0.300s
user    0m0.072s
sys     0m0.012s
lab46:~/src/data/sorts$
Notice that the program completed in a fraction of a second. This was to be expected because the algorithm didn't really have to do anything.
Next, let's try the selection sort.
lab46:~/src/data/sorts$ time ./selectionTest > test.txt

real    0m32.272s
user    0m32.082s
sys     0m0.012s
lab46:~/src/data/sorts$
Notice that it took about 32 seconds to run. Round 3 goes to the bubble sort.
Based on the data collected: on the completely out of order data the selection sort won (about 34 seconds versus about 1 minute 44 seconds), on the random data the selection sort won again (about 32 seconds versus about 1 minute 49 seconds), and on the already sorted data the bubble sort won easily (a fraction of a second versus about 32 seconds).
After completing this experiment, I believe I can state some observations I made. The selection sort is clearly faster of the two on data that is out of order, roughly three times faster in these tests, while the bubble sort wins by a landslide when the data is already sorted because it can detect that no swaps are needed and stop after a single pass. The selection sort has no such shortcut, so it takes its usual 32 or so seconds no matter what. This matches my hypothesis, and it shows why it pays to know something about your data before picking a sorting algorithm.