======Part 3======
=====Entries=====
====November 7th, 2011====
Happy November! My first official dated entry of the new month brings great news: I have officially completed my stack library implementation, and it is awesome. You should definitely check it out here -> [[user:dschoeff:portfolio:libstack|Project: Stack Library Implementation]]. I've learned so much doing this stack implementation. I ran into a lot of problems, partly because I didn't have a good understanding of stacks going into the project. This led me to research stacks further and deepen my knowledge of how they work. I don't think I could've done this project without the debugger gdb. It came in so handy when trying to find out why I was getting a segmentation fault.
I still want to increase my knowledge of struct pointers because having a better understanding of them will help lead to better code in the future. I see the power of linked list type data structures and can't wait to explore more of them.
====November 11th, 2011====
Well, today has been fairly productive. I successfully completed my queue library implementation. I also submitted all my files to the subversion repository. You can view my queue project
here -> [[http://lab46.corning-cc.edu/user/dschoeff/portfolio/libqueue]]. I was having a few problems with my dequeue() function, but I stopped by the lair
today and Matt helped me work through them. The next thing I want to work on is my keywords and experiments. I hope to get these finished up soon.
====November 14th, 2011====
Today I decided to make corrections to my opus. I had completed my opus on time, but I didn't do a very good job of demonstrating some of the keywords I defined. So I went back and tried to better demonstrate them with code snippets and programs. I also included some command-line executions of programs I had written. I think the keywords can be understood a lot better now with the new demonstrations.
====November 18th, 2011====
So I'm backdating this entry because I forgot to write it on Friday.
Today was pretty cool. In Data Structures we started talking about various sorting algorithms. What we did was basically look on Wikipedia at the different sorting algorithms out there. Each type of sorting algorithm has its pros and cons, but they all basically do the same thing, which is to sort data. When choosing a sorting algorithm, you need to take into account your own specific implementation and what it will be used for.
After learning about these things in class Matt let us loose and had us write our own simple algorithm. I chose to write a bubble sort algorithm which is not the most efficient algorithm but is simple to understand. I plan on writing a few more sorting algorithms implementing different methods.
After writing this algorithm for a project, I decided to do some experiments on how efficient the algorithms really were. It's nice to see for yourself how efficient the algorithms are at sorting data rather than just taking someone else's word for it.
=====Data Topics=====
====Queue Overrun====
When referring to queues, an **overrun** condition occurs when the queue exceeds the space available to store data. In an array based queue, this would happen when trying to add a value
to an array that is already full. Since I haven't dealt much with array based queue implementations, I will stick mainly to explaining linked list based implementations. With a linked
list based queue, an overrun condition is much harder to reach. Linked lists do not have a fixed size; the only size limitation
on the list (unless the program designer were to arbitrarily set a size limit) is how much memory is available to the system. Although the chance of overrun is somewhat
unlikely, a function that enqueues a node to a queue should check for this condition and be flexible enough to handle that situation. I have done an experiment dealing with system memory
management here -> [[http://lab46.corning-cc.edu/opus/fall2011/dschoeff/start#experiment_3]]. This experiment explores dynamic memory allocation and how lab46 deals with heavy memory usage.
The following is a snippet of my enqueue function. If the system running your program ran out of memory, a problem would occur at line 26: malloc() would return NULL, and line 27 would then dereference that NULL pointer.
26 tmp2 = (Node *) malloc (sizeof(Node));
27 tmp2 -> next = (*start); // point new node's next at (*start)
28 (*start) -> prev = tmp2; // point (*start)'s prev at tmp2
29 (*start) = (*start) -> prev; // move (*start) to new node
30 (*start) -> value = ch; // put value in node
====Queue Underrun====
Queue underrun occurs when you try to dequeue a node from a queue that is already empty. When dealing with linked list based queues, underrun is much more likely to happen than overrun.
Your dequeue function must check whether the queue is empty before trying to dequeue a node. If it does not, you will likely get a segmentation fault.
char dequeue(Node** start, Node** end, int index)
{
    int i;
    char ch;
    Node* tmp;
    tmp = *start;
    if((*start) != NULL)
    {
        //STUFF
    }
    else
    {
        ch = '\n';
    }
    return(ch);
}
The first conditional statement makes sure the queue has a value before trying to dequeue a node. Instead of causing a segmentation fault, the function returns a newline character to indicate there is no node to dequeue.
With this check in place you should not have a problem with queue underrun.
====FIFO====
The term **FIFO** stands for "First In First Out". This type of data structure refers to how data is added/removed from the data structure. A queue is a FIFO data structure because the
first node added to a queue will be the first node removed.
The following picture represents how the FIFO queue works.
{{:opus:fall2011:dschoeff:300px-data_queue.svg.png?nolink&200|}}
For more explanation of a queue refer to the queue keyword definition -> [[http://lab46.corning-cc.edu/opus/fall2011/dschoeff/start#queue]]
====Computational Complexity====
Wikipedia states ,"Computational complexity theory is a branch of the theory of computation in theoretical computer science and mathematics that focuses on classifying computational problems according to their inherent difficulty, and relating those classes to each other. In this context, a computational problem is understood to be a task that is in principle amenable to being solved by a computer (which basically means that the problem can be stated by a set of mathematical instructions). Informally, a computational problem consists of problem instances and solutions to these problem instances. For example, primality testing is the problem of determining whether a given number is prime or not. The instances of this problem are natural numbers, and the solution to an instance is yes or no based on whether the number is prime or not.".
From what I understand, computational complexity is the classification of a problem according to how difficult it is to solve. This difficulty is based upon the amount of resources
needed to solve the problem. For example, the mathematical problem 2 + 2 is a lot less difficult to find a solution for than (13^2/78) * 10000, and would require fewer resources to solve.
References: [[http://en.wikipedia.org/wiki/Computational_complexity_theory]]
====Big O/Theta, Bounds====
Big O notation is used to classify algorithms by how they respond to larger and larger input sizes. This notation is used in computer science when referring to the efficiency of algorithms. It is helpful when trying to analyze how algorithms handle data in what are called the best case, worst case and average case.
When a function is described using big O notation, only an upper bound on its growth is given, and only the dominant term is kept, because as input sizes become larger the other terms become less influential.
When you see Big O notation it will look something like this: O(n), O(n^2), O(nlogn), O(2^n)
Wikipedia states "Big O notation is useful when analyzing algorithms for efficiency. For example, the time (or the number of steps) it takes to complete a problem of size ''n'' might be found to be ''T''(''n'') = 4''n''<sup>2</sup> − 2''n'' + 2.
As ''n'' grows large, the ''n''<sup>2</sup> term will come to dominate, so that all other terms can be neglected — for instance when ''n'' = 500, the term 4''n''<sup>2</sup> is 1000 times as large as the 2''n'' term. Ignoring the latter would have negligible effect on the expression's value for most purposes.
Further, the coefficients become irrelevant if we compare to any other order of expression, such as an expression containing a term ''n''<sup>3</sup> or ''n''<sup>4</sup>. Even if ''T''(''n'') = 1,000,000''n''<sup>2</sup>, if ''U''(''n'') = ''n''<sup>3</sup>, the latter will always exceed the former once ''n'' grows larger than 1,000,000 (''T''(1,000,000) = 1,000,000<sup>3</sup> = ''U''(1,000,000)). Additionally, the number of steps depends on the details of the machine model on which the algorithm runs, but different types of machines typically vary by only a constant factor in the number of steps needed to execute an algorithm."
Big Theta notation is very similar to Big O notation. The only difference between these two notations is that when you use big theta notation to represent a function, the upper and lower bounds are of the same order.
References: [[http://en.wikipedia.org/wiki/Big_O_notation]]
====Sorting Algorithms====
A sorting algorithm is an algorithm that sorts a given set of data. The most common sorting algorithms put data in numerical or alphabetical order (also commonly known as lexicographical order). Sorting algorithms are very useful because many tasks become easier when data is in a particular order.
But not all sorting algorithms are created equal; there are hundreds of commonly used sorting algorithms out there, each with its own particular twist on how it sorts data.
A few common types of sorting algorithms are defined below. The big thing when choosing an algorithm is how it performs on varying degrees of randomness in the data it will sort. Algorithms are typically described using Big O notation for the best case scenario, worst case scenario and average case scenario. The best case is how the algorithm responds to data that is already in the desired order. For example, the Bubble Sort algorithm, which will be described later, has a best case performance of O(n), meaning it only needs a single pass of n comparisons before it completes. Its worst case performance (when the data is completely out of order) is O(n^2), meaning the number of operations grows as n^2. The average case performance is also O(n^2).
Different algorithms perform differently on different types of data, so it is best to know what you are trying to do before you choose a sorting algorithm.
====Selection Sort====
The Selection Sort is a simple sorting algorithm. It works by traversing a data structure (let's say an array) to find the minimum value (if you are sorting from smallest to largest). Once it has found it, it swaps the element at the first position with the smallest element. At this point the smallest element is in its desired position. It then repeats this process starting from the second element, and so on, until it reaches the last element.
The selection sort has a worst case performance O(n^2), a best case performance O(n^2) and an average case performance O(n^2).
Here is an example from Wikipedia showing the progression of this sort.
|64| 25| 12| 22| 11|
|11| 25| 12| 22| 64|
|11| 12| 25| 22| 64|
|11| 12| 22| 25| 64|
|11| 12| 22| 25| 64|
This is a particularly inefficient way to sort data. However, if you are not concerned about efficiency or do not have a large amount of data to sort, it can be a good choice because of its simplicity.
References: [[http://en.wikipedia.org/wiki/Selection_sort]]
====Bubble Sort====
The Bubble Sort is a sorting algorithm that steps through a list of elements, comparing adjacent elements and swapping them if they are out of order. The logic behind this algorithm is very simple.
The bubble sort has a worst case performance O(n^2), a best case performance O(n) and an average case performance O(n^2).
The following code shows an implementation of a bubble sort that allows the user to enter 10 integers and then sorts them in ascending order.
#include <stdio.h>
#include <stdlib.h>
int main()
{
    int *array;
    array = (int*) malloc(sizeof(int)*10);
    int i = 0, input = 0;
    int swapped = 1, tmp;
    //loads ten values from the user into the array
    for(i = 0; i < 10; i++)
    {
        printf("enter 10 different integers: ");
        scanf("%d", &input);
        *(array+i) = input;
    }
    //meat of the bubble sort: keep passing over the array
    //until a full pass makes no swaps
    while(swapped == 1)
    {
        swapped = 0;
        for(i = 1; i < 10; i++)
        {
            if(*(array+(i-1)) > *(array+i))//this statement checks to see
            {                              //if the elements are out of order
                tmp = *(array+i);
                *(array+i) = *(array+(i-1));
                *(array+(i-1)) = tmp;
                swapped = 1;
            }
        }
    }
    //displays the array
    for(i = 0; i < 10; i++)
    {
        printf("[%d]: %d\n", i, *(array+i));
    }
    free(array);
    return (0);
}
Execution of this code.
lab46:~/src/data/sorts$ ./bubble
enter 10 different integers: 5
enter 10 different integers: 2
enter 10 different integers: 8
enter 10 different integers: 6
enter 10 different integers: 1
enter 10 different integers: 0
enter 10 different integers: 7
enter 10 different integers: 3
enter 10 different integers: 6
enter 10 different integers: 9
[0]: 0
[1]: 1
[2]: 2
[3]: 3
[4]: 5
[5]: 6
[6]: 6
[7]: 7
[8]: 8
[9]: 9
lab46:~/src/data/sorts$
I have done an experiment of how the bubble sort handles differing sets of data. You should check it out here -> [[http://lab46.corning-cc.edu/opus/fall2011/dschoeff/start#experiment_12]]
Overall, the bubble sort is inefficient in its average case, and you would be much better off choosing another algorithm to sort your data.
====Insertion Sort====
The Insertion Sort is a simple sorting algorithm that is good to use when you do not have a large amount of data to sort.
The Insertion Sort has a worst case performance O(n^2), a best case performance O(n) and an average case performance O(n^2).
The easiest way to understand this sort is to see an example. The following example was found on Wikipedia.
Example: The following table shows the steps for sorting the sequence {3, 7, 4, 9, 5, 2, 6, 1}. In each step the item under consideration
is underlined, and the item moved (or held in place if it was the biggest yet considered) in the previous step is shown in bold.
__3__ 7 4 9 5 2 6 1
**3** __7__ 4 9 5 2 6 1
3 **7** __4__ 9 5 2 6 1
3 **4** 7 __9__ 5 2 6 1
3 4 7 **9** __5__ 2 6 1
3 4 **5** 7 9 __2__ 6 1
**2** 3 4 5 7 9 __6__ 1
2 3 4 5 **6** 7 9 __1__
**1** 2 3 4 5 6 7 9
What happens in an insertion sort is that every repetition guarantees one more element will
be put in the right order. Looking at the list from left to right, the sort takes the first element and, since it is the first,
leaves it where it is. The first element is now sorted. It then looks at the next element and removes it from the list.
If it is greater than the first value, it is put back in its place; if it is less than the first element, the first element is moved to the right
and the second element is placed in the first position. There are now two sorted elements. This process is repeated until all the items are in their proper placement.
References: [[http://en.wikipedia.org/wiki/Insertion_sort]]
====Quick Sort====
The Quick Sort algorithm is more advanced than the algorithms defined above. Wikipedia does a good job
of explaining how the algorithm functions.
"Quicksort is a divide and conquer algorithm. Quicksort first divides a large list into two smaller sub-lists: the low elements and the high elements. Quicksort can then recursively sort the sub-lists.
The steps are:
- Pick an element, called a pivot, from the list.
- Reorder the list so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.
- Recursively sort the sub-list of lesser elements and the sub-list of greater elements.
The base case of the recursion are lists of size zero or one, which never need to be sorted."(Wikipedia)
The Quick Sort has a worst case performance O(n^2), a best case performance O(nlogn) and an average case performance O(nlogn).
====Merge Sort====
The Merge Sort algorithm is what you would call a reliable one.
The merge sort has a worst case performance O(nlogn), a best case performance O(nlogn) and an average case performance O(nlogn).
This algorithm is known as a divide and conquer algorithm because it divides the list of elements down into single elements and then merges them back together in sorted order.
The process goes as follows:
"
- If the list is of length 0 or 1, then it is already sorted. Otherwise:
- Divide the unsorted list into two sublists of about half the size.
- Sort each sublist recursively by re-applying the merge sort.
- Merge the two sublists back into one sorted list."(Wikipedia).
The following picture shows a graphical representation of how the algorithm functions.
{{:opus:fall2011:dschoeff:merge_sort.png?nolink&600|}}
Overall, the merge sort is very reliable: its worst case performance is still O(nlogn), which is much better than
some of the algorithms listed above.
References: [[http://en.wikipedia.org/wiki/Merge_sort]]
====Binary Search====
A Binary Search is an algorithm used to search for a value in a sorted
list. It works by comparing the value being searched for with the
middle element of the array. Since the list is sorted, you
know whether to continue searching in the left or right half. It repeats the process
until it finds the value or the remaining portion of the array is reduced to zero.
Given the following array:
array[] = |1|13|22|28|30|45|58|98|100|
Let's say the value being searched for is 13.
The search starts by locating the middle element, in this case 30. Since 13 is less than 30, if 13 is in the array it must be
in the first half.
Therefore, the remaining list to be searched is |1|13|22|28|.
It then takes the middle element of that section of the array, 22. Since 13 is less than 22, the value has to be
the first or second element of that subsection. It will then find 13 as the second element of that subsection.
When this logic is put into an algorithm, it is natural to use recursion. Pseudocode for an implementation
is below.
binary_search(A[0..N-1], value, low, high):
    if (high < low):
        return -1                // not found
    mid = (low + high) / 2
    if (A[mid] > value):
        return binary_search(A, value, low, mid-1)
    else if (A[mid] < value):
        return binary_search(A, value, mid+1, high)
    else:
        return mid               // found
=====Data Objective=====
====Write a program that uses queues====
This course objective is to successfully write a program implementing the use of queues.
===Method===
To do this successfully, I will have to understand the basic concepts of a queue. This requires a knowledge of pointers,
structs, struct pointers, memory allocation, memory deallocation and an understanding of linked lists. Success will be measured by the completion of a program
that implements these elements efficiently.
===Measurement===
The keywords that I have entered at the top of this page show that I have successfully researched the concepts required to
make a working queue. Besides having a knowledge of the concepts, the only other measure of success would be to have a
program implementing these concepts. You can find such a program here -> [[http://lab46.corning-cc.edu/user/dschoeff/portfolio/libqueue]]
===Analysis===
Upon completion of this exercise, I believe I have successfully completed the course objective of creating a program that
implements a queue.
-I believe I could improve my functions so that they could be more flexible. For example, I currently only allow each node's value to be a character. I could change a few things around and allow each node to hold more information.
-Since the measurement process is based upon having a knowledge of the concepts required to build a queue and then implementing them in a program, I don't see too many ways to improve the current measurement system. You either have a working program or not.
=====Experiments=====
====Experiment 1====
===Question===
Today I learned about various sorting algorithms. Many of them have their advantages and disadvantages.
One of the things we saw when exploring the different methods is what is known as their computational complexity. The computational complexity is basically
how the algorithm handles data that is:
- completely in order
- completely out of order
- somewhere in the middle.
What I want to know is how long will it take the bubble sort algorithm to sort these 3 different types of scenarios listed above.
===Resources===
My knowledge of the bubble sort algorithm has been obtained from Matthew Haas and this webpage -> [[http://en.wikipedia.org/wiki/Bubble_sort]]
I also have done a project implementing a simple bubble sort algorithm. I think the combined knowledge of the above things will help me in this experiment.
===Hypothesis===
I believe that the array with values that are completely out of order will take the longest. After that, the array loaded with random values will take the next longest to sort.
Finally the array with values that are already in order will take the least amount of time.
===Experiment===
I will test my hypothesis by keeping track of the run time of 3 different programs. Each one of these programs will sort an array of integers. The first array will have the elements completely out of order.
The next array I will load the array with completely random values. The final array I will load with values that are already in the right order. I will then compare the run times of each of these programs and
ponder the results.
===Data===
Execution of the code that I wrote that used a bubble sort algorithm to sort a list of numbers that were completely out of order.
I loaded a 100000 element array with descending numbers starting from 1000000.
lab46:~/src/data/sorts$ time ./bubbleTest > test.txt
real 1m44.096s
user 1m43.898s
sys 0m0.004s
lab46:~/src/data/sorts$
I then ran that code and outputted the content of the sorted array to a file. Notice that it took
about 1 minute and 44 seconds to complete this task.
===test.txt===
[0]: 900001
[1]: 900002
.
.
.
.
.
[99992]: 999993
[99993]: 999994
[99994]: 999995
[99995]: 999996
[99996]: 999997
[99997]: 999998
[99998]: 999999
[99999]: 1000000
Now I will test to see if there is any difference in an array that
is completely out of order to an array that may only be partially out of order. I will load an array of 100000
elements with random numbers from 0 to 1000000. I will then run that array through the sorting algorithm and
observe the results.
lab46:~/src/data/sorts$ time ./bubbleTest > random.txt
real 1m49.129s
user 1m48.951s
sys 0m0.012s
lab46:~/src/data/sorts$
Notice that it took about the same time to run as the list that was completely out of order.
Finally we will test how long it will take to sort through a list that is already in the right order using the bubble sort algorithm.
lab46:~/src/data/sorts$ time ./bubbleTest > test.txt
real 0m0.300s
user 0m0.072s
sys 0m0.012s
lab46:~/src/data/sorts$
Notice that the program completed in a fraction of a second. This was to be expected because the algorithm didn't really have to do anything.
===Analysis===
Based on the data collected:
* was your hypothesis correct? Not entirely. I did state that the array that was already in order would take the least amount of time to sort but I did not notice any significant difference between the array completely out of order and the one that was only partially out of order.
* was your hypothesis not applicable? I think it was applicable just not entirely correct.
* is there more going on than you originally thought? I don't believe so. I think I have a fairly good understanding of how the sorting algorithm works.
* what shortcomings might there be in your experiment? There are none that I am aware of.
  * what shortcomings might there be in your data? I am using lab46's built in **time** command to record how long each program takes to run. I don't actually time only the specific sorting algorithm inside the program. This may affect the results slightly, but I don't believe it is a big problem.
===Conclusions===
After completing this experiment, I believe I can state some observations I made.
- The bubble sort is a very effective sorting algorithm when the list of elements is already in order.
- In the average case (elements partially out of order), the bubble sort is not much faster than it is when the list is completely out of order.
====Experiment 2====
===Question===
Over the past couple weeks I have been learning about various sorting algorithms. Many of them have their advantages and disadvantages.
One of the things I have seen when exploring the different methods of sorting data is known as their computational complexity. The computational complexity is basically
how the algorithm handles data that is:
- completely in order
- completely out of order
- somewhere in the middle.
What I want to know is how long will it take the selection sort algorithm to sort these 3 different types of scenarios listed above.
===Resources===
My knowledge of the selection sort algorithm has been obtained from C++ Programming by D.S. Malik and this webpage ->[[http://en.wikipedia.org/wiki/Selection_sort]]
I also have done a project implementing a simple selection sort algorithm. I think the combined knowledge of the above things will help me in this experiment.
===Hypothesis===
I believe that all the arrays will take about the same time to sort. I believe this is true because the algorithm has no
way to check whether the array is already sorted. Therefore it must perform the same number of operations no matter how the data is ordered.
===Experiment===
I will test my hypothesis by keeping track of the run time of 3 different programs. Each one of these programs will sort an array of integers. The first array will have the elements completely out of order.
The next array I will load the array with completely random values. The final array I will load with values that are already in the right order. I will then compare the run times of each of these programs and
ponder the results.
===Data===
Execution of the code that I wrote that used a selection sort algorithm to sort a list of numbers that were completely out of order.
I loaded a 100000 element array with descending numbers starting from 1000000.
lab46:~/src/data/sorts$ time ./selectionTest > test.txt
real 0m33.734s
user 0m33.494s
sys 0m0.016s
lab46:~/src/data/sorts$
I then ran that code and outputted the content of the sorted array to a file. Notice that it took
about 34 seconds to complete this task.
===test.txt===
[0]: 900001
[1]: 900002
.
.
.
.
.
[99992]: 999993
[99993]: 999994
[99994]: 999995
[99995]: 999996
[99996]: 999997
[99997]: 999998
[99998]: 999999
[99999]: 1000000
Now I will test to see if there is any difference in an array that
is completely out of order to an array that may only be partially out of order. I will load an array of 100000
elements with random numbers from 0 to 1000000. I will then run that array through the sorting algorithm and
observe the results.
lab46:~/src/data/sorts$ time ./selectionTest > random.txt
real 0m32.299s
user 0m32.082s
sys 0m0.004s
lab46:~/src/data/sorts$
Notice that it took about the same time to run as the list that was completely out of order.
Finally we will test how long it will take to sort through a list that is already in the right order using the selection sort algorithm.
lab46:~/src/data/sorts$ time ./selectionTest > test.txt
real 0m32.272s
user 0m32.082s
sys 0m0.012s
lab46:~/src/data/sorts$
Notice that the program ran in about the same time as the others. This was to be expected given how the algorithm functions.
===Analysis===
Based on the data collected:
  * was your hypothesis correct? Yes. I stated that the algorithm would handle the 3 different data sets about the same, and it did.
* was your hypothesis not applicable? I think it was applicable.
* is there more going on than you originally thought? I don't believe so. I think I have a fairly good understanding of how the sorting algorithm works.
* what shortcomings might there be in your experiment? There are none that I am aware of.
  * what shortcomings might there be in your data? I am using lab46's built in **time** command to record how long each program takes to run. I don't actually time only the specific sorting algorithm inside the program. This may affect the results slightly, but I don't believe it is a big problem.
===Conclusions===
After completing this experiment, I believe I can state some observations I made.
- With the selection sort algorithm you know what you are going to get. Whether the data is in order or out of order, it will take the same amount of time to run. This is not good, because that is O(n^2) in every case.
- Overall, this is not a good sorting algorithm. You will be much better off using a different one.
====Experiment 3====
===Question===
I have been studying various different sorting algorithms lately and have wondered how two different ones would match up in a head to head race.
My question is this. Which sorting algorithm sorts data faster? The selection sort or the bubble sort?
===Resources===
My knowledge of these sorting algorithms have been obtained from the projects I have done on them. You can find them here -> [[http://lab46.corning-cc.edu/user/dschoeff/portfolio]]
===Hypothesis===
I believe that the bubble sort will perform better than the selection sort on data that is already in order where as the selection sort will perform better on data that is not in order.
===Experiment===
I will test my hypothesis by running both sorting algorithms on 3 different sets of data. The first set of integers will be completely out of order. The next set will be randomly filled with integers.
The final set will be filled with all the integers in the correct order.
===Data===
Execution of the code that I wrote that used a bubble sort algorithm to sort a list of numbers that were completely out of order.
I loaded a 100000 element array with descending numbers starting from 1000000.
lab46:~/src/data/sorts$ time ./bubbleTest > test.txt
real 1m44.096s
user 1m43.898s
sys 0m0.004s
lab46:~/src/data/sorts$
I then ran that code and outputted the content of the sorted array to a file. Notice that it took
about 1 minute and 44 seconds to complete this task.
Execution of the code that I wrote that used a selection sort algorithm to sort a list of numbers that were completely out of order.
I loaded a 100000 element array with descending numbers starting from 1000000.
lab46:~/src/data/sorts$ time ./selectionTest > test.txt
real 0m33.734s
user 0m33.494s
sys 0m0.016s
lab46:~/src/data/sorts$
I then ran that code and outputted the content of the sorted array to a file. Notice that it took
about 34 seconds to complete this task. Round 1 goes to the selection sort.
Now I will test the two different sorts on an array of 100000
elements with random numbers from 0 to 1000000. I will then run that array through the sorting algorithms and
observe the results. First the bubble sort.
lab46:~/src/data/sorts$ time ./bubbleTest > random.txt
real 1m49.129s
user 1m48.951s
sys 0m0.012s
lab46:~/src/data/sorts$
Notice that it took about 1 minute 50 seconds to complete.
Next, lets run the selection sort on the same data.
lab46:~/src/data/sorts$ time ./selectionTest > random.txt
real 0m32.299s
user 0m32.082s
sys 0m0.004s
lab46:~/src/data/sorts$
Notice that it took about 33 seconds to run. Round 2 goes to the selection sort.
Finally we will test how long it will take to sort through a list that is already in the right order using the bubble sort algorithm.
lab46:~/src/data/sorts$ time ./bubbleTest > test.txt
real 0m0.300s
user 0m0.072s
sys 0m0.012s
lab46:~/src/data/sorts$
Notice that the program completed in a fraction of a second. This was to be expected because the algorithm didn't really have to do anything.
Next lets try the selection sort.
lab46:~/src/data/sorts$ time ./selectionTest > test.txt
real 0m32.272s
user 0m32.082s
sys 0m0.012s
lab46:~/src/data/sorts$
Notice that it took about 33 seconds to run. Round 3 goes to the bubble sort.
===Analysis===
Based on the data collected:
* was your hypothesis correct? Yes. The bubble sort performed best on the data set that was already in order and the selection sort performed better on the other two.
* was your hypothesis not applicable? I think it was applicable.
* is there more going on than you originally thought? I don't believe so.
* what shortcomings might there be in your experiment? There are none that I am aware of.
  * what shortcomings might there be in your data? I am using lab46's built in **time** command to record how long each program takes to run. I don't actually time only the specific sorting algorithm inside the program. This may affect the results slightly, but I don't believe it is a big problem.
===Conclusions===
After completing this experiment, I believe I can state some observations I made.
- The bubble sort is better to use than the selection sort when the list is already in order.
- The selection sort is better to use when the data is not in order.
- Since your data is almost never in the right order, the selection sort seems like it would be a better one overall to use.