Corning Community College
CSCS2330 Discrete Structures
~~TOC~~
======Project: OPTIMIZING ALGORITHMS WITH SPACE - PRIME NUMBER CALCULATION (pnc2)======
=====Objective=====
To apply your skills in algorithmic optimization through the implementation of improved prime number calculating programs.
=====Algorithmic Complexity=====
A concept in Computer Science curriculum is the notion of computational/algorithmic complexity.
Basically, a solution to a problem exists on a spectrum of efficiency (typically constrained by time vs. space): if optimizing for time, the code size tends to grow.
In the previous projects, we focused on algorithms that were constrained by time- taking progressively more time the larger the data set to be processed. These time-constrained algorithms also happens to be the type of algorithm you're most familiar with (at least, were indoctrinated into thinking... your third grade self potentially would have found more familiarity with these space-constrained methods).
In contrast to time-constrained approaches we have space-constrained approaches, which will utilize more space in the solving of the program to the benefit of requiring less time.
We can never be entirely time- or space-constrained... some element of the other will always be present. But by embracing the advantages of an approach, we can really make an impact in the desired area.
What we will be looking at in this project is a type of prime number calculating algorithm known as a sieve. It will utilize space in solving the problem instead of exclusively using time.
=====Optimizing the prime number calculation=====
Following will be the optimized algorithms I'd like you to implement for comparison with all the others.
For this project, please assume the following:
* you are doing all values (odds AND evens).
* if you exceed 64 million and get "--error!---" that's okay. I have memory limits in place (just like I have time limits in place).
* if applicable, you can base your code on primebruteopt, but you may find the sieve algorithms to be different enough where you may not want to adapt existing code, but start fresh.
====sieve of Eratosthenes (primesieveoferat)====
The first sieve we will be implementing will be one of the best known and likely longest-known sieves, the sieve of Eratosthenes.
The sieve, instead of calculating to determine the eligibility of a prime, works in a manner of marking off patterns of values that cannot be prime (so, it is composite-focused in approach vs. prime-focused).
In order for it to work, we must store all the values we're processing so we can obtain what is left when done-- what remains are the prime values.
Here is the wikipedia page: https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
Here is an animated image of this sieve in action (from wikipedia):
{{:haas:fall2016:discrete:projects:sieve_of_eratosthenes_animation.gif|}}
This YouTube video may also be helpful:
[[https://www.youtube.com/watch?v=V08g_lkKj6Q|Sieve of Eratosthenes]]
A basic outline of the algorithm (again, from the wikipedia page):
===Overview of Algorithm===
To find all the prime numbers less than or equal to a given integer n by Eratosthenes' method:
- Create a list of consecutive integers from 2 through n: (2, 3, 4, ..., n).
- Initially, let p equal 2, the smallest prime number.
- Enumerate the multiples of p by counting to n from 2p in increments of p, and mark them in the list (these will be 2p, 3p, 4p, ... ; the p itself should not be marked).
- Find the first number greater than p in the list that is not marked. If there was no such number, stop. Otherwise, let p now equal this new number (which is the next prime), and repeat from step 3.
When the algorithm terminates, the numbers remaining not marked in the list are all the primes below n.
===Program Implementation===
As this is a space-constrained algorithm, we will need a chunk of space to store these values. Think about what a lot of this looks like with respect to how you know how to organize data.
====sieve of Sundaram (primesieveofsund)====
As with any approach, there are multiple ways to go about implementing it. Just as our brute force time-constrained algorithms could benefit from algorithmic optimizations, so too can our sieves.
The other program I would like you to write is an implementation of the sieve of Sundaram algorithm.
Here is the wikipedia page: https://en.wikipedia.org/wiki/Sieve_of_Sundaram
Here is an animated image of this sieve in action (from wikipedia):
{{:haas:fall2016:discrete:projects:sieve_of_sundaram.gif|}}
Here's another page that discusses this sieve: [[https://plus.maths.org/content/sundarams-sieve|Sieve of Sundaram]]
===Overview of algorithm===
* Receive **max** value from command-line
* Produce an array of the values from 0 to **max**.
* Disqualify all values that are of the form (**i** + **j** + (2*i*j)), provided:
* **i** and **j** are less than **max**
* **j** is always greater than or equal to **i**
* **i** is greater than or equal to 1 (one)
* The equation (**i** + **j** + (2*i*j)) is less than or equal to **max**
* Multiply the remaining numbers by 2, and add 1.
This process will yield prime numbers less than 2*max + 2.
=====Program=====
It is your task to write some optimized prime number calculating programs:
- **primesieveoferat.c**: using the space-oriented sieve algorithm
- **primesieveofsund.c**: another sieve
Your program should:
* obtain 1 parameter from the command-line (see **command-line arguments** section below):
* argv[1]: maximum value to calculate to (your program should run from (approximately) 2 through that number (inclusive of that number)
* this value should be a positive integer value; you can make the assumption that the user will always do the right thing.
* do the specified algorithmic optimizations
* please take note in differences in run-time, contemplating the impact the various algorithms and approaches have on performance.
* start your stopwatch (see **timing** section below):
* perform the correct algorithm against the input
* display (to STDOUT) the prime numbers found in the range
* output the processing run-time to STDERR
* your output **MUST** be conformant to the example output in the **execution** section below. This is also a test to see how well you can implement to specifications. Basically:
* as primes are being displayed, they are space-separated (first prime hugs the left margin), and when all said and done, a newline is issued.
* the timing information will be displayed in accordance to code I will provide (in the **timing** section).
=====Grabit Integration=====
For those familiar with the **grabit** tool on lab46, I have made some skeleton files and a custom **Makefile** available for this project.
To "grab" it:
lab46:~/src/discrete$ grabit discrete pnc2
make: Entering directory '/var/public/fall2016/discrete/pnc2'
‘/var/public/fall2016/discrete/pnc2/Makefile’ -> ‘/home/USERNAME/src/discrete/pnc2/Makefile’
‘/var/public/fall2016/discrete/pnc2/primesieveoferat.c’ -> ‘/home/USERNAME/src/discrete/pnc2/primesieveoferat.c’
‘/var/public/fall2016/discrete/pnc2/primesieveofsund.c’ -> ‘/home/USERNAME/src/discrete/pnc2/primesieveofsund.c’
make: Leaving directory '/var/public/fall2016/discrete/pnc2'
lab46:~/src/discrete$ cd pnc2
lab46:~/src/discrete/pnc2$ ls
Makefile primesieveoferat.c primesieveofsund.c
lab46:~/src/discrete/pnc2$
Furthermore, if your pnc2/ project directory is next to a pnc1/ and pnc0/ directory, each containing those project's specific prime variants, you can symlink them into the current project directory with a **make link**:
lab46:~/src/discrete/pnc2$ make link
‘./primebrute.c’ -> ‘../pnc0/primebrute.c’
‘./primebruteopt.c’ -> ‘../pnc0/primebruteopt.c’
‘./primeodds.c’ -> ‘../pnc1/primeodds.c’
‘./primesqrt.c’ -> ‘../pnc1/primesqrt.c’
‘./primesqrtodds.c’ -> ‘../pnc1/primesqrtodds.c’
‘./primesqrtopt.c’ -> ‘../pnc1/primesqrtopt.c’
‘./primesqrtoptodds.c’ -> ‘../pnc1/primesqrtoptodds.c’
lab46:~/src/discrete/pnc2$
And, of course, your basic compile and clean-up operations:
* **make**: compile everything
* **make debug**: compile everything with debug support
* **make clean**: remove all binaries
Just another "nice thing" we deserve.
NOTE: You do NOT want to do this on a populated pnc2 project directory-- it will overwrite files. Only do this on an empty directory.
=====Command-Line Arguments=====
To automate our comparisons, we will be making use of command-line arguments in our programs. As we have yet to really get into arrays, I will provide you same code that you can use that will allow you to utilize them for the purposes of this project.
====header files====
We don't need any extra header files to use command-line arguments, but we will need an additional header file to use the **atoi(3)** function, which we'll use to quickly turn the command-line parameter into an integer, and that header file is **stdlib.h**, so be sure to include it with the others:
#include
#include
====setting up main()====
To accept (or rather, to gain access) to arguments given to your program at runtime, we need to specify two parameters to the main() function. While the names don't matter, the types do.. I like the traditional **argc** and **argv** names, although it is also common to see them abbreviated as **ac** and **av**.
Please declare your main() function as follows:
int main(int argc, char **argv)
The arguments are accessible via the argv array, in the order they were specified:
* argv[0]: program invocation (path + program name)
* argv[1]: our maximum / upper bound
====Simple argument checks====
Although I'm not going to require extensive argument parsing or checking for this project, we should check to see if the minimal number of arguments has been provided:
if (argc < 2) // if less than 2 arguments have been provided
{
fprintf(stderr, "Not enough arguments!\n");
exit(1);
}
====Grab and convert max====
Finally, we need to put the argument representing the maximum value into a variable.
I'd recommend declaring a variable of type **int**.
We will use the **atoi(3)** function to quickly convert the command-line arguments into **int** values:
max = atoi(argv[1]);
And now we can proceed with the rest of our prime implementation.
=====Timing=====
Often times, when checking the efficiency of a solution, a good measurement (especially for comparison), is to time how long the processing takes.
In order to do that in our prime number programs, we are going to use C library functions that obtain the current time, and use it as a stopwatch: we'll grab the time just before starting processing, and then once more when done. The total time will then be the difference between the two (end_time - start_time).
We are going to use the **gettimeofday(2)** function to aid us in this, and to use it, we'll need to do the following:
====header file====
In order to use the **gettimeofday(2)** function in our program, we'll need to include the **sys/time.h** header file, so be sure to add it in with the existing ones:
#include
#include
#include
====timeval variables====
**gettimeofday(2)** uses a **struct timeval** data type, of which we'll need to declare two variables in our programs (one for storing the starting time, and the other for the ending time).
Please declare these with your other variables, up at the top of main() (but still WITHIN main()-- you do not need to declare global variables).
struct timeval time_start; // starting time
struct timeval time_end; // ending time
====Obtaining the time====
To use **gettimeofday(2)**, we merely place it at the point in our code we wish to take the time.
For our prime number programs, you'll want to grab the start time **AFTER** you've declared variables and processed arguments, but **JUST BEFORE** starting the driving loop doing the processing.
That call will look something like this:
gettimeofday(&time_start, 0);
The ending time should be taken immediately after all processing (and prime number output) is completed, and right before we display the timing information to STDERR:
gettimeofday(&time_end, 0);
====Displaying the runtime====
Once we having the starting and ending times, we can display this to STDERR. You'll want this line:
fprintf(stderr, "%10.6lf\n", time_end.tv_sec - time_start.tv_sec + ((time_end.tv_usec - time_start.tv_usec) / 1000000.0));
For clarity sake, that format specifier is "%10.6lf", where the "lf" is "long float", that is **NOT** a number one but a lowercase letter 'ell'.
And with that, we can compute an approximate run-time of our programs. The timing won't necessarily be accurate down to that level of precision, but it will be informative enough for our purposes.
=====Execution=====
Your program output should be as follows (given the specified range):
lab46:~/src/sysprog/pnc2$ ./primesieveoferat 90
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89
0.000088
lab46:~/src/sysprog/pnc2$
The execution of the programs is short and simple- grab the parameters, do the processing, produce the output, and then terminate.
====Performance changes====
You may notice a change with the sieves as compared to the other algorithms you've implemented with respect to performance- there will like be a lower bound of performance, ie you have to exceed a certain threshold before the algorithm truly enters its power band.
=====Check Results=====
If you'd like to compare your implementations, I rigged up a script called **primerun** which you can run.
In order to work, you **MUST** be in the directory where your previous prime programs are. What I do is symlink the sources or copy the binaries into my current directory (pnc2), so I both have access to everything, but everything is still categorized per project.
I'm not yet posting my primerun results; if you implement these correctly the results should speak for themselves.
lab46:~/src/discrete/pnc2$ primerun
COMING SOON
lab46:~/src/discrete/pnc2$
For evaluation, each test is run 4 times, and the resulting time is averaged. During development, I have it set to only run each test once.
If the runtime of a particular prime variant exceeds an upper threshold (likely to be set at 2 seconds), it will be omitted from further tests, and a series of dashes will instead appear in the output.
If you don't feel like waiting, simply hit **CTRL-c** and the script will terminate.
I also include a validation check- to ensure your prime programs are actually producing the correct list of prime numbers. If the check is successful, you will see "OK" displayed beneath in the appropriate column; if unsuccessful, you will see "MISMATCH".
I did implement a tweak for primeodds; it will no longer take so long to do the verification test at the end.
=====Submission=====
To successfully complete this project, the following criteria must be met:
* Code must compile cleanly (no warnings or errors)
* Output must be correct, and match the form given in the sample output above.
* Code must be nicely and consistently indented (you may use the **indent** tool)
* Code must utilize the algorithm(s) presented above.
* **primesieveoferat.c**
* **primesieveofsund.c**
* Code must be commented
* have a properly filled-out comment banner at the top
* be sure to include any compiling instructions
* have at least 20% of your program consist of **//**-style descriptive comments
* Output Formatting (including spacing) of program must conform to the provided output (see above).
* Track/version the source code in a repository
* Submit a copy of your source code to me using the **submit** tool.
To submit this program to me using the **submit** tool, run the following command at your lab46 prompt:
$ submit discrete pnc2 primesieveoferat.c primesieveofsund
Submitting discrete project "pnc2":
-> primesieveoferat.c(OK)
-> primesieveofsund.c(OK)
SUCCESSFULLY SUBMITTED
You should get some sort of confirmation indicating successful submission if all went according to plan. If not, check for typos and or locational mismatches.