Corning Community College CSCS1320 C/C++ Programming

~~TOC~~

======Project: OPTIMIZING ALGORITHMS - PRIME NUMBER CALCULATION (pnc1)======

=====Objective=====
To apply your skills in algorithmic optimization through the implementation of improved prime number calculating programs.

=====Algorithmic Complexity=====
A concept in the Computer Science curriculum is the notion of computational/algorithmic complexity.

Basically, a solution to a problem exists on a spectrum of efficiency (typically constrained by time vs. space): if optimizing for time, the code size tends to grow.

Additionally, if optimizing for time (specifically to reduce the amount of time taken), strategic approaches are taken to reduce unnecessary or redundant operations (while still achieving the desired end results).

This project will endeavor to introduce you to the notion that the algorithms and constructs you use in coding your solution can and do make a difference to the overall runtime of your code.

=====Optimizing the prime number calculation=====
We should be fairly familiar with the process of computing primes by now, which is an essential beginning step to accomplish before pursuing optimization.

Following are some optimizations I'd like you to implement (as separate programs) so we can analyze the differences in approaches, and how they influence runtimes.

====odds checking (primebrkodd)====
Some optimizations can be the result of sheer common sense observations. For instance, with the exception of 2, all primes are odd numbers.

So does it make sense to check an even number for primality? Hopefully you can see that no, it doesn't. And can we predict even numbers? Yes, we can: they occur every other number. Therefore, we can start our number checking at 3, and advance by 2 each time (3, 5, 7, 9, 11, etc.).

To make our output correct, we would simply display the "2" outright. We know it is prime, and will make that assumption with this program.
This program should be an optimization based on your **primebrk** program from pnc0.

====square root trick (primebrksrt)====
An optimization to the computation of prime numbers is the square root trick. Basically, if we've processed numbers up to the square root of the number we're testing, and none have proven to be evenly divisible, we can also assume primality and bail out. This works because factors come in pairs: any factor larger than the square root would be matched with one smaller than the square root, which we would already have found.

The C library has a **sqrt()** function available through including the **math.h** header file, and linking against the math library at compile time (add **-lm** to your gcc line).

To use **sqrt()**, we pass in the value we wish to obtain the square root of, and assign the result to an **int**:

  int x = 25;
  int y = 0;

  y = sqrt(x);
  // y should be 5 as a result

For instance, take the number 37: using the square root optimization, we find that the whole number square root of 37 is 6, so we only need to check 2-6:

  37 % 2 = 1 (2 is not a factor of 37)
  37 % 3 = 1 (3 is not a factor of 37)
  37 % 4 = 1 (4 is not a factor of 37)
  37 % 5 = 2 (5 is not a factor of 37)
  37 % 6 = 1 (6 is not a factor of 37)

Because none of these values evenly divides, we can give 37 a pass: **it is a prime**

This will dramatically improve the runtime, and offers a nice comparison against our brute force baseline.

NOTE: You will be reverting to checking all numbers (both even and odd) with this program.

This program should be an optimization based on your **primebrk** program from pnc0.

====sqrt() + odds (primebrkoddsrt)====
In the previous program we used **sqrt()** against all the values, even or odd. This program will eliminate the even values, checking only the odds.

This program should be an optimization of your **primebrksrt** program.

====sqrt()-less square root (primebrksrtopt)====
An optimization to the previous process, which used **sqrt()**, this variation will do the exact same thing, but without using the **sqrt()** function. It will approximate the square root.
We know that a whole number square root arises when a whole number factor is squared. But in addition, when we consider only the whole number portion of the square root, we start seeing series of values with the same whole square root value:

  lab46:~$ count=0; for ((i=2; i<152; i++)); do printf "[%3d] %2d " "${i}" `echo "sqrt($i)" | bc -q`; let count=count+1; if [ "${count}" -eq 10 ]; then echo; count=0; fi; done; echo
  [  2]  1 [  3]  1 [  4]  2 [  5]  2 [  6]  2 [  7]  2 [  8]  2 [  9]  3 [ 10]  3 [ 11]  3
  [ 12]  3 [ 13]  3 [ 14]  3 [ 15]  3 [ 16]  4 [ 17]  4 [ 18]  4 [ 19]  4 [ 20]  4 [ 21]  4
  [ 22]  4 [ 23]  4 [ 24]  4 [ 25]  5 [ 26]  5 [ 27]  5 [ 28]  5 [ 29]  5 [ 30]  5 [ 31]  5
  [ 32]  5 [ 33]  5 [ 34]  5 [ 35]  5 [ 36]  6 [ 37]  6 [ 38]  6 [ 39]  6 [ 40]  6 [ 41]  6
  [ 42]  6 [ 43]  6 [ 44]  6 [ 45]  6 [ 46]  6 [ 47]  6 [ 48]  6 [ 49]  7 [ 50]  7 [ 51]  7
  [ 52]  7 [ 53]  7 [ 54]  7 [ 55]  7 [ 56]  7 [ 57]  7 [ 58]  7 [ 59]  7 [ 60]  7 [ 61]  7
  [ 62]  7 [ 63]  7 [ 64]  8 [ 65]  8 [ 66]  8 [ 67]  8 [ 68]  8 [ 69]  8 [ 70]  8 [ 71]  8
  [ 72]  8 [ 73]  8 [ 74]  8 [ 75]  8 [ 76]  8 [ 77]  8 [ 78]  8 [ 79]  8 [ 80]  8 [ 81]  9
  [ 82]  9 [ 83]  9 [ 84]  9 [ 85]  9 [ 86]  9 [ 87]  9 [ 88]  9 [ 89]  9 [ 90]  9 [ 91]  9
  [ 92]  9 [ 93]  9 [ 94]  9 [ 95]  9 [ 96]  9 [ 97]  9 [ 98]  9 [ 99]  9 [100] 10 [101] 10
  [102] 10 [103] 10 [104] 10 [105] 10 [106] 10 [107] 10 [108] 10 [109] 10 [110] 10 [111] 10
  [112] 10 [113] 10 [114] 10 [115] 10 [116] 10 [117] 10 [118] 10 [119] 10 [120] 10 [121] 11
  [122] 11 [123] 11 [124] 11 [125] 11 [126] 11 [127] 11 [128] 11 [129] 11 [130] 11 [131] 11
  [132] 11 [133] 11 [134] 11 [135] 11 [136] 11 [137] 11 [138] 11 [139] 11 [140] 11 [141] 11
  [142] 11 [143] 11 [144] 12 [145] 12 [146] 12 [147] 12 [148] 12 [149] 12 [150] 12 [151] 12

Or, if perhaps we instead order by square root value:

  lab46:~$ oldsqrt=$(echo "sqrt(2)" | bc -q); for ((i=2; i<49; i++)); do newsqrt=$(echo "sqrt($i)" | bc -q); if [ "${newsqrt}" -ne "${oldsqrt}" ]; then echo; fi; printf "[%3d] %2d " "${i}" "${newsqrt}"; oldsqrt="${newsqrt}"; done; echo
  [  2]  1 [  3]  1
  [  4]  2 [  5]  2 [  6]  2 [  7]  2 [  8]  2
  [  9]  3 [ 10]  3 [ 11]  3 [ 12]  3 [ 13]  3 [ 14]  3 [ 15]  3
  [ 16]  4 [ 17]  4 [ 18]  4 [ 19]  4 [ 20]  4 [ 21]  4 [ 22]  4 [ 23]  4 [ 24]  4
  [ 25]  5 [ 26]  5 [ 27]  5 [ 28]  5 [ 29]  5 [ 30]  5 [ 31]  5 [ 32]  5 [ 33]  5 [ 34]  5 [ 35]  5
  [ 36]  6 [ 37]  6 [ 38]  6 [ 39]  6 [ 40]  6 [ 41]  6 [ 42]  6 [ 43]  6 [ 44]  6 [ 45]  6 [ 46]  6 [ 47]  6 [ 48]  6

We see that the whole number square root of 36 is 6, but so is that of 37, 38, 39... etc. up until we hit 49 (where the whole number square root increments to 7).

Therefore, if we were checking 42 to be prime, we'd only have to check up to 6.

We don't need a **sqrt()** function to tell us this; we can determine the approximate square root point ourselves: by squaring the current factor being tested, and so long as it hasn't exceeded the value we're checking, we know to continue.

There are some important lessons at play here:

  * approximation can be powerful
  * approximation can result in a simpler algorithm, improving runtime
  * **sqrt()** is more complex than you may be aware, not to mention it is in a function. By avoiding that function call, we eliminate some overhead, and that can make a difference in runtime performance.

NOTE: Again, for comparison's sake, check ALL numbers (even and odd) for this variant.

This program should be an optimization of your **primebrksrt** program.

Depending on how you implement this and the original sqrt() algorithms, this version may have a noticeable performance difference. If, on the other hand, you were really optimal in both implementations, the performance difference may be narrower (or even negligible).

====sqrt()-less odds (primebrkoddsrtopt)====
And, to round out our analysis, enhance the optimized sqrt variant to only check odd values.

This program should be an optimization of your **primebrksrtopt** program.
=====Program=====
It is your task to write some optimized prime number calculating programs:

  - **primebrkodd.c**: checking only odd values
  - **primebrksrt.c**: for your **sqrt()**-based implementation
  - **primebrkoddsrt.c**: **sqrt()**-based implementation only checking odds
  - **primebrksrtopt.c**: for your **sqrt()**-less square root approximated implementation
  - **primebrkoddsrtopt.c**: **sqrt()**-less square root only checking odds

Your program should:

  * obtain 2-4 parameters from the command-line (see **command-line arguments** section below).
    * check to make sure the user indeed supplied enough parameters, and exit with an error message if not.
    * argv[1]: maximum quantity of primes to calculate (your program should run until it discovers **that** many primes).
      * this value should be an integer value, greater than or equal to 0.
        * if argv[1] is 0, disable the quantity check, and rely on provided lower and upper bounds (argv[4] would be required in this case).
    * argv[2]: reserved for future compatibility; for now, assume it is **1**.
    * argv[3]: **conditionally optional** lower bound (starting value). Most of the time, this will probably be **2**, but should be a positive integer greater than or equal to 2. This defines where your program will start its prime quantity check from.
      * if omitted, assume a lower bound of **2**.
      * if you desire to specify an upper bound (argv[4]), you obviously MUST provide the lower bound argument under this scheme.
    * argv[4]: **conditionally optional** upper bound (ending value). If provided, this is the ending value you'd like to check to.
      * If doing a quantity run (argv[1] NOT 0), this value isn't necessary.
      * If doing a quantity run AND you specify an upper bound, whichever condition is achieved first dictates program termination. That is, upper bound could override quantity (if it is achieved before quantity), and quantity can override the upper bound (if it is achieved before reaching the specified upper bound).
    * for each argument: you should do a basic check to ensure the user complied with this specification, and exit with a unique error message (displayed to STDERR) otherwise:
      * for insufficient quantity of arguments, display: **PROGRAM_NAME: insufficient number of arguments!**
      * for invalid argv[1], display: **PROGRAM_NAME: invalid quantity!**
      * for invalid argv[2], display: **PROGRAM_NAME: invalid value!**
      * for invalid argv[3], display: **PROGRAM_NAME: invalid lower bound!**
        * if argv[3] is not needed, ignore (no error displayed nor forced exit, as it is acceptable defined behavior).
      * for invalid argv[4], display: **PROGRAM_NAME: invalid upper bound!**
        * if argv[4] is not needed, ignore (no error displayed nor forced exit, as it is acceptable defined behavior).
      * In these error messages, **PROGRAM_NAME** is the name of the program being run; this can be accessed as a string stored in **argv[0]**.
  * please take note of differences in run-time, contemplating the impact the various algorithms/optimizations have on performance.
  * start your stopwatch (see **timing** section below).
  * perform the correct algorithm against the input(s) given.
  * display to STDOUT (file pointer **stdout**) the prime numbers calculated.
  * stop your stopwatch. Calculate the time that has transpired (ending time minus starting time).
  * a further coding restriction: in each program, you are not allowed to use a given loop type (for(), while(), do-while()) more than once! If you find you need more than one loop in a program, they **CANNOT** be the same type (this is to get you exposed to them and thinking differently).
  * output the processing run-time to STDERR (file pointer **stderr**).
  * your output **MUST** conform to the example output in the **execution** section below. This is also a test to see how well you can implement to specifications.
Basically:

  * as primes are being displayed, they are space-separated (first prime hugs the left margin), and when all is said and done, a newline is issued.
  * the timing information will be displayed in accordance with code I will provide below (see the **timing** section).

=====Grabit Integration=====
For those familiar with the **grabit** tool on lab46, I have made some skeleton files and a custom **Makefile** available for this project.

To "grab" it:

  lab46:~/src/cprog$ grabit cprog pnc1
  make: Entering directory '/var/public/SEMESTER/CLASS/PROJECT'
  ‘/var/public/SEMESTER/CLASS/PROJECT/Makefile’ -> ‘/home/USERNAME/src/CLASS/PROJECT/Makefile’
  ‘/var/public/SEMESTER/CLASS/PROJECT/primebrkodd.c’ -> ‘/home/USERNAME/src/CLASS/PROJECT/primebrkodd.c’
  ‘/var/public/SEMESTER/CLASS/PROJECT/primebrkoddsrt.c’ -> ‘/home/USERNAME/src/CLASS/PROJECT/primebrkoddsrt.c’
  ‘/var/public/SEMESTER/CLASS/PROJECT/primebrkoddsrtopt.c’ -> ‘/home/USERNAME/src/CLASS/PROJECT/primebrkoddsrtopt.c’
  ‘/var/public/SEMESTER/CLASS/PROJECT/primebrksrt.c’ -> ‘/home/USERNAME/src/CLASS/PROJECT/primebrksrt.c’
  ‘/var/public/SEMESTER/CLASS/PROJECT/primebrksrtopt.c’ -> ‘/home/USERNAME/src/CLASS/PROJECT/primebrksrtopt.c’
  make: Leaving directory '/var/public/SEMESTER/CLASS/PROJECT'
  lab46:~/src/CLASS$ cd pnc1
  lab46:~/src/CLASS/pnc1$ ls
  Makefile  primebrkodd.c  primebrkoddsrt.c  primebrkoddsrtopt.c  primebrksrt.c  primebrksrtopt.c
  lab46:~/src/CLASS/pnc1$

NOTE: You do NOT want to do this on a populated pnc0 project directory; it will overwrite files.

And, of course, your basic compile and clean-up operations:

  * **make**: compile everything
  * **make debug**: compile everything with debug support
  * **make clean**: remove all binaries

Just another "nice thing" we deserve.
Furthermore, if your pnc1/ project directory is next to your pnc0/ directory, each containing that project's specific prime variants, you can symlink the pnc0 programs into the current project directory with a **make link**:

  lab46:~/src/CLASS/pnc1$ make link
  ‘./primebrute.c’ -> ‘../pnc0/primebrute.c’
  ‘./primebrk.c’ -> ‘../pnc0/primebrk.c’
  lab46:~/src/CLASS/pnc1$

=====Command-Line Arguments=====
To automate our comparisons, we will be making use of command-line arguments in our programs. As we have yet to really get into arrays, I will provide you some code that you can use that will allow you to utilize them for the purposes of this project.

====header files====
We don't need any extra header files to use command-line arguments, but we will need an additional header file to use the **atoi(3)** function, which we'll use to quickly turn the command-line parameter into an integer, and that header file is **stdlib.h**, so be sure to include it with the others:

  #include <stdio.h>
  #include <stdlib.h>

====setting up main()====
To gain access to the arguments given to your program at runtime, we need to specify two parameters to the main() function. While the names don't matter, the types do. I like the traditional **argc** and **argv** names, although it is also common to see them abbreviated as **ac** and **av**.

Please declare your main() function as follows:

  int main(int argc, char **argv)

The arguments are accessible via the argv array, in the order they were specified:

  * argv[0]: program invocation (path + program name)
  * argv[1]: our maximum / upper bound

====Simple argument checks====
Although I'm not going to require extensive argument parsing or checking for this project, we should check to see if the minimal number of arguments has been provided:

  if (argc < 2)  // if less than 2 arguments have been provided
  {
      fprintf(stderr, "Not enough arguments!\n");
      exit(1);
  }

====Grab and convert max====
Finally, we need to put the argument representing the maximum value into a variable.
I'd recommend declaring a variable of type **int**.

We will use the **atoi(3)** function to quickly convert the command-line arguments into **int** values:

  max = atoi(argv[1]);

And now we can proceed with the rest of our prime implementation.

=====Timing=====
Oftentimes, when checking the efficiency of a solution, a good measurement (especially for comparison) is to time how long the processing takes.

In order to do that in our prime number programs, we are going to use C library functions that obtain the current time, and use it as a stopwatch: we'll grab the time just before starting processing, and then once more when done. The total time will then be the difference between the two (end_time - start_time).

We are going to use the **gettimeofday(2)** function to aid us in this, and to use it, we'll need to do the following:

====header file====
In order to use the **gettimeofday(2)** function in our program, we'll need to include the **sys/time.h** header file, so be sure to add it in with the existing ones:

  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/time.h>

====timeval variables====
**gettimeofday(2)** uses a **struct timeval** data type, of which we'll need to declare two variables in our programs (one for storing the starting time, and the other for the ending time).

Please declare these with your other variables, up at the top of main() (but still WITHIN main(); you do not need to declare global variables).

  struct timeval time_start; // starting time
  struct timeval time_end;   // ending time

====Obtaining the time====
To use **gettimeofday(2)**, we merely place it at the point in our code we wish to take the time.

For our prime number programs, you'll want to grab the start time **AFTER** you've declared variables and processed arguments, but **JUST BEFORE** starting the driving loop doing the processing.
That call will look something like this:

  gettimeofday(&time_start, 0);

The ending time should be taken immediately after all processing (and prime number output) is completed, and right before we display the timing information to STDERR:

  gettimeofday(&time_end, 0);

====Displaying the runtime====
Once we have the starting and ending times, we can display the elapsed time to STDERR. You'll want this line:

  fprintf(stderr, "%10.6lf\n", time_end.tv_sec - time_start.tv_sec + ((time_end.tv_usec - time_start.tv_usec) / 1000000.0));

For clarity's sake, that format specifier is "%10.6lf", where the "lf" is "long float"; that is **NOT** a number one but a lowercase letter 'ell'.

And with that, we can compute an approximate run-time of our programs. The timing won't necessarily be accurate down to that level of precision, but it will be informative enough for our purposes.

=====Execution=====
Your program output should be as follows (given the specified range):

  lab46:~/src/cprog/pnc1$ ./primebrkodd 32 1
  2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109 113 127 131
    0.000133
  lab46:~/src/cprog/pnc1$

The execution of the programs is short and simple: grab the parameters, do the processing, produce the output, and then terminate.

=====Check Results=====
If you'd like to compare your implementations, I rigged up a script called **primerun** which you can run.

In order to work, you **MUST** be in the directory where your **primesqrt**, **primesqrtopt** and **primemap** binaries reside, and they must be named as such. You'll also want to copy in your **primebrute** and **primebrk** binaries to truly get the full picture.
For instance (running on my implementations of the programs):

  lab46:~/src/cprog/pnc1$ primerun
  ====================================================================================================
      qty        brute          brk       brkodd       brksrt    brkoddsrt    brksrtopt brkoddsrtopt
  ====================================================================================================
       32     0.000156     0.000092     0.000113     0.000115     0.000082     0.000088     0.000118
       64     0.000605     0.000195     0.000163     0.000105     0.000093     0.000100     0.000094
      128     0.002818     0.000531     0.000314     0.000170     0.000131     0.000148     0.000118
      256     0.012317     0.001956     0.001109     0.000322     0.000244     0.000283     0.000200
      512     0.057478     0.007690     0.003925     0.000766     0.000459     0.000634     0.000427
     1024     0.269012     0.031934     0.016148     0.001862     0.001086     0.001555     0.000956
     2048     1.257000     0.136199     0.068365     0.004814     0.002638     0.004175     0.002360
     4096     5.879843     0.586778     0.293167     0.012594     0.006764     0.011226     0.006157
     8192   ----------     3.064616     1.526329     0.036844     0.019395     0.034097     0.018203
    16384   ----------   ----------     6.965932     0.111411     0.058142     0.105525     0.055794
    32768   ----------   ----------   ----------     0.308669     0.160172     0.296108     0.154995
    65536   ----------   ----------   ----------     0.836940     0.430479     0.810691     0.420776
   131072   ----------   ----------   ----------     2.275267     1.161227     2.223852     1.141030
   262144   ----------   ----------   ----------   ----------     3.162367   ----------     3.121183
   524288   ----------   ----------   ----------   ----------   ----------   ----------   ----------
  ====================================================================================================
  verify:           OK           OK           OK           OK           OK           OK           OK
  ====================================================================================================
  lab46:~/src/cprog/pnc1$

If the runtime of a particular prime variant exceeds an upper threshold (likely to be set at 2 seconds), it will be omitted from further tests, and a series of dashes will instead appear in the output.

If you don't feel like waiting, simply hit **CTRL-c** and the script will terminate.
I also include a validation check to ensure your prime programs are actually producing the correct list of prime numbers. If the check is successful, you will see "OK" displayed beneath in the appropriate column; if unsuccessful, you will see "MISMATCH".

Analyze the times you see... do they make sense, especially when comparing the algorithm used and the quantity being processed? These are related to some very important core Computer Science considerations we need to be increasingly mindful of as we design our programs and implement our solutions. Algorithmic complexity and algorithmic efficiency will be common themes in all we do.

=====Submission=====
To successfully complete this project, the following criteria must be met:

  * Code must compile cleanly (no warnings or errors)
  * Output must be correct, and match the form given in the sample output above.
  * Code must be nicely and consistently indented (you may use the **indent** tool)
  * Code must utilize the algorithm(s) presented above.
    * **primebrkodd.c**
    * **primebrksrt.c**
    * **primebrkoddsrt.c**
    * **primebrksrtopt.c**
    * **primebrkoddsrtopt.c**
  * Code must be commented
    * have a properly filled-out comment banner at the top
      * be sure to include any compiling instructions
    * have at least 20% of your program consist of **//**-style descriptive comments
  * Output Formatting (including spacing) of program must conform to the provided output (see above).
  * Track/version the source code in a repository
  * Submit a copy of your source code to me using the **submit** tool.
To submit this program to me using the **submit** tool, run the following command at your lab46 prompt:

  $ submit cprog pnc1 primebrkodd.c primebrksrt.c primebrkoddsrt.c primebrksrtopt.c primebrkoddsrtopt.c
  Submitting cprog project "pnc1":
      -> primebrkodd.c(OK)
      -> primebrksrt.c(OK)
      -> primebrkoddsrt.c(OK)
      -> primebrksrtopt.c(OK)
      -> primebrkoddsrtopt.c(OK)

  SUCCESSFULLY SUBMITTED

You should get some sort of confirmation indicating successful submission if all went according to plan. If not, check for typos and/or locational mismatches.

What I will be looking for:

  78:pnc1:final tally of results (78/78)
  *:pnc1:submit all programs correctly perform argument checking [3/3]
  *:pnc1:primebrkodd.c no negative compiler messages [2/2]
  *:pnc1:primebrkodd.c implements only specified algorithm [4/4]
  *:pnc1:primebrkodd.c adequate indentation and comments [3/3]
  *:pnc1:primebrkodd.c output conforms to specifications [3/3]
  *:pnc1:primebrkodd.c primerun runtime tests succeed [3/3]
  *:pnc1:primebrksrt.c no negative compiler messages [2/2]
  *:pnc1:primebrksrt.c implements only specified algorithm [4/4]
  *:pnc1:primebrksrt.c adequate indentation and comments [3/3]
  *:pnc1:primebrksrt.c output conforms to specifications [3/3]
  *:pnc1:primebrksrt.c primerun runtime tests succeed [3/3]
  *:pnc1:primebrkoddsrt.c no negative compiler messages [2/2]
  *:pnc1:primebrkoddsrt.c implements only specified algorithm [4/4]
  *:pnc1:primebrkoddsrt.c adequate indentation and comments [3/3]
  *:pnc1:primebrkoddsrt.c output conforms to specifications [3/3]
  *:pnc1:primebrkoddsrt.c primerun runtime tests succeed [3/3]
  *:pnc1:primebrksrtopt.c no negative compiler messages [2/2]
  *:pnc1:primebrksrtopt.c implements only specified algorithm [4/4]
  *:pnc1:primebrksrtopt.c adequate indentation and comments [3/3]
  *:pnc1:primebrksrtopt.c output conforms to specifications [3/3]
  *:pnc1:primebrksrtopt.c primerun runtime tests succeed [3/3]
  *:pnc1:primebrkoddsrtopt.c no negative compiler messages [2/2]
  *:pnc1:primebrkoddsrtopt.c implements only specified algorithm [4/4]
  *:pnc1:primebrkoddsrtopt.c adequate indentation and comments [3/3]
  *:pnc1:primebrkoddsrtopt.c output conforms to specifications [3/3]
  *:pnc1:primebrkoddsrtopt.c primerun runtime tests succeed [3/3]