Corning Community College
CSCS1320 C/C++ Programming
To apply your skills in the implementation of prime number calculating algorithms.
In mathematics, a prime number is a whole number greater than 1 that is evenly divisible only by 1 and itself; it has no other factors. Numbers greater than 1 that do have other factors are known as composite numbers.
The number 6 is a composite value, as in addition to 1 and 6, it also has the factors of 2 and 3.
The number 17 is a prime number, as no numbers other than 1 and 17 divide it evenly.
As of yet, there is no quick and direct way of determining the primality of a given number. Instead, we must perform a series of tests to determine if it fails primality (typically by proving it is composite).
This process incurs a considerable amount of processing overhead, so much so that increasingly large values take ever-expanding amounts of time. Often, approaches to prime number calculation involve various algorithms, which offer various benefits (less time) and drawbacks (more complex code).
Your task for this project is to implement a prime number program using the straightforward, unoptimized brute-force algorithm.
The brute force approach is the simplest to implement (and likely also the worst-performing). We will use it as our baseline (it is nice to have something to compare against).
To perform it, we simply attempt to evenly divide the number in question by each of the values from 2 up to one less than that number. If any one of them divides evenly, the number is NOT prime, but instead a composite value.
Checking the remainder of a division indicates whether or not a division was clean (having 0 remainder indicates such a state).
For example, the number 11:
11 % 2 = 1 (2 is not a factor of 11)
11 % 3 = 2 (3 is not a factor of 11)
11 % 4 = 3 (4 is not a factor of 11)
11 % 5 = 1 (5 is not a factor of 11)
11 % 6 = 5 (6 is not a factor of 11)
11 % 7 = 4 (7 is not a factor of 11)
11 % 8 = 3 (8 is not a factor of 11)
11 % 9 = 2 (9 is not a factor of 11)
11 % 10 = 1 (10 is not a factor of 11)
Because none of the values 2-10 evenly divided into 11, we can say it passed the test: 11 is a prime number
On the other hand, take 119:
119 % 2 = 1 (2 is not a factor of 119)
119 % 3 = 2 (3 is not a factor of 119)
119 % 4 = 3 (4 is not a factor of 119)
119 % 5 = 4 (5 is not a factor of 119)
119 % 6 = 5 (6 is not a factor of 119)
119 % 7 = 0 (7 is a factor of 119)
Because 7 evenly divided into 119, it failed the test: 119 is not a prime, but instead a composite number.
Even though you have identified the number as a composite, you MUST CONTINUE evaluating the remainder of the values (up to 119-1). It might seem pointless (and it is for a production program), but I want you to see the performance implications this creates.
Some things to keep in mind on your implementation:
The optimized version of brute force will make but one algorithmic change, and that takes place at the moment of identifying a number as composite. So, if we had our 119 example above, and discovered that 7 was a factor:
There is no further need to check the remaining values, as once we have proven the non-primality of a number, the state is set: it is composite. So be sure to use a break statement to terminate the computation loop (will also be a nice boost to runtime).
Make no other optimizations: this first project is to set up some important baseline values that we can use for algorithmic comparison later on.
It is your task to write a brute-force prime number calculating program:
Your program should:
For those familiar with the grabit tool on lab46, I have made some skeleton files and a custom Makefile available for this project.
To “grab” it:
lab46:~/src/cprog$ grabit cprog pnc0
make: Entering directory '/var/public/SEMESTER/CLASS/PROJECT'
‘/var/public/SEMESTER/CLASS/PROJECT/Makefile’ -> ‘/home/USERNAME/src/CLASS/PROJECT/Makefile’
‘/var/public/SEMESTER/CLASS/PROJECT/primebrute.c’ -> ‘/home/USERNAME/src/CLASS/PROJECT/primebrute.c’
‘/var/public/SEMESTER/CLASS/PROJECT/primebrk.c’ -> ‘/home/USERNAME/src/CLASS/PROJECT/primebrk.c’
make: Leaving directory '/var/public/SEMESTER/CLASS/PROJECT'
lab46:~/src/CLASS$ cd pnc0
lab46:~/src/CLASS/pnc0$ ls
Makefile primebrute.c primebrk.c
lab46:~/src/CLASS/pnc0$
NOTE: You do NOT want to do this on a populated pnc0 project directory– it will overwrite files.
And, of course, your basic compile and clean-up operations:
Just another “nice thing” we deserve.
To automate our comparisons, we will be making use of command-line arguments in our programs. As we have yet to really get into arrays, I will provide you some code that you can use that will allow you to utilize them for the purposes of this project.
We don't need any extra header files to use command-line arguments, but we will need an additional header file to use the atoi(3) function, which we'll use to quickly turn the command-line parameter into an integer, and that header file is stdlib.h, so be sure to include it with the others:
#include <stdio.h>
#include <stdlib.h>
To gain access to arguments given to your program at runtime, we need to specify two parameters to the main() function. While the names don't matter, the types do. I like the traditional argc and argv names, although it is also common to see them abbreviated as ac and av.
Please declare your main() function as follows:
int main(int argc, char **argv)
The arguments are accessible via the argv array, in the order they were specified:
While there are a number of checks we should perform, one of the first should be a check to see if the minimal number of arguments has been provided:
if (argc < 3)  // if fewer than 3 arguments (program_name + quantity + argv[2] == 3) have been provided
{
    fprintf(stderr, "%s: insufficient number of arguments!\n", argv[0]);
    exit(1);
}
Since argv[3] (lower bound) and argv[4] (upper bound) are conditionally optional, it wouldn't make sense to check for them in the overall count. But we can and do still want to strategically utilize argc to determine if an argv[3] or argv[4] is present.
Finally, we need to put the argument representing the maximum value into a variable.
I'd recommend declaring a variable of type int.
We will use the atoi(3) function to quickly convert the command-line arguments into int values:
max = atoi(argv[1]);
And now we can proceed with the rest of our prime implementation.
Often, when checking the efficiency of a solution, a good measurement (especially for comparison) is to time how long the processing takes.
In order to do that in our prime number programs, we are going to use C library functions that obtain the current time, and use it as a stopwatch: we'll grab the time just before starting processing, and then once more when done. The total time will then be the difference between the two (end_time - start_time).
We are going to use the gettimeofday(2) function to aid us in this, and to use it, we'll need to do the following:
In order to use the gettimeofday(2) function in our program, we'll need to include the sys/time.h header file, so be sure to add it in with the existing ones:
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
gettimeofday(2) uses a struct timeval data type, of which we'll need to declare two variables in our programs (one for storing the starting time, and the other for the ending time).
Please declare these with your other variables, up at the top of main() (but still WITHIN main()– you do not need to declare global variables).
struct timeval time_start; // starting time
struct timeval time_end;   // ending time
To use gettimeofday(2), we merely place it at the point in our code we wish to take the time.
For our prime number programs, you'll want to grab the start time AFTER you've declared variables and processed arguments, but JUST BEFORE starting the driving loop doing the processing.
That call will look something like this:
gettimeofday(&time_start, 0);
The ending time should be taken immediately after all processing (and prime number output) is completed, and right before we display the timing information to STDERR:
gettimeofday(&time_end, 0);
Once we have the starting and ending times, we can display this to STDERR. You'll want this line:
fprintf(stderr, "%10.6lf\n", time_end.tv_sec-time_start.tv_sec+((time_end.tv_usec-time_start.tv_usec)/1000000.0));
For clarity's sake, that format specifier is “%10.6lf”, where the “lf” is “long float” (that is, a double); that is NOT a number 'one' but a lowercase letter 'ell'.
And with that, we can compute an approximate run-time of our programs. The timing won't necessarily be accurate down to that level of precision, but it will be informative enough for our purposes.
A loop is basically instructing the computer to repeat a section, or block, of code a given number of times (it can be based on a fixed value, as in repeat this 4 times, or be based on a conditional value, as in keep repeating as long as (or while) this value is not 4).
Loops enable us to simplify our code, allowing us to write a one-size-fits-all algorithm (provided the algorithm itself can appropriately scale!), where the computer merely repeats the instructions we gave. We only have to write them once, but the computer can do that task any number of times.
Loops can be initially difficult to comprehend because unlike other programmatic actions, they are not single-state in nature– loops are multi-state. What this means is that in order to correctly “see” or visualize a loop, you must analyze what is going on with EACH iteration or cycle, watching the values/algorithm/process slowly march from its initial state to its resultant state. Think of it as climbing a set of stairs… yes, we can describe that action succinctly as “climbing a set of stairs”, but there are multiple “steps” (heh, heh) involved: we place our foot, adjust our balance– left foot, right foot, from one step, to the next, to the next, allowing us to progress from the bottom step to the top step… that process of scaling a stairway is the same as iterating through a loop– but what is important as we implement is what needs to happen each step along the way.
With that said, it is important to be able to focus on the process of the individual steps being taken. What is involved in taking a step? What constitutes a basic unit of stairway traversal? If that unit can be easily repeated for the next and the next (and in fact, the rest of the) steps, we've described the core process of the loop, or what will be iterated a given number of times.
In C and C-syntax influenced languages (C++, Java, PHP, among others), we typically have 3 types of loops: for(), while(), and do-while.
A for() loop is the most syntactically unique of the loops, so care must be taken to use the proper syntax.
With any loop, we need (at least one) looping variable, which the loop will use to analyze whether or not we've met our looping destination, or to perform another iteration.
A for loop typically also has a defined starting point, a “keep-looping-while” condition, and a stepping equation.
Here's a sample for() loop, in C, which will display the squares of each number, starting at 0, and stepping one at a time, for 8 total iterations:
int i = 0;

for (i = 0; i < 8; i++)
{
    fprintf(stdout, "loop #%d ... %d\n", (i+1), (i*i));
}
The output of this code, with the help of our loop should be:
loop #1 ... 0
loop #2 ... 1
loop #3 ... 4
loop #4 ... 9
loop #5 ... 16
loop #6 ... 25
loop #7 ... 36
loop #8 ... 49
Note how we can use our looping variable (i) within mathematical expressions to drive a process along… loops can be of enormous help in this way.
And again, we shouldn't look at this as one step– we need to see there are 8 discrete, distinct steps happening here (when i is 0, when i is 1, when i is 2, … up until (and including) when i is 7).
The loop exits once i reaches a value of 8, because our loop determinant condition states as long as i is less than 8, continue to loop. Once i becomes 8, our looping condition is no longer true, and the loop will no longer iterate.
The stepping (that third) field is a mathematical expression indicating how we wish for i to progress from its starting state (of being equal to 0) to the point where the loop's iterating condition fails (i no longer being less than 8).
i++ is a shortcut we can use in C; the longhand (and likely more familiar) equivalent is: i = i + 1
A while() loop isn't as specific about starting and stepping values, really only caring about what condition needs to hold in order to continue looping (keep looping while this condition is true).
In actuality, anything we use a for loop for can be expressed as a while loop– we merely have to ensure we provide the necessary loop variables and progressions within the loop.
That same loop above, expressed as a while loop, could look like:
int i = 0;

while (i < 8)
{
    fprintf(stdout, "loop #%d ... %d\n", (i+1), (i*i));
    i = i + 1;   // I could have used "i++;" here
}
The output of this code should be identical, even though we used a different loop to accomplish the task (try them both out and confirm!)
while() loops, like for() loops, will run 0 or more times; if the conditions enabling the loop to occur are not initially met, they will not run. If met, they will continue to iterate until their looping conditions are no longer met.
It is possible to introduce a certain kind of logical error into your programs using loops– what is known as an “infinite loop”; this is basically where you erroneously provide incorrect conditions to the particular loop used, allowing it to start running, but never arriving at its conclusion, thereby iterating forever.
Another common logical error that loops will allow us to encounter is the “off by one” error, where the conditions we pose to the loop are incorrect, and the loop runs one iteration more or fewer than we had intended. Again, proper debugging of our code will resolve this situation.
The third commonly recognized looping structure in C, the do-while loop, is nearly identical to the while() (and therefore also the for()) loop; it differs only in where it checks the looping condition: where for() and while() are “top-driven” loops (i.e. the test for loop continuance occurs at the top of the loop, before running the code in the loop body), the do-while is a “bottom-driven” loop (i.e. the test for loop continuance occurs at the bottom of the loop).
The placement of this test determines the minimal number of times a loop can run.
In the case of the for()/while() loops, because the test is at the top, if the looping conditions are not met, the loop may not run at all. It is for this reason that these loops can run “0 or more times”.
For the do-while loop, because the test occurs at the bottom, the body of the loop (one full iteration) is run before the test is encountered. So even if the conditions for looping are not met, a do-while will run “1 or more times”.
That may seem like a minor, and possibly annoying, difference, but in nuanced algorithm design, such distinctions can drastically change the layout of your code, potentially being the difference between beautifully elegant-looking solutions and those which appear slightly more hackish. They can BOTH be used to solve the same problems; it is merely the nature of how we choose to express the solution that should make one preferable over the other in any given moment.
I encourage you to intentionally try your hand at taking your completed programs and implementing other versions that utilize the other types of loops you haven't utilized. This way, you can get more familiar with how to structure your solutions and express them. You will find you tend to think in a certain way (from experience, we seem to get in the habit of thinking “top-driven”, and as we're unsure, we tend to exert far more of a need to control the situation, so we tend to want to use for loops for everything– but practicing the others will free your mind to craft more elegant and efficient solutions; but only if you take the time to play and explore these possibilities).
So, expressing that same program in the form of a do-while loop (note the changes from the while):
int i = 0;

do
{
    fprintf(stdout, "loop #%d ... %d\n", (i+1), (i*i));
    i = i + 1;   // again, we could just as easily use "i++;" here
} while(i < 8);
In this case, the 0 or more vs. 1 or more minimal iterations wasn't important; the difference is purely syntactical.
With the do-while loop, we start the loop with a do statement.
Also, the do-while is the only one of our loops which NEEDS a terminating semi-colon (;). Please take note of this.
Your program output should be as follows (given the specified quantity):
lab46:~/src/cprog/pnc0$ ./primebrute 24 1
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89
0.000165
lab46:~/src/cprog/pnc0$
The execution of the programs is short and simple: grab the parameters, do the processing, produce the output, and then terminate.
Here's an example that should generate an error upon running (based on project specifications):
lab46:~/src/cprog/pnc0$ ./primebrute 32 1 0
./primebrute: invalid lower bound
lab46:~/src/cprog/pnc0$
In this case, the program logic should have detected an invalid condition and bailed out before prime computations even began. No timing data is displayed, because exiting should occur even prior to that.
As indicated above, there is potential interplay with an active quantity and upper bound values. Here is an example where upper bound overrides quantity, resulting in an early termination (ie upper bound is hit before quantity):
lab46:~/src/cprog/pnc0$ ./primebrute 128 1 7 23
7 11 13 17 19 23
0.000125
lab46:~/src/cprog/pnc0$
Also for fun, I set the lower bound to 7, so you'll see computation starts at 7 (vs. the usual 2).
If you'd like to compare your implementations, I rigged up a script called primerun which you can run.
In order for it to work, you MUST be in the directory where your primebrute and primebrk binaries reside, and they must be named as such.
For instance (running on my implementation of primebrute and primebrk):
lab46:~/src/cprog/pnc0$ primerun
===================================
    qty       brute         brk
===================================
     32    0.000166    0.000120
     64    0.000576    0.000201
    128    0.002860    0.000532
    256    0.012316    0.001969
    512    0.057345    0.007655
   1024    0.268914    0.031949
   2048    1.256834    0.136228
   4096    5.880069    0.586694
   8192  ----------    3.065084
  16384  ----------  ----------
===================================
 verify:         OK          OK
===================================
lab46:~/src/cprog/pnc0$
If the runtime of a particular prime variant exceeds an upper runtime threshold (likely to be set at 2 seconds), it will be omitted from further tests, and a series of dashes will instead appear in the output.
If you don't feel like waiting, simply hit CTRL-c (maybe a couple of times) and the script will terminate.
I also include a validation check, to ensure your prime programs are actually producing the correct list of prime numbers. If the check is successful, you will see “OK” displayed beneath in the appropriate column; if unsuccessful, you will see “MISMATCH”.
Analyze the times you see… do they make sense, especially when comparing the algorithm used and the quantity being processed? These are related to some very important core Computer Science considerations we need to be increasingly mindful of as we design our programs and implement our solutions. Algorithmic complexity and algorithmic efficiency will be common themes in all we do.
To successfully complete this project, the following criteria must be met:
To submit this program to me using the submit tool, run the following command at your lab46 prompt:
$ submit cprog pnc0 primebrute.c primebrk.c
Submitting cprog project "pnc0":
    -> primebrute.c(OK)
    -> primebrk.c(OK)

SUCCESSFULLY SUBMITTED
You should get some sort of confirmation indicating successful submission if all went according to plan. If not, check for typos and/or locational mismatches.
What I will be looking for:
52:pnc0:final tally of results (52/52)
*:pnc0:primebrute.c performs proper argument checking [4/4]
*:pnc0:primebrute.c no negative compiler messages [2/2]
*:pnc0:primebrute.c implements only specified algorithm [6/6]
*:pnc0:primebrute.c adequate indentation and comments [4/4]
*:pnc0:primebrute.c output conforms to specifications [4/4]
*:pnc0:primebrute.c primerun runtime tests succeed [6/6]
*:pnc0:primebrk.c performs proper argument checking [4/4]
*:pnc0:primebrk.c no negative compiler messages [2/2]
*:pnc0:primebrk.c implements only specified algorithm [6/6]
*:pnc0:primebrk.c adequate indentation and comments [4/4]
*:pnc0:primebrk.c output conforms to specifications [4/4]
*:pnc0:primebrk.c primerun runtime tests succeed [6/6]