Corning Community College
CSCS2650 Computer Organization
Start exploring algorithm/implementation comparison and optimization with respect to various approaches of computing prime numbers.
Implement two separate, independent programs, one in Vircon32 C and the other in Vircon32 assembly, that:
* display that N/upper bound
The following are reference screenshots of what your implementations should approximate.
You will want to go here to edit and fill in the various sections of the document:
The naive implementation is our baseline: implement with no awareness of potential tweaks, improvements, or optimizations. This should be the worst performing when compared to any optimization.
    START TIMEKEEPING
    NUMBER: FROM 2 THROUGH UPPERBOUND:
        ISPRIME <- YES
        FACTOR: FROM 2 THROUGH NUMBER-1:
            SHOULD FACTOR DIVIDE EVENLY INTO NUMBER:
                ISPRIME <- NO
        PROCEED TO NEXT FACTOR
        SHOULD ISPRIME STILL BE YES:
            INCREMENT OUR PRIME TALLY
    PROCEED TO NEXT NUMBER
    STOP TIMEKEEPING
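As a rough illustration, the naive pseudocode can be sketched in C as below. This is a minimal sketch in standard desktop C rather than Vircon32 C, with the timekeeping left out. The function name count_primes_naive is a hypothetical helper, not part of the project specification.

```c
#include <stdbool.h>

/* Hypothetical helper (name is an assumption): counts the primes in
   [2, upper_bound] the naive way, testing every factor from 2 through
   number-1 with no early exit. */
int count_primes_naive(int upper_bound)
{
    int tally = 0;

    for (int number = 2; number <= upper_bound; number++) {
        bool is_prime = true;

        for (int factor = 2; factor <= number - 1; factor++) {
            if (number % factor == 0) {
                is_prime = false;   /* baseline: keep looping, no break */
            }
        }

        if (is_prime) {
            tally++;
        }
    }

    return tally;
}
```

Calling count_primes_naive(1024) should give 172, which is a handy value to check your own tally against.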
From the previous dap projects, we learned a lot about stacks and creating subroutines. With that newfound knowledge, we can create a couple of subroutines to help us with this and later pnc projects. Taking a step back and looking at the output, we see that we are displaying strings, integers, and floats. That gives us three subroutines that we can create now and reuse later. Further, we can also create a subroutine for the brute-force logic of finding prime numbers. I personally started by creating these as separate files but found later on that having them in one big file is just easier to work with.
For the print subroutine, we can actually work from two old programs. From any of the dap code that we wrote, grab the opening and closing logic of the subroutine that handles pushing and popping registers, along with the code that gets our X, Y, and “Thing” from the stack. Now for the brains of our program: what actually happens inside the subroutine was created in one of our very first classes, where we wrote a little program that printed Hello World in ASM. We can use that!
Do note that we cannot simply copy and paste all of this and have it suddenly work. Some adjustments need to be made, but by this point you are 90% of the way there.
When printing the seconds that the brute force takes, there are generally two approaches you may want to consider: using ints and integer arithmetic, or using a float with some fancy math. But before that, how do you get the seconds? It's relatively easy in assembly: all you have to do is record the frame count before and after the run. One example of this is something like below.
    in R8, TIM_FrameCounter
    mov startFrames, R8
    call _brute_force
    in R8, TIM_FrameCounter
    mov endFrames, R8

    ;Later in the code
    mov R8, startFrames
    mov R9, endFrames
    isub R9, R8      ; subtract the start frames from end
    CIF R9           ;changing data type to float
    fdiv R9, 60      ;dividing by 60 to get the amount of seconds
    mov seconds, R9  ;store the seconds for use later in the code
int: integer
Float
There is more than one way to print the runtime with three decimal places using only integers. One way is to subtract the initial frame count (before the brute force) from the final frame count (after the brute force) to get the total frames elapsed, multiply that by 1000, then divide by 60 (since there are 60 frames per second). The result is the time in milliseconds. Using your print/itoa subroutine, you can print the whole seconds (the milliseconds divided by 1000), then a decimal point, then the remaining three digits (the milliseconds modulo 1000, padded with leading zeros if needed). There might be a simpler way that makes more sense to you (such as using floats), but this is what I did.
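The integer approach can be sketched in C roughly as follows. This is an illustration in standard desktop C, so snprintf stands in for whatever print/itoa subroutine you wrote, and format_elapsed is a hypothetical name. It assumes 60 frames per second, as stated above.

```c
#include <stdio.h>

/* Hypothetical helper (name is an assumption): writes the elapsed time as
   "S.mmm" into out, using only integer arithmetic.
   Assumes 60 frames per second. */
void format_elapsed(int start_frames, int end_frames, char *out, size_t out_size)
{
    int frames = end_frames - start_frames;
    int millis = frames * 1000 / 60;   /* multiply first so precision is kept */
    int whole  = millis / 1000;        /* digits before the decimal point */
    int frac   = millis % 1000;        /* digits after it, 0 through 999 */

    /* %03d pads the fraction with leading zeros, so 50 prints as "050" */
    snprintf(out, out_size, "%d.%03d", whole, frac);
}
```

For example, 63 elapsed frames is 1050 milliseconds, which should come out as 1.050. Without the zero padding it would wrongly print as 1.50.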
Break on Composite is one of the first, and simplest, improvements that can be made to a prime number generator.
In the Brute Force algorithm, every number between 2 and the current number is checked to see if it is a factor. Even after a factor is found, it keeps going through the rest of the possible factors.
With “Break on Composite”, once a factor is found the number is known to be composite, so there is no need to check any further and we can break out of the loop right away.
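A minimal sketch of the idea in standard C (not Vircon32 C) is below. The only change from the brute force is the break statement. The function name count_primes_break is a hypothetical helper chosen for illustration.

```c
#include <stdbool.h>

/* Hypothetical helper (name is an assumption): same loops as the brute
   force, but the factor loop stops at the first factor found. */
int count_primes_break(int upper_bound)
{
    int tally = 0;

    for (int number = 2; number <= upper_bound; number++) {
        bool is_prime = true;

        for (int factor = 2; factor <= number - 1; factor++) {
            if (number % factor == 0) {
                is_prime = false;
                break;              /* composite: no need to keep checking */
            }
        }

        if (is_prime) {
            tally++;
        }
    }

    return tally;
}
```

Note that the break only helps on composite numbers. A prime still runs the factor loop all the way to number-1.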
Optimizing prime number generation by considering only odd numbers can significantly enhance the efficiency of the algorithm. Since even numbers (except 2) are inherently not prime, eliminating them from consideration reduces the number of iterations required for testing primality. By starting with 3 as the first odd prime number, subsequent odd numbers are generated and tested for primality. This optimization effectively cuts the search space in half, as it eliminates the need to check even numbers, leading to a faster algorithm.
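One possible way to sketch the odds-only approach in standard C is shown below. It makes two assumptions that are not spelled out above: 2 is counted up front since it is never tested, and the factor loop also steps through odd values only, since an even factor can never divide an odd number. The name count_primes_odds is hypothetical.

```c
#include <stdbool.h>

/* Hypothetical helper (name is an assumption): counts primes in
   [2, upper_bound], testing only odd numbers against odd factors. */
int count_primes_odds(int upper_bound)
{
    if (upper_bound < 2) {
        return 0;
    }

    int tally = 1;                  /* 2 is prime and is never tested below */

    for (int number = 3; number <= upper_bound; number += 2) {
        bool is_prime = true;

        /* odd numbers have no even factors, so step by 2 here as well */
        for (int factor = 3; factor <= number - 1; factor += 2) {
            if (number % factor == 0) {
                is_prime = false;
            }
        }

        if (is_prime) {
            tally++;
        }
    }

    return tally;
}
```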
By recognizing that factors of a number occur in pairs, one of which is less than or equal to the square root of the number, we can limit our search space. This method significantly reduces the number of iterations required to test primality, as we only need to examine potential factors up to the square root of the number in question. Consequently, computational resources are conserved, resulting in a faster and more streamlined algorithm.
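To see the square root limit on its own, here is a minimal sketch in standard C. It assumes the comparison factor * factor <= number as a stand-in for computing the square root directly, which avoids floating point entirely. The name count_primes_sqrt is hypothetical.

```c
#include <stdbool.h>

/* Hypothetical helper (name is an assumption): counts primes in
   [2, upper_bound], only testing factors up to the square root. */
int count_primes_sqrt(int upper_bound)
{
    int tally = 0;

    for (int number = 2; number <= upper_bound; number++) {
        bool is_prime = true;

        /* factor * factor <= number is the same test as
           factor <= sqrt(number), done with integers only */
        for (int factor = 2; factor * factor <= number; factor++) {
            if (number % factor == 0) {
                is_prime = false;
            }
        }

        if (is_prime) {
            tally++;
        }
    }

    return tally;
}
```

For number = 37 the factor loop only tries 2 through 6, instead of 2 through 36 as in the brute force.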
While sqrt might seem easy at first, it is fairly involved if you don't fully understand what is going on. To best understand how to take a square root, let's look at the function generated in the .asm you get from compiling your C program.
    __function_sqrt:
        push BP
        mov BP, SP
        push R1
        mov R0, [BP+2]
        mov R1, 0.5
        pow R0, R1
        pop R1
As we can see, sqrt actually makes use of the pow instruction. Also note that no number is fed directly into pow: you can only give it registers, so you'll need to make use of a temp register here. At this point, you are probably thinking that this is super easy and not hard to figure out, but that is not quite the case.
When actually implementing this in your program, you'll need to make use of cif and cfi. Let's take a look at the following example:
    _useSqrt:
        ; Square rooting number
        mov R12, R9    ; R12 is a temp register in this case
        cif R12        ; We then convert that int into a float
        mov R1, 0.5    ; R1 is also temp register and holds the power
        pow R12, R1    ; Now calling pow with TWO floats
        cfi R12        ; converting back to int
So while this is similar to the generated function, we needed to add cif before pow and cfi after it, since pow works on floats and our number is an int.
Combining the Break method with the Odds optimization further enhances the efficiency of the prime number generation algorithm. The Break method ensures that after finding a factor, the algorithm breaks out of the loop, thus eliminating unnecessary iterations and conserving computational resources.
By considering only odd numbers and incorporating the Break method, the Break+Odds optimization significantly reduces the computational load. It starts with 3 as the first odd prime number and subsequently generates and tests odd numbers for primality. With the Break method in place, the algorithm terminates the inner loop as soon as a factor is found, leading to a faster and more efficient process of identifying prime numbers.
The implementation follows the structure of the brute force algorithm but with the additional optimization of skipping even numbers and breaking out of the loop upon finding a factor. This combined approach drastically improves the performance of the prime number generation process.
Integrating the Break method with the Square Root optimization further refines the prime number generation algorithm. By breaking out of the loop after finding a factor and limiting the search space to the square root of the current number, unnecessary iterations are avoided, resulting in a more streamlined and efficient algorithm.
The Break+Square Root optimization leverages the approximate square root to determine the upper limit of the inner loop while also incorporating the Break method to terminate the loop early upon finding a factor. This combined approach reduces both the computational complexity and runtime of the algorithm.
The implementation follows a similar structure to the brute force algorithm, with the addition of both the Square Root and Break optimizations. By iterating up to the square root of the current number and breaking out of the loop upon finding a factor, the algorithm achieves significant performance improvements compared to the naive approach.
Break+Odds+Sqrt is the culmination of everything you have done up to this point, so let's list out the requirements beyond the normal brute force.
This is what your C implementation should look like. This piece of code uses the approximate square root, which you can see in the j*j<=i loop condition.
    bool isprime=true;
    int primecounter=0;

    for(int i=3; i<n; i+=2){
        isprime=true;
        for(int j=3; j*j<=i; j+=2){
            if(i%j==0){
                isprime=false;
                break;
            }
        }
        if(isprime){
            primecounter++;
        }
    }
    primecounter++; // account for 2, which the odd-only loop never tests
The Sieve of Eratosthenes is one of the best algorithms for finding prime numbers. You may have noticed that up to this point the code we have written has a worst-case complexity of roughly O(n^2). The sieve takes the next step and brings that down to O(n log(log(n))).
Here is how the Sieve of Eratosthenes works:
First, you start with 2, and count up to your upper bound. For this example, let's say it is 40:
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
Then, you go through the list and remove multiples of 2. After that, you go to the next remaining number, which you now know is prime. Then, you remove multiples of that number, and so on.
To continue from above, 2 is a prime number, so you leave it alone, and remove any multiples of 2:
2 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39
Then, you go to the next number: 3. Now you know 3 is a prime number, so you can remove multiples of 3:
2 3 5 7 11 13 17 19 23 25 29 31 35 37
You go through the entire list, and when you get to the end, you are only left with prime numbers:
2 3 5 7 11 13 17 19 23 29 31 37
    START TIMEKEEPING
    NUMBER: FROM 2 THROUGH UPPERBOUND:
        SHOULD THE NUMBER SLOT BE TRUE:
            VALUE AT NUMBER IS PRIME, INCREMENT TALLY
            MULTIPLE: FROM NUMBER+NUMBER THROUGH UPPERBOUND:
                VALUE AT MULTIPLE IS NOT PRIME
                MULTIPLE IS MULTIPLE PLUS NUMBER
            PROCEED TO NEXT MULTIPLE
        INCREMENT NUMBER
    PROCEED TO NEXT NUMBER
    STOP TIMEKEEPING

    START TIMEKEEPING
    NUMBER: FROM 2 THROUGH NUMBER*NUMBER<UPPERBOUND:
        SHOULD THE NUMBER SLOT BE TRUE:
            VALUE AT NUMBER IS PRIME, INCREMENT TALLY
            MULTIPLE: FROM NUMBER*NUMBER THROUGH UPPERBOUND:
                VALUE AT MULTIPLE IS NOT PRIME
                MULTIPLE IS MULTIPLE PLUS NUMBER
            PROCEED TO NEXT MULTIPLE
        INCREMENT NUMBER
    PROCEED TO NEXT NUMBER
    STOP TIMEKEEPING
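For reference, the basic sieve (the variant that starts crossing off at NUMBER+NUMBER) might look something like this in standard C. This is only a sketch with the timekeeping omitted. The names count_primes_sieve and MAX_BOUND are hypothetical, and the 8192 limit is an assumption based on the largest upper bound used in the timings on this page.

```c
#include <stdbool.h>

#define MAX_BOUND 8192   /* assumption: largest upper bound we expect */

/* Hypothetical helper (name is an assumption): counts primes in
   [2, upper_bound] using the Sieve of Eratosthenes. */
int count_primes_sieve(int upper_bound)
{
    static bool is_composite[MAX_BOUND + 1];

    if (upper_bound < 2 || upper_bound > MAX_BOUND) {
        return 0;                   /* out of the range this sketch handles */
    }

    /* start with every slot marked as "possibly prime" */
    for (int i = 0; i <= upper_bound; i++) {
        is_composite[i] = false;
    }

    int tally = 0;

    for (int number = 2; number <= upper_bound; number++) {
        if (!is_composite[number]) {
            tally++;                /* never crossed off, so it is prime */

            /* cross off every multiple of this prime */
            for (int multiple = number + number;
                 multiple <= upper_bound;
                 multiple += number) {
                is_composite[multiple] = true;
            }
        }
    }

    return tally;
}
```

With an upper bound of 40 this should count 12 primes, matching the final list in the walkthrough above.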
A potential example using the dokuwiki dataplot plugin for graphing data:
One easy way to get consistent timing is to have two functions to call, one to record the start time and one to record the end time. startCycle/endCycle and startFrames/endFrames each represent a different memory address.
    _time_start:
        PUSH BP
        mov BP, SP
        Push R0
        in R0, TIM_CycleCounter
        mov startCycle, R0
        in R0, TIM_FrameCounter
        mov startFrames, R0
        POP R0
        mov SP, BP
        POP BP
        ret

    _time_end:
        PUSH BP
        mov BP, SP
        Push R0
        in R0, TIM_CycleCounter
        mov endCycle, R0
        in R0, TIM_FrameCounter
        mov endFrames, R0
        POP R0
        mov SP, BP
        POP BP
        ret
While the Sieve of Eratosthenes might be easier to implement compared to pnc1, it does require knowing how to create an array in assembly, which can be challenging if it has been a while since you messed around with them. First we will take a look at the array that we want to create in C, then convert it into asm.
Here is the set up of the array in C:
    int[8193] primeAr; // Creating array

    // Filling primeAr with numbers from 2 to upperBound
    for (int pop=2; pop <= upperBound; pop++)
    {
        primeAr[pop]=pop;
    }
Let's look at what is happening here. We have created an array called primeAr, and we have a loop that starts at 2 and goes until upperBound, which could be 1024, 2048, and so on. Inside the loop, we take the array slot at index pop and assign it the value pop. This populates our array with 2,3,4,5,…upperBound, which will be important for the logic later down the line.
Now let's look at that in asm:
        mov R8, primeAr     ; Starting point goes into R8
        iadd R8, 2          ; Incrementing R8 by 2
        mov R9, 2           ; starting point/number (pop in C)
    _bLoop:                 ; Loop condition
        mov R13, R9
        ile R13, R2
        jf R13, _bLoopEnd
        mov [R8], R9        ; R9 moved into R8 array at current index
        iadd R8, 1          ; Increment memory address
        iadd R9, 1          ; Increment loop
        jmp _bLoop          ; Back to start of loop
    _bLoopEnd:
As you can see, the logic is very similar. Knowing how to write it might be the hard part, but after a couple of look-overs and some experimenting it will click.
Raw times for my code:
Upper bound | C (s) | Asm (s) |
---|---|---|
1024 | 0.48333 | 0.333 |
2048 | 1.95 | 1.399 |
4096 | 7.83333 | 5.599 |
8192 | 31.3166 | 22.366 |
To be successful in this project, the following criteria (or their equivalent) must be met:
Once you have completed work on the project and are ready to submit, you would do the following:
lab46:~/src/SEMESTER/DESIG/PROJECT$ submit DESIG PROJECT file1 file2 file3 ... fileN
You should get some sort of confirmation indicating successful submission if all went according to plan. If not, check for typos and/or locational mismatches.
I'll be evaluating the project based on the following criteria:
    260:pnc0:final tally of results (260/260)
    *:pnc0:submitted C and assembly implementations [26/26]
    *:pnc0:each implementation builds cleanly [26/26]
    *:pnc0:output conforms to specifications [52/52]
    *:pnc0:processing is correct, and to specifications [52/52]
    *:pnc0:no optimizations or improvements on the process [26/26]
    *:pnc0:graph produced from timing data [26/26]
    *:pnc0:graph posted to discord and documentation page [26/26]
    *:pnc0:timing data is taken out to 3 decimal places [26/26]