User Tools

Site Tools


playground:playground

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
Last revisionBoth sides next revision
playground:playground [2021/11/02 21:20] pleblancplayground:playground [2023/09/12 19:29] – [Objective] rford5
Line 1: Line 1:
-====== PlayGround ======+discrete fall2022 eoce 0x3 
 +======================== 
 + 
 +=====Objective===== 
 +To explore  the cnv0/cnv1  programs from  a different  perspective: using 
 +only ONE loop to drive central processing. 
 + 
 + 
 + 
 +<WRAP center round box 60%> 
 +messing with this 
 +</WRAP> 
 + 
 +<blockquote>messing some more with this</blockquote> 
 + 
 +=====Background===== 
 +In  mathematics, a  **prime**  number  is a  value  that  is only  evenly 
 +divisible  by  1 and  itself;  it  has just  that  one  pair of  factors, 
 +no  others.  Numbers that  have  divisibility/factors  are classified  as 
 +**composite** numbers. 
 + 
 +The number **6** is a **composite** number, as in addition to 1 and 6, it 
 +also has the factors of 2 and 3. 
 + 
 +The number  **17**, however, is a  **prime** number, as no  numbers other 
 +than 1 and 17 can be evenly divided into it. 
 + 
 +=====Calculating the primality of a number===== 
 +As of yet, there is no quick  and direct way of determining the primality 
 +of  a  given number.  Instead,  we  must perform  a  series  of tests  to 
 +determine if it fails primality (typically by proving it is composite). 
 + 
 +This process incurs  a considerable amount of processing  overhead on the 
 +task,  so much  so  that increasingly  large  values take  ever-expanding 
 +amounts of  time. Often, approaches  to prime number  calculation involve 
 +various algorithms, which offer various benefits (less time) and drawback 
 +(more complex code). 
 + 
 +Your task for  this project is to implement a  prime number program using 
 +the straightforward, unoptimized  brute-force algorithm, which determines 
 +the primality of a number in a "trial by division" approach. 
 + 
 +=====Main algorithm: brute force===== 
 +The brute force  approach is the simplest to implement  (although at some 
 +cost in the form of time). 
 + 
 +As  we  will  be  looking   to  do  some  time/performance  analysis  and 
 +comparisons, it  is often good to  have a baseline. This  program will be 
 +it. 
 + 
 +To perform the process of computing  the primality of a number, we simply 
 +attempt to evenly divide  all the values between 2 and  one less than the 
 +number in  question. If  any one  of them divides  evenly, the  number is 
 +**NOT** prime, but instead composite. 
 + 
 +Checking  the **remainder**  of a  division  indicates whether  or not  a 
 +division was clean (having 0 remainder indicates such a state). 
 + 
 +For example, the number 11:
  
-D<sub>o</sub>//  
- <sub>o</sub> 
 <code> <code>
-class Foo+11 % 2 = 1 (2 is not a factor of 11) 
 +11 % 3 = 2 (3 is not a factor of 11) 
 +11 % 4 = 3 (4 is not a factor of 11) 
 +11 % 5 = 1 (5 is not a factor of 11) 
 +11 % 6 = 5 (6 is not a factor of 11) 
 +11 % 7 = 4 (7 is not a factor of 11) 
 +11 % 8 = 3 (8 is not a factor of 11) 
 +11 % 9 = 2 (9 is not a factor of 11) 
 +11 % 10 = 1 (10 is not a factor of 11)
 </code> </code>
 +
 +Because none  of the values  2-10 evenly divided into  11, we can  say it
 +passed the test: **11 is a prime number**
 +
 +On the other hand, take 119:
 +
 +<code>
 +119 % 2 = 1 (2 is not a factor of 119)
 +119 % 3 = 2 (3 is not a factor of 119)
 +119 % 4 = 3 (4 is not a factor of 119)
 +119 % 5 = 4 (5 is not a factor of 119)
 +119 % 6 = 5 (6 is not a factor of 119)
 +119 % 7 = 0 (7 is a factor of 119)
 +119 % 8 = 7
 +119 % 9 = 2
 +119 % 10 = 9
 +119 % 11 = 9
 +119 % 12 = 11
 +119 % 13 = 2
 +...
 +</code>
 +
 +Because, during our range of testing every value from 2-118, we find that
 +7 evenly divides into 119, it failed  the test: 119 is **not** prime, but
 +is instead a composite number.
 +
 +Please NOTE: Even once a number  is identified as composite, your program
 +MUST **CONTINUE** evaluating the remainder of the values (up to 119-1, or
 +118). It might seem pointless (and it is for a production program), but I
 +want you to see the performance implications this creates.
 +
 +====algorithm====
 +Some things to keep in mind on your implementation:
 +
 +  * you will want to have exactly ONE  loop in the central processing for
 +    this program.
 +
 +    * you will need to flatten  the conditions that previously took place
 +      in the nested loop
 +
 +    * be mindful of what the underlying conditions are
 +
 +  * you  know the starting  value and  the terminating condition,  so you
 +    have a clear starting and ending point to work with.
 +
 +  * I do **NOT** want to see ambiguous, one-letter variables used in your
 +    implementation(s). Please use //meaningful// variable names.
 +
 +    * Some good examples of variable names would be:
 +
 +      * **number**: the number being tested
 +
 +      * **factor**: the  value being  divided  into number  to  test  for
 +        primality
 +
 +      * **step**: the rate by which some variable is changing
 +
 +      * **qty**: the count of the current tally of primes
 +
 +      * **max**: the maximum count we seek
 +
 +      * **start**: a value we are starting at
 +
 +      * **lower**: a lower bound
 +
 +      * **upper**: an upper bound
 +
 +      * see how  much more readable and meaningful  these are, especially
 +        as compared  to **a**, **i**, **n**, **x**?  You may even find it
 +        helps with debugging and understanding your code better.
 +
 +  * let  the  loop drive  the overall  process. Identify  prime/composite
 +    status separate from loop terminating conditions.
 +
 +    * and remember,  the baseline brute force algorithm may well identify
 +      a value as composite, but won't terminate the loop.
 +
 +  * your  timing  should  start  before  the  loop (just  AFTER  argument
 +    processing),  and  terminate  immediately  following  the terminating
 +    output newline outside the loops.
 +
 +  * you  may **NOT** split  **qty** and **range** functionality  into two
 +    separate  code  blocks (ie have two sets  of two loops). Only the one
 +    set as indicated.
 +
 +=====prime algorithm optimizations=====
 +To  give  us  a  decent  appreciation  of  the  subtleties  of  algorithm
 +development  in a  theme of  programs,  I have  identified the  following
 +optimizations that we will be implementing.
 +
 +The  optimizations we  will be  implementing in  this section include:
 +
 +  * **break  on   composite**  - once  a  tested  number  is proven to be
 +    composite, there is no need to continue processing: break out  of the
 +    factor check logic and proceed to the next number
 +
 +  * **mapping factors of 6** - it  turns out that, aside from the initial
 +    primes of **2** and **3**, that **all** prime numbers fall to a +1 or
 +    -1 off a factor of six  (there is an algorithm for this: **6a+/-1**).
 +    This optimization  will utilize  this property, only  testing numbers
 +    +/-1 off of factors of 6 (how might this impact overall processing?)
 +
 +  * **odds-only  checking**  -  aside  from **2**,  **all**  other  prime
 +    numbers are odd. Therefore, there is zero need to perform a composite
 +    check on  an even  number, allowing  us to  focus exclusively  on odd
 +    values (luckily, they seem to occur in a predictable pattern).
 +
 +  * **sqrt() trick** - mathematically it has  been shown that if a number
 +    has any  evenly divisible factors, at  least one half of  that factor
 +    pair will occur by the square root point of the number being tested.
 +
 +  * **sqrt()-less square  root approximation**  - **sqrt()**,  a function
 +    in  the  math  library,  does  an  industrial  strength  square  root
 +    calculation.  We  don' need  that, merely  a  whole  integer  value
 +    corresponding to the approximate square  root. Here we will implement
 +    our  own  logic   to  approximate  square  root,   hopefully  with  a
 +    considerable performance impact.
 +
 +Unless  specified  in  the  encoded  name,  your  algorithm  should  only
 +implement the algorithm and optimization(s) specified.
 +
 +Some  of these  optimizations can  co-exist easily  (break +  map, odd  +
 +sqrt()),  others are  partially  compatible  (map +  odd  can coexist  in
 +a  certain  form),  while  others  are  mutually  exclusive  (sqrt()  and
 +approximated  square  root  conflict).  So there  are  definitely  a  few
 +combinations that are possible using this scheme.
 +
 +====Program Specifications====
 +Your program should otherwise work like the discrete/cnv1 program did:
 +
 +  * obtain   parameters  from   the   command-line  (see   **command-line
 +    arguments** section below).
 +
 +    * check to make sure the  user indeed supplied enough parameters, and
 +      exit with an error message if not.
 +
 +    * argv[1]: maximum  quantity  of  primes to  calculate (your  program
 +      should run until it discovers **that** many primes).
 +
 +      * this value should be an integer value, greater than or equal to 0
 +
 +        * if argv[1] is  0, disable the quantity check, and  rely only on
 +          provided lower and upper bounds (thru argv[4] would be required
 +          in this case).
 +
 +    * argv[2]: N-ary specification; for this section, we are going to use
 +      **1**
 +
 +    * argv[3]:  **conditionally optional** lower bound  (starting value).
 +      Most of the time, this  will  probably be  **2**, but  should be  a
 +      positive integer greater than or equal to 2. This defines where the
 +      program will start its prime quantity check from.
 +
 +      * if omitted, assume a lower bound of **2**.
 +
 +      * if you desire  to specify an upper bound (argv[4]), you obviously
 +        MUST provide the lower bound argument under this scheme.
 +
 +    * argv[4]: **conditionally optional**  upper bound (ending value). If
 +      provided, this is the ending value you'd like to check to.
 +
 +      * If doing a quantity run (argv[1] is NOT 0), you don't need  this.
 +
 +      * If doing a quantity run AND you specify an upper bound, whichever
 +        condition is  achieved first  dictates program  termination. That
 +        is, upper bound could override quantity (if it is achieved before
 +        quantity), and  quantity can override  the upper bound (if  it is
 +        achieved before reaching the specified upper bound).
 +
 +    * argv[5]: specification of optimizations. Like in discrete/cnv1, the
 +      values are bitwise-specified in a single numeric value,  broken out
 +      as follows:
 +
 +        0  - no optimizations (naive, brute force approach)
 +        1  - break on composite
 +        2  - odds-only checking
 +        4  - map
 +        8  - sqrt
 +        16 - approximate square root
 +
 +    * for each argument:  you should do a basic check  to ensure the user
 +      complied  with this  specification, and  exit with  a unique  error
 +      message (displayed to STDERR) otherwise:
 +
 +      * for insufficient quantity  of arguments, display: **PROGRAM_NAME:
 +        insufficient number of arguments!**
 +
 +      * for invalid argv[1], display: **PROGRAM_NAME: invalid quantity!**
 +
 +      * for invalid argv[2], display: **PROGRAM_NAME: invalid value!**
 +
 +      * invalid  argv[3], display: **PROGRAM_NAME: invalid lower bound!**
 +
 +        * if argv[3] is not needed, ignore (no error displayed not forced
 +        exit, as it is acceptable defined behavior).
 +
 +      * invalid  argv[4], display: **PROGRAM_NAME: invalid upper bound!**
 +
 +        * if argv[4] is not needed, ignore (no error displayed nor forced
 +          exit, as it is acceptable defined behavior).
 +
 +        * In these  error messages, **PROGRAM_NAME**  is the name  of the
 +          program being run;  this can be accessed as a  string stored in
 +          **argv[0]**.
 +
 +  * perform  ONLY  the  optimization(s) specified  on  the  command-line.
 +    We  are producing  multiple  data points  for  a broader  performance
 +    comparison.
 +
 +  * please take note on differences in run-time, contemplating the impact
 +    the  algorithm  and  optimization(s)  have  on  performance  (timing,
 +    specifically).
 +
 +  * immediately  after  argument  processing: start  your stopwatch  (see
 +    **timing** section below).
 +
 +  * perform   the  correct  algorithm  and  optimization(s)  against  the
 +    command-line input(s) given.
 +
 +    * your program is  to have no fewer and no more than  1 loop  in this
 +      prime processing section.
 +
 +  * display identified primes (space-separated)  to a file pointer called
 +    **stdout**
 +
 +  * stop your stopwatch immediately following your prime processing loops
 +    (and terminating newline display to  **stdout**). Calculate  the time
 +    that has transpired (ending time minus starting time).
 +
 +  * output the processing run-time to the file pointer called **stderr**
 +
 +  * your  output   **MUST**  conform  to   the  example  output   in  the
 +    **execution** section below. This is also  a test to see how well you
 +    can implement to specifications. Basically:
 +
 +    * as  primes are  being  displayed, they  are space-separated  (first
 +      prime hugs the left margin), and  when all said and done, a newline
 +      is issued.
 +
 +====Coding Restrictions====
 +Since a lot of our explorations this semester were algorithmic in nature,
 +we made a lot  of use of restricting various approaches  we could take in
 +the implementation of our solutions.
 +
 +As   such,   the  following   restrictions   are   in  place   for   your
 +implementations:
 +
 +  * no global variables
 +  * no infinite loops
 +  * no if() shunts (have good conditions!)
 +    * what  is  the  difference  between   an  if()  shunt  and  a  valid
 +      conditional  statement? if()  shunts are  a messy  way of  avoiding
 +      iterations;  valid if()  statements  denote conditionally  optional
 +      processing that occurs every iteration.
 +  * one return statement (max) per function
 +  * exit() calls are limited to command-line or resource allocation error
 +    processing
 +  * avoid redundant sections of code (merge logic or break out functions)
 +
 +Remember, the focus  should be on writing elegant code,  and NOT on brute
 +forcing  some  solution.  Show  me that  you've  learned  something  this
 +semester- write clean, well commented, consistently indented code.
 +
 +=====Makefile operations=====
 +Makefiles provide a build automation system for our programs, instructing
 +the computer  on how  to compile  files, so we  don't have  to constantly
 +type  compiler  command-lines  ourselves.   I've  also  integration  some
 +other  useful,  value-added features  that  will  help you  with  overall
 +administration of the project.
 +
 +Basic  operation  of the  Makefile  is  invoked  by running  the  command
 +"**make**" by itself. The default action  is to compile everything in the
 +project directory.
 +
 +A description of some available commands include:
 +
 +  * **make**: compile everything
 +
 +    * any  **warnings** or **errors**  generated by the compiler  will go
 +      into a file  in  the  base  directory of  pnc0  in  a  file  called
 +      **errors**; you can **cat** it to view the information.
 +
 +  * **make debug**: compile everything with debug support
 +
 +    * any  **warnings** or **errors**  generated by the compiler  will be
 +      displayed to the screen as the programs compile.
 +
 +  * **make clean**: remove all binaries
 +
 +  * **make save**: make a backup of your current work
 +
 +=====Command-Line Arguments=====
 +To  automate our  comparisons,  we  will be  making  use of  command-line
 +arguments in our programs.
 +
 +====header files====
 +We don't need  any extra header files to use  command-line arguments, but
 +we will need  an additional header file to use  the **atoi(3)** function,
 +which  we'll use  to  quickly  turn the  command-line  parameter into  an
 +integer, and that  header file is **stdlib.h**, so be  sure to include it
 +with the others:
 +
 +<code c>
 +#include <stdio.h>
 +#include <stdlib.h>
 +</code>
 +
 +====setting up main()====
 +To accept (or rather, to gain  access) to arguments given to your program
 +at runtime,  we need to  specify two  parameters to the  main() function.
 +While  the names  don't matter,  the types  do.. I  like the  traditional
 +**argc**  and **argv**  names, although  it is  also common  to see  them
 +abbreviated as **ac** and **av**.
 +
 +Please declare your main() function as follows:
 +
 +<code c>
 +int main (int argc, char **argv)
 +</code>
 +
 +There  are two  very important  variables  involved here  (the types  are
 +actually  what  are important,  the  names  given  to the  variables  are
 +actually quite, variable;  you may see other references refer  to them as
 +things like "ac" and "av"):
 +
 +  * int argc: the count (an integer)  of tokens given on the command line
 +    (program name + arguments)
 +
 +  * <nowiki>char **argv</nowiki>:  an array  of strings  (technically  an
 +    array  of an  array of  char) that contains "strings" of the  various
 +    tokens provided on the command-line.
 +
 +The arguments are  accessible via the argv array, in  the order they were
 +specified:
 +
 +  * argv[0]: program invocation (path + program name)
 +  * argv[1]: our maximum / upper bound
 +  * argv[2]: should be provided as a 1 for this project
 +  * argv[3]: conditionally optional; represents lower bound
 +  * argv[4]: conditionally optional; represents upper bound
 +  * argv[5]: conditionally optional; represents optimizations, if any
 +
 +Additionally, let's not  forget the **argc** variable,  an integer, which
 +contains a  count of arguments (argc  == argument count). If  we provided
 +argv[0] through argv[4], argc would contain a 5.
 +
 +===example===
 +For example, if we were to execute the as follows:
 +
 +<cli>
 +lab46:~/src/discrete/eoce/0x3$ ./cnv3 128 1 2 2048 13
 +</cli>
 +
 +We'd have:
 +
 +  * <nowiki>argv[0]</nowiki>: "./cnv3" 
 +  * <nowiki>argv[1]</nowiki>: "128" (note, NOT integer 128, but a string) 
 +  * <nowiki>argv[2]</nowiki>: "1"
 +  * <nowiki>argv[3]</nowiki>: "2" 
 +  * <nowiki>argv[4]</nowiki>: "2048" 
 +  * <nowiki>argv[5]</nowiki>: "13" 
 +
 +and let's not forget:
 +
 +  * argc: 6   (there are 6 things, argv indexes 0, 1, 2, 3, 4, and 5)
 +
 +With the  conditionally optional arguments  as part of the  program spec,
 +for a valid execution of the program, argc could be a value anywhere from
 +3 to 6.
 +
 +====Simple argument checks====
 +While there are  a number of checks  we should perform, one  of the first
 +should be  a check  to see if  the minimal number  of arguments  has been
 +provided:
 +
 +<code c>
 +
 +    // if  less than 3 arguments (program_name + quantity + argv[2] == 3)
 +    // have been provided
 +    //
 +    if (argc < 3)
 +    {
 +        fprintf(stderr, "%s: insufficient number of arguments!\n", argv[0]);
 +        exit(1);
 +    }
 +</code>
 +
 +Since argv[3] (lower  bound) and argv[4] (upper  bound) are conditionally
 +optional, it wouldn't make sense to  check for them in the overall count.
 +But  we  can and  do  still  want  to  stategically utilize  **argc**  to
 +determine if an argv[3] or argv[4] is present.
 +
 +====Grab and convert max====
 +Finally, we  need to put  the argument representing the  maximum quantity
 +into a variable.
 +
 +I'd recommend declaring a variable of type **int**.
 +
 +We will use the **atoi(3)**  function to quickly convert the command-line
 +arguments into **int** values:
 +
 +<code c>
 +    max  = atoi (argv[1]);
 +</code>
 +
 +And now we can proceed with the rest of our prime implementation.
 +
 +=====Timing=====
 +Often  times,  when  checking  the  efficiency  of  a  solution,  a  good
 +measurement  (especially  for  comparison),  is  to  time  how  long  the
 +processing takes.
 +
 +In order to do  that in our prime number programs, we are  going to use C
 +library  functions  that  obtain  the  current time,  and  use  it  as  a
 +stopwatch: we'll grab the time  just before starting processing, and then
 +once more when  done. The total time will then  be the difference between
 +the two (end_time - start_time).
 +
 +We are going  to use the **gettimeofday(2)** function to  aid us in this,
 +and to use it, we'll need to do the following:
 +
 +====header file====
 +In order  to use the  **gettimeofday(2)** function in our  program, we'll
 +need to include the  **sys/time.h** header file, so be sure  to add it in
 +with the existing ones:
 +
 +<code c>
 +#include <stdio.h>
 +#include <stdlib.h>
 +#include <sys/time.h>
 +</code>
 +
 +====timeval variables====
 +**gettimeofday(2)** uses a  **struct timeval** data type,  of which we'll
 +need  to declare  two  variables in  our programs  (one  for storing  the
 +starting time, and the other for the ending time).
 +
 +Please declare these  with your other variables, up at  the top of main()
 +(but still WITHIN main()-- you do not need to declare global variables).
 +
 +<code c>
 +    struct timeval time_start; // starting time
 +    struct timeval time_end;   // ending time
 +</code>
 +
 +====Obtaining the time====
 +To use **gettimeofday(2)**,  we merely place it at the  point in our code
 +we wish to take the time.
 +
 +For  our prime  number  programs,  you'll want  to  grab  the start  time
 +**AFTER** you've  declared variables and processed  arguments, but **JUST
 +BEFORE** starting the driving loop doing the processing.
 +
 +That call will look something like this:
 +
 +<code c>
 +    gettimeofday(&time_start, 0);
 +</code>
 +
 +The ending  time should  be taken immediately  after all  processing (and
 +prime number output) is completed, and right before we display the timing
 +information to STDERR:
 +
 +<code c>
 +    gettimeofday(&time_end, 0);
 +</code>
 +
 +====Displaying the runtime====
 +Once we have  the starting and ending  times, we can display  this to the
 +**stderr** file pointer. You'll want this line:
 +
 +<code c>
 +fprintf(stderr, "%8.4lf\n",
 +time_end.tv_sec-time_start.tv_sec+((time_end.tv_usec-time_start.tv_usec)/1000000.0));
 +</code>
 +
 +For clarity  sake, that format specifier  is "%8.4lf", where the  "lf" is
 +"long  float", that  is **NOT**  a number  'one' but  a lowercase  letter
 +'ell'.
 +
 +And with  that, we can compute  an approximate run-time of  our programs.
 +The timing won't necessarily be accurate down to that level of precision,
 +but it will be informative enough for our purposes.
 +
 +=====Execution=====
 +
 +====specified quantity====
 +Your program output should be as follows (given the specified quantity):
 +
 +<cli>
 +lab46:~/src/discrete/eoce/0x3$ ./cnv3 24 1 # same as 24 1 0 0 0
 +2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 
 +  0.0001
 +lab46:~/src/discrete/eoce/0x3$ 
 +</cli>
 +
 +The execution of  the programs is short and simple-  grab the parameters,
 +do the processing, produce the output, and then terminate.
 +
 +====invalid lower bound====
 +Here's an  example that should generate  an error upon running  (based on
 +project specifications):
 +
 +<cli>
 +lab46:~/src/discrete/eoce/0x3$ ./cnv3 32 1 0
 +./cnv3: invalid lower bound
 +lab46:~/src/discrete/eoce/0x3$ 
 +</cli>
 +
 +In this case, the program logic should have detected an invalid condition
 +and bailed  out before prime computations  even began. No timing  data is
 +displayed, because exiting should occur even prior to that.
 +
 +====upper bound overriding quantity====
 +As indicated above, there is  potential interplay with an active quantity
 +and upper  bound values. Here is  an example where upper  bound overrides
 +quantity, resulting in an early termination (ie upper bound is hit before
 +quantity):
 +
 +<cli>
 +lab46:~/src/discrete/eoce/0x3$ ./cnv3 128 1 7 23 10
 +7 11 13 17 19 23
 +  0.0001
 +lab46:~/src/discrete/eoce/0x3$ 
 +</cli>
 +
 +Also for  fun, I  set the  lower bound  to 7,  so you'll  see computation
 +starts at 7 (vs. the usual 2).
 +
 +====In general====
 +Analyze  the  times  you  see...  do they  make  sense,  especially  when
 +comparing the algorithm used and  the quantity being processed? These are
 +related  to  some very  important  core  Computer Science  considerations
 +we  need  to  be  increasingly  mindful of  as  we  design  our  programs
 +and  implement  our  solutions. Algorithmic  complexity  and  algorithmic
 +efficiency will be common themes in all we do.
 +
 +=====Submission=====
 +To successfully  complete this  project, the  following criteria  must be
 +met:
 +
 +  * Code must compile cleanly (no warnings or errors)
 +  * Output must be correct, and match that given in sample output above.
 +  * Code must be nicely and consistently indented
 +  * Code must utilize the algorithm and optimizations presented above.
 +  * Code must be commented
 +  * Output Formatting (including spacing)  of program must conform to the
 +    provided output (see above).
 +
 +  * Track/version the source code in your lab46 SEMESTER repository
 +  * Submit a copy of your source code to me using the **submit** tool.
 +
 +
playground/playground.txt · Last modified: 2023/09/12 19:29 by rford5