Differences

This shows you the differences between two versions of the page.

--- playground:playground [2021/11/02 21:20] – pleblanc
+++ playground:playground [2023/09/12 19:29] – [Objective] rford5
@@ Line 1: / Line 1: @@
-====== PlayGround ======
+discrete fall2022 eoce 0x3
+========================
+=====Objective=====
+To explore  the cnv0/cnv1  programs from  a different  perspective: using
+only ONE loop to drive central processing.
+<WRAP center round box 60%>
+messing with this
+</WRAP>
+<blockquote>messing some more with this</blockquote>
+=====Background=====
+In  mathematics, a  **prime**  number  is a  value  that  is only  evenly
+divisible  by  1 and  itself;  it  has just  that  one  pair of  factors,
+no  others.  Numbers that  have  divisibility/factors  are classified  as
+**composite** numbers.
+The number **6** is a **composite** number, as in addition to 1 and 6, it
+also has the factors of 2 and 3.
+The number  **17**, however, is a  **prime** number, as no  numbers other
+than 1 and 17 can be evenly divided into it.
+=====Calculating the primality of a number=====
+As of yet, there is no quick  and direct way of determining the primality
+of  a  given number.  Instead,  we  must perform  a  series  of tests  to
+determine if it fails primality (typically by proving it is composite).
+This process incurs  a considerable amount of processing  overhead on the
+task,  so much  so  that increasingly  large  values take  ever-expanding
+amounts of  time. Often, approaches  to prime number  calculation involve
+various algorithms, which offer various benefits (less time) and drawback
+(more complex code).
+Your task for  this project is to implement a  prime number program using
+the straightforward, unoptimized  brute-force algorithm, which determines
+the primality of a number in a "trial by division" approach.
+=====Main algorithm: brute force=====
+The brute force  approach is the simplest to implement  (although at some
+cost in the form of time).
+As  we  will  be  looking   to  do  some  time/performance  analysis  and
+comparisons, it  is often good to  have a baseline. This  program will be
+it.
+To perform the process of computing  the primality of a number, we simply
+attempt to evenly divide  all the values between 2 and  one less than the
+number in  question. If  any one  of them divides  evenly, the  number is
+**NOT** prime, but instead composite.
+Checking  the **remainder**  of a  division  indicates whether  or not  a
+division was clean (having 0 remainder indicates such a state).
+For example, the number 11:
-D<sub>o</sub>//
- <sub>o</sub>
 <code>
-class Foo
+% 2 = 1 (2 is not a factor of 11)
+% 3 = 2 (3 is not a factor of 11)
+% 4 = 3 (4 is not a factor of 11)
+% 5 = 1 (5 is not a factor of 11)
+% 6 = 5 (6 is not a factor of 11)
+% 7 = 4 (7 is not a factor of 11)
+% 8 = 3 (8 is not a factor of 11)
+% 9 = 2 (9 is not a factor of 11)
+% 10 = 1 (10 is not a factor of 11)
 </code>
+Because none  of the values  2-10 evenly divided into  11, we can  say it
+passed the test: **11 is a prime number**
+On the other hand, take 119:
+<code>
+% 2 = 1 (2 is not a factor of 119)
+% 3 = 2 (3 is not a factor of 119)
+% 4 = 3 (4 is not a factor of 119)
+% 5 = 4 (5 is not a factor of 119)
+% 6 = 5 (6 is not a factor of 119)
+% 7 = 0 (7 is a factor of 119)
+% 8 = 7
+% 9 = 2
+% 10 = 9
+% 11 = 9
+% 12 = 11
+% 13 = 2
+...
+</code>
+Because, during our range of testing every value from 2-118, we find that
+evenly divides into 119, it failed  the test: 119 is **not** prime, but
+is instead a composite number.
+Please NOTE: Even once a number  is identified as composite, your program
+MUST **CONTINUE** evaluating the remainder of the values (up to 119-1, or
+). It might seem pointless (and it is for a production program), but I
+want you to see the performance implications this creates.
+====algorithm====
+Some things to keep in mind on your implementation:
+  * you will want to have exactly ONE  loop in the central processing for
+    this program.
+    * you will need to flatten  the conditions that previously took place
+      in the nested loop
+    * be mindful of what the underlying conditions are
+  * you  know the starting  value and  the terminating condition,  so you
+    have a clear starting and ending point to work with.
+  * I do **NOT** want to see ambiguous, one-letter variables used in your
+    implementation(s). Please use //meaningful// variable names.
+    * Some good examples of variable names would be:
+      * **number**: the number being tested
+      * **factor**: the  value being  divided  into number  to  test  for
+        primality
+      * **step**: the rate by which some variable is changing
+      * **qty**: the count of the current tally of primes
+      * **max**: the maximum count we seek
+      * **start**: a value we are starting at
+      * **lower**: a lower bound
+      * **upper**: an upper bound
+      * see how  much more readable and meaningful  these are, especially
+        as compared  to **a**, **i**, **n**, **x**?  You may even find it
+        helps with debugging and understanding your code better.
+  * let  the  loop drive  the overall  process. Identify  prime/composite
+    status separate from loop terminating conditions.
+    * and remember,  the baseline brute force algorithm may well identify
+      a value as composite, but won't terminate the loop.
+  * your  timing  should  start  before  the  loop (just  AFTER  argument
+    processing),  and  terminate  immediately  following  the terminating
+    output newline outside the loops.
+  * you  may **NOT** split  **qty** and **range** functionality  into two
+    separate  code  blocks (ie have two sets  of two loops). Only the one
+    set as indicated.
+=====prime algorithm optimizations=====
+To  give  us  a  decent  appreciation  of  the  subtleties  of  algorithm
+development  in a  theme of  programs,  I have  identified the  following
+optimizations that we will be implementing.
+The  optimizations we  will be  implementing in  this section include:
+  * **break  on   composite**  - once  a  tested  number  is proven to be
+    composite, there is no need to continue processing: break out  of the
+    factor check logic and proceed to the next number
+  * **mapping factors of 6** - it  turns out that, aside from the initial
+    primes of **2** and **3**, that **all** prime numbers fall to a +1 or
+    -1 off a factor of six  (there is an algorithm for this: **6a+/-1**).
+    This optimization  will utilize  this property, only  testing numbers
+    +/-1 off of factors of 6 (how might this impact overall processing?)
+  * **odds-only  checking**  -  aside  from **2**,  **all**  other  prime
+    numbers are odd. Therefore, there is zero need to perform a composite
+    check on  an even  number, allowing  us to  focus exclusively  on odd
+    values (luckily, they seem to occur in a predictable pattern).
+  * **sqrt() trick** - mathematically it has  been shown that if a number
+    has any  evenly divisible factors, at  least one half of  that factor
+    pair will occur by the square root point of the number being tested.
+  * **sqrt()-less square  root approximation**  - **sqrt()**,  a function
+    in  the  math  library,  does  an  industrial  strength  square  root
+    calculation.  We  don't  need  that, merely  a  whole  integer  value
+    corresponding to the approximate square  root. Here we will implement
+    our  own  logic   to  approximate  square  root,   hopefully  with  a
+    considerable performance impact.
+Unless  specified  in  the  encoded  name,  your  algorithm  should  only
+implement the algorithm and optimization(s) specified.
+Some  of these  optimizations can  co-exist easily  (break +  map, odd  +
+sqrt()),  others are  partially  compatible  (map +  odd  can coexist  in
+a  certain  form),  while  others  are  mutually  exclusive  (sqrt()  and
+approximated  square  root  conflict).  So there  are  definitely  a  few
+combinations that are possible using this scheme.
+====Program Specifications====
+Your program should otherwise work like the discrete/cnv1 program did:
+  * obtain   parameters  from   the   command-line  (see   **command-line
+    arguments** section below).
+    * check to make sure the  user indeed supplied enough parameters, and
+      exit with an error message if not.
+    * argv[1]: maximum  quantity  of  primes to  calculate (your  program
+      should run until it discovers **that** many primes).
+      * this value should be an integer value, greater than or equal to 0
+        * if argv[1] is  0, disable the quantity check, and  rely only on
+          provided lower and upper bounds (thru argv[4] would be required
+          in this case).
+    * argv[2]: N-ary specification; for this section, we are going to use
+      **1**
+    * argv[3]:  **conditionally optional** lower bound  (starting value).
+      Most of the time, this  will  probably be  **2**, but  should be  a
+      positive integer greater than or equal to 2. This defines where the
+      program will start its prime quantity check from.
+      * if omitted, assume a lower bound of **2**.
+      * if you desire  to specify an upper bound (argv[4]), you obviously
+        MUST provide the lower bound argument under this scheme.
+    * argv[4]: **conditionally optional**  upper bound (ending value). If
+      provided, this is the ending value you'd like to check to.
+      * If doing a quantity run (argv[1] is NOT 0), you don't need  this.
+      * If doing a quantity run AND you specify an upper bound, whichever
+        condition is  achieved first  dictates program  termination. That
+        is, upper bound could override quantity (if it is achieved before
+        quantity), and  quantity can override  the upper bound (if  it is
+        achieved before reaching the specified upper bound).
+    * argv[5]: specification of optimizations. Like in discrete/cnv1, the
+      values are bitwise-specified in a single numeric value,  broken out
+      as follows:
+  - no optimizations (naive, brute force approach)
+  - break on composite
+  - odds-only checking
+  - map
+  - sqrt
+- approximate square root
+    * for each argument:  you should do a basic check  to ensure the user
+      complied  with this  specification, and  exit with  a unique  error
+      message (displayed to STDERR) otherwise:
+      * for insufficient quantity  of arguments, display: **PROGRAM_NAME:
+        insufficient number of arguments!**
+      * for invalid argv[1], display: **PROGRAM_NAME: invalid quantity!**
+      * for invalid argv[2], display: **PROGRAM_NAME: invalid value!**
+      * invalid  argv[3], display: **PROGRAM_NAME: invalid lower bound!**
+        * if argv[3] is not needed, ignore (no error displayed not forced
+        exit, as it is acceptable defined behavior).
+      * invalid  argv[4], display: **PROGRAM_NAME: invalid upper bound!**
+        * if argv[4] is not needed, ignore (no error displayed nor forced
+          exit, as it is acceptable defined behavior).
+        * In these  error messages, **PROGRAM_NAME**  is the name  of the
+          program being run;  this can be accessed as a  string stored in
+          **argv[0]**.
+  * perform  ONLY  the  optimization(s) specified  on  the  command-line.
+    We  are producing  multiple  data points  for  a broader  performance
+    comparison.
+  * please take note on differences in run-time, contemplating the impact
+    the  algorithm  and  optimization(s)  have  on  performance  (timing,
+    specifically).
+  * immediately  after  argument  processing: start  your stopwatch  (see
+    **timing** section below).
+  * perform   the  correct  algorithm  and  optimization(s)  against  the
+    command-line input(s) given.
+    * your program is  to have no fewer and no more than  1 loop  in this
+      prime processing section.
+  * display identified primes (space-separated)  to a file pointer called
+    **stdout**
+  * stop your stopwatch immediately following your prime processing loops
+    (and terminating newline display to  **stdout**). Calculate  the time
+    that has transpired (ending time minus starting time).
+  * output the processing run-time to the file pointer called **stderr**
+  * your  output   **MUST**  conform  to   the  example  output   in  the
+    **execution** section below. This is also  a test to see how well you
+    can implement to specifications. Basically:
+    * as  primes are  being  displayed, they  are space-separated  (first
+      prime hugs the left margin), and  when all said and done, a newline
+      is issued.
+====Coding Restrictions====
+Since a lot of our explorations this semester were algorithmic in nature,
+we made a lot  of use of restricting various approaches  we could take in
+the implementation of our solutions.
+As   such,   the  following   restrictions   are   in  place   for   your
+implementations:
+  * no global variables
+  * no infinite loops
+  * no if() shunts (have good conditions!)
+    * what  is  the  difference  between   an  if()  shunt  and  a  valid
+      conditional  statement? if()  shunts are  a messy  way of  avoiding
+      iterations;  valid if()  statements  denote conditionally  optional
+      processing that occurs every iteration.
+  * one return statement (max) per function
+  * exit() calls are limited to command-line or resource allocation error
+    processing
+  * avoid redundant sections of code (merge logic or break out functions)
+Remember, the focus  should be on writing elegant code,  and NOT on brute
+forcing  some  solution.  Show  me that  you've  learned  something  this
+semester- write clean, well commented, consistently indented code.
+=====Makefile operations=====
+Makefiles provide a build automation system for our programs, instructing
+the computer  on how  to compile  files, so we  don't have  to constantly
+type  compiler  command-lines  ourselves.   I've  also  integration  some
+other  useful,  value-added features  that  will  help you  with  overall
+administration of the project.
+Basic  operation  of the  Makefile  is  invoked  by running  the  command
+"**make**" by itself. The default action  is to compile everything in the
+project directory.
+A description of some available commands include:
+  * **make**: compile everything
+    * any  **warnings** or **errors**  generated by the compiler  will go
+      into a file  in  the  base  directory of  pnc0  in  a  file  called
+      **errors**; you can **cat** it to view the information.
+  * **make debug**: compile everything with debug support
+    * any  **warnings** or **errors**  generated by the compiler  will be
+      displayed to the screen as the programs compile.
+  * **make clean**: remove all binaries
+  * **make save**: make a backup of your current work
+=====Command-Line Arguments=====
+To  automate our  comparisons,  we  will be  making  use of  command-line
+arguments in our programs.
+====header files====
+We don't need  any extra header files to use  command-line arguments, but
+we will need  an additional header file to use  the **atoi(3)** function,
+which  we'll use  to  quickly  turn the  command-line  parameter into  an
+integer, and that  header file is **stdlib.h**, so be  sure to include it
+with the others:
+<code c>
+#include <stdio.h>
+#include <stdlib.h>
+</code>
+====setting up main()====
+To accept (or rather, to gain  access) to arguments given to your program
+at runtime,  we need to  specify two  parameters to the  main() function.
+While  the names  don't matter,  the types  do.. I  like the  traditional
+**argc**  and **argv**  names, although  it is  also common  to see  them
+abbreviated as **ac** and **av**.
+Please declare your main() function as follows:
+<code c>
+int main (int argc, char **argv)
+</code>
+There  are two  very important  variables  involved here  (the types  are
+actually  what  are important,  the  names  given  to the  variables  are
+actually quite, variable;  you may see other references refer  to them as
+things like "ac" and "av"):
+  * int argc: the count (an integer)  of tokens given on the command line
+    (program name + arguments)
+  * <nowiki>char **argv</nowiki>:  an array  of strings  (technically  an
+    array  of an  array of  char) that contains "strings" of the  various
+    tokens provided on the command-line.
+The arguments are  accessible via the argv array, in  the order they were
+specified:
+  * argv[0]: program invocation (path + program name)
+  * argv[1]: our maximum / upper bound
+  * argv[2]: should be provided as a 1 for this project
+  * argv[3]: conditionally optional; represents lower bound
+  * argv[4]: conditionally optional; represents upper bound
+  * argv[5]: conditionally optional; represents optimizations, if any
+Additionally, let's not  forget the **argc** variable,  an integer, which
+contains a  count of arguments (argc  == argument count). If  we provided
+argv[0] through argv[4], argc would contain a 5.
+===example===
+For example, if we were to execute the as follows:
+<cli>
+lab46:~/src/discrete/eoce/0x3$ ./cnv3 128 1 2 2048 13
+</cli>
+We'd have:
+  * <nowiki>argv[0]</nowiki>: "./cnv3"
+  * <nowiki>argv[1]</nowiki>: "128" (note, NOT integer 128, but a string)
+  * <nowiki>argv[2]</nowiki>: "1"
+  * <nowiki>argv[3]</nowiki>: "2"
+  * <nowiki>argv[4]</nowiki>: "2048"
+  * <nowiki>argv[5]</nowiki>: "13"
+and let's not forget:
+  * argc: 6   (there are 6 things, argv indexes 0, 1, 2, 3, 4, and 5)
+With the  conditionally optional arguments  as part of the  program spec,
+for a valid execution of the program, argc could be a value anywhere from
+to 6.
+====Simple argument checks====
+While there are  a number of checks  we should perform, one  of the first
+should be  a check  to see if  the minimal number  of arguments  has been
+provided:
+<code c>
+    // if  less than 3 arguments (program_name + quantity + argv[2] == 3)
+    // have been provided
+    //
+    if (argc < 3)
+    {
+        fprintf(stderr, "%s: insufficient number of arguments!\n", argv[0]);
+        exit(1);
+    }
+</code>
+Since argv[3] (lower  bound) and argv[4] (upper  bound) are conditionally
+optional, it wouldn't make sense to  check for them in the overall count.
+But  we  can and  do  still  want  to  stategically utilize  **argc**  to
+determine if an argv[3] or argv[4] is present.
+====Grab and convert max====
+Finally, we  need to put  the argument representing the  maximum quantity
+into a variable.
+I'd recommend declaring a variable of type **int**.
+We will use the **atoi(3)**  function to quickly convert the command-line
+arguments into **int** values:
+<code c>
+    max  = atoi (argv[1]);
+</code>
+And now we can proceed with the rest of our prime implementation.
+=====Timing=====
+Often  times,  when  checking  the  efficiency  of  a  solution,  a  good
+measurement  (especially  for  comparison),  is  to  time  how  long  the
+processing takes.
+In order to do  that in our prime number programs, we are  going to use C
+library  functions  that  obtain  the  current time,  and  use  it  as  a
+stopwatch: we'll grab the time  just before starting processing, and then
+once more when  done. The total time will then  be the difference between
+the two (end_time - start_time).
+We are going  to use the **gettimeofday(2)** function to  aid us in this,
+and to use it, we'll need to do the following:
+====header file====
+In order  to use the  **gettimeofday(2)** function in our  program, we'll
+need to include the  **sys/time.h** header file, so be sure  to add it in
+with the existing ones:
+<code c>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/time.h>
+</code>
+====timeval variables====
+**gettimeofday(2)** uses a  **struct timeval** data type,  of which we'll
+need  to declare  two  variables in  our programs  (one  for storing  the
+starting time, and the other for the ending time).
+Please declare these  with your other variables, up at  the top of main()
+(but still WITHIN main()-- you do not need to declare global variables).
+<code c>
+    struct timeval time_start; // starting time
+    struct timeval time_end;   // ending time
+</code>
+====Obtaining the time====
+To use **gettimeofday(2)**,  we merely place it at the  point in our code
+we wish to take the time.
+For  our prime  number  programs,  you'll want  to  grab  the start  time
+**AFTER** you've  declared variables and processed  arguments, but **JUST
+BEFORE** starting the driving loop doing the processing.
+That call will look something like this:
+<code c>
+    gettimeofday(&time_start, 0);
+</code>
+The ending  time should  be taken immediately  after all  processing (and
+prime number output) is completed, and right before we display the timing
+information to STDERR:
+<code c>
+    gettimeofday(&time_end, 0);
+</code>
+====Displaying the runtime====
+Once we have  the starting and ending  times, we can display  this to the
+**stderr** file pointer. You'll want this line:
+<code c>
+fprintf(stderr, "%8.4lf\n",
+time_end.tv_sec-time_start.tv_sec+((time_end.tv_usec-time_start.tv_usec)/1000000.0));
+</code>
+For clarity  sake, that format specifier  is "%8.4lf", where the  "lf" is
+"long  float", that  is **NOT**  a number  'one' but  a lowercase  letter
+'ell'.
+And with  that, we can compute  an approximate run-time of  our programs.
+The timing won't necessarily be accurate down to that level of precision,
+but it will be informative enough for our purposes.
+=====Execution=====
+====specified quantity====
+Your program output should be as follows (given the specified quantity):
+<cli>
+lab46:~/src/discrete/eoce/0x3$ ./cnv3 24 1 # same as 24 1 0 0 0
+3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89
+.0001
+lab46:~/src/discrete/eoce/0x3$
+</cli>
+The execution of  the programs is short and simple-  grab the parameters,
+do the processing, produce the output, and then terminate.
+====invalid lower bound====
+Here's an  example that should generate  an error upon running  (based on
+project specifications):
+<cli>
+lab46:~/src/discrete/eoce/0x3$ ./cnv3 32 1 0
+./cnv3: invalid lower bound
+lab46:~/src/discrete/eoce/0x3$
+</cli>
+In this case, the program logic should have detected an invalid condition
+and bailed  out before prime computations  even began. No timing  data is
+displayed, because exiting should occur even prior to that.
+====upper bound overriding quantity====
+As indicated above, there is  potential interplay with an active quantity
+and upper  bound values. Here is  an example where upper  bound overrides
+quantity, resulting in an early termination (ie upper bound is hit before
+quantity):
+<cli>
+lab46:~/src/discrete/eoce/0x3$ ./cnv3 128 1 7 23 10
+11 13 17 19 23
+.0001
+lab46:~/src/discrete/eoce/0x3$
+</cli>
+Also for  fun, I  set the  lower bound  to 7,  so you'll  see computation
+starts at 7 (vs. the usual 2).
+====In general====
+Analyze  the  times  you  see...  do they  make  sense,  especially  when
+comparing the algorithm used and  the quantity being processed? These are
+related  to  some very  important  core  Computer Science  considerations
+we  need  to  be  increasingly  mindful of  as  we  design  our  programs
+and  implement  our  solutions. Algorithmic  complexity  and  algorithmic
+efficiency will be common themes in all we do.
+=====Submission=====
+To successfully  complete this  project, the  following criteria  must be
+met:
+  * Code must compile cleanly (no warnings or errors)
+  * Output must be correct, and match that given in sample output above.
+  * Code must be nicely and consistently indented
+  * Code must utilize the algorithm and optimizations presented above.
+  * Code must be commented
+  * Output Formatting (including spacing)  of program must conform to the
+    provided output (see above).
+  * Track/version the source code in your lab46 SEMESTER repository
+  * Submit a copy of your source code to me using the **submit** tool.

Lab46 Wiki

User Tools

Site Tools

Differences

Page Tools