Corning Community College
CSCS1730 UNIX/Linux Fundamentals
======Project: SCRIPTING PI FUN (spf0)======
=====Errata=====
* any bugfixes or project updates will be posted here
=====Toolbox=====
In addition to the tools you're already familiar with, it is recommended you check out the following tools for possible application in this project (you may not need to use them in your solution, but they definitely offer up functionality that some solutions can make use of):
* **basename(1)**
* **bc(1)**
* **cut(1)**
* **diff(1)**
* **grep(1)**
* **mktemp(1)**
* **sed(1)**
* **tr(1)**
* **wc(1)**
=====Objective=====
To create two scripts: one that calculates digits of PI, and one that searches those digits for substring patterns.
=====Background=====
[[https://en.wikipedia.org/wiki/Pi|PI]] is an important mathematical constant. It is used in various applications, and we often find ourselves making use of it.
But what about //generating// it? That's part of what this project seeks to have us explore.
Once we can generate PI out to a variable number of digits, we will also write a script that searches those digits for numeric substring patterns.
====Calculating PI====
There are various methods for calculating PI, which take varying amounts of processing power and time to perform, and have different levels of accuracy.
We will want to come up with a process that is accurate, yet perhaps also relatively simple to express.
You may want to check out the method using **arctan** to accomplish this:
lab46:~/src/unix/spf0$ echo "4*a(1)" | bc -lq
3.14159265358979323844
Here we see that, with the math library loaded (the **-l** option), **bc(1)** defaults to a //scale// of 20 digits after the decimal point.
With that, if we check a database of PI digits (which I have conveniently stored on lab46, in the **/usr/local/etc/pi.1000000** file), we can compare our results against the verified, correct results (note that the database omits the period '.' separating the 3 from the 14..., be sure to compensate for this):
lab46:~/src/unix/spf0$ cat /usr/local/etc/pi.1000000 | cut -c1-21
314159265358979323846
It would seem that all but the last digit are correct. In which case, we'd want to strip that last digit off and retain the valid component in our explorations.
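As a rough sketch of that idea (not the required solution), you could ask **bc(1)** for a few more digits than you need and then keep only the leading ones. The function name ''pi_digits'' and the ''GUARD'' margin of 10 are assumptions made for this illustration; nothing in the project mandates them.

```shell
#!/bin/bash
# Sketch: print the first N digits of pi (no decimal point) via bc's arctan.
# GUARD is an assumed safety margin; the project does not specify one.
GUARD=10

pi_digits() {
    local count="$1"
    # Request extra digits, since the last few that bc prints may be off.
    # tr removes bc's line-continuation backslashes, newlines, and the '.';
    # cut then keeps only the leading digits we trust.
    echo "scale=$((count + GUARD)); 4*a(1)" | bc -lq |
        tr -d '\\\n.' |
        cut -c "1-${count}"
}
```

Running ''pi_digits 21'' should reproduce the 21 database digits shown above.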
=====Process=====
It is your task to write 2 scripts; one will be responsible for generating a file containing calculated digits of PI (and you must perform the calculations... no skimping on this), and the other will use that produced data to search through the digits of PI in order to find occurrences of patterns. Further details follow:
====script1: pigen====
You are to write a script, called and submitted by the name **pigen** (no extension, although it is to be a bash script, with proper shabang, comments, and solution).
By default, if you run the script by itself, it should calculate the first 100 digits of PI and save them to a file named **pi.100.out** in the current directory (omitting the period '.' that separates the 3 from the rest of the digits):
lab46:~/src/unix/spf0$ ./pigen
lab46:~/src/unix/spf0$ ls
pi.100.out pigen
lab46:~/src/unix/spf0$ cat pi.100.out
3141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117067
lab46:~/src/unix/spf0$
If you provide a command-line argument, it must be a valid numeric value between 1 and 2000; the script will then calculate that many digits of PI and save them in a file named **pi.#.out**, where # is the number of digits requested.
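One possible way to check that argument is sketched here. The function name ''valid_count'' is hypothetical, and accepting leading zeros (such as 0100) is a choice made for this sketch, not something the project states.

```shell
#!/bin/bash
# Sketch: accept only whole numbers from 1 to 2000 as the digit count.
valid_count() {
    # One to four decimal digits only; rejects signs, letters, and blanks.
    [[ "$1" =~ ^[0-9]{1,4}$ ]] || return 1
    # Force base 10 so a leading zero (e.g. 0100) is not read as octal.
    local n=$((10#$1))
    (( n >= 1 && n <= 2000 ))
}
```

A script might then use it as ''valid_count "$1" && outfile="pi.$1.out"''.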
You also need to verify your results against the PI digit database on lab46 (do not copy this file... set a variable to it for easy access), and display the message "MISMATCH" if your calculated/processed results do NOT match the validated PI digits out to the requested number of digits.
Additionally:
* Your script is to only produce a **pi.#.out** file in the current directory.
* ANY temporary files need to be created in (and subsequently removed from) the **/tmp** directory, of a relatively unique name autogenerated by the **mktemp(1)** tool.
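The verification and temporary-file requirements could fit together roughly as follows. This is only a sketch: the function name ''check_digits'', the ''PIDB'' variable name, and the ''pigen.XXXXXX'' template are assumptions for illustration.

```shell
#!/bin/bash
# Sketch: compare computed digits against the reference database.
# PIDB holds the database path from the text; override it for testing.
PIDB="${PIDB:-/usr/local/etc/pi.1000000}"

# check_digits FILE COUNT: print MISMATCH if FILE differs from the first
# COUNT digits of the reference database.
check_digits() {
    local computed="$1" count="$2" expected status=0
    # mktemp creates a uniquely named scratch file under /tmp.
    expected=$(mktemp /tmp/pigen.XXXXXX) || return 2
    cut -c "1-${count}" "$PIDB" > "$expected"
    if ! diff "$computed" "$expected" > /dev/null; then
        echo "MISMATCH"
        status=1
    fi
    # Temporary files must not be left behind.
    rm -f "$expected"
    return "$status"
}
```

Note that **diff(1)** also compares trailing newlines, so both files need to end the same way.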
====script2: pigrep====
Your second script will be called **pigrep** (no extension, although it is to be a bash script, with proper shabang, comments, and solution), and will perform searches on a **pigen**-generated **pi.#.out** file for specified patterns (numerical substrings) within the generated digits.
By default, if you run the script by itself, it should generate the following error (and exit with a non-zero value):
lab46:~/src/unix/spf0$ ./pigrep
ERROR: must specify PATTERN
lab46:~/src/unix/spf0$
It will have online usage information, displayed when the 'help' option is provided anywhere on the **pigrep** command-line:
lab46:~/src/unix/spf0$ ./pigrep help
pigrep - search available pi digits via regex for matches;
must be part of pipeline (send PI digits in via STDIN)
usage: pigrep [OPTION...] PATTERN
note: if MAX variable is set, cap processing at that value
options:
atend - calculations are based on last digit of match
byline - output one value per line (default is space-separated)
drop3 - do not include leading 3 of pi (*3*14) in processing
offset - determine offsets of matches (from start of digits)
help - display this help and exit
lab46:~/src/unix/spf0$
Your script should also take the following arguments (options may be given in any order on the command line):
* required argument:
* **PATTERN**: a string containing the numeric pattern you are looking for.
* for example: '76'
* optional arguments:
* **atend**: by default, offset calculations are based on the first (left-most) digit of the pattern; with this option, compute the offset based on the last (right-most) digit of the pattern
* **atend** is meant to be used in conjunction with **offset**, by itself it does not alter default processing
* **byline**: by default, matches are space-separated. With this option, display one match per line
* **drop3**: do not include the leading 3 of PI in processing
* **offset**: determine offsets of matches from start of digits
* **help**: display usage information
You are also to check for the existence of a MAX variable; if it is set to a valid, positive non-zero decimal (base 10) number, your script will cap the number of results it processes / outputs at that value.
Numerical arguments are to be given as valid, positive non-zero decimal (base 10) values.
Any of these arguments, when validly provided, should adjust the script's processing and output accordingly.
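To illustrate one way of handling arguments that can show up in any position, here is a minimal sketch. The function name ''parse_args'' and the flag variable names are hypothetical. Treating a run of digits as the PATTERN, and printing the error on STDOUT, are assumptions of this sketch rather than stated requirements.

```shell
#!/bin/bash
# Sketch: classify each pigrep argument, regardless of its position.
parse_args() {
    pattern="" atend=0 byline=0 drop3=0 offset=0 help=0
    local arg
    for arg in "$@"; do
        case "$arg" in
            atend)  atend=1  ;;
            byline) byline=1 ;;
            drop3)  drop3=1  ;;
            offset) offset=1 ;;
            help)   help=1   ;;
            *)
                # Assumption: a run of digits is the PATTERN; anything
                # else is invalid and silently ignored.
                if [[ "$arg" =~ ^[0-9]+$ ]]; then
                    pattern="$arg"
                fi
                ;;
        esac
    done
    # help wins over everything else, so a missing PATTERN is only an
    # error when help was not requested.
    if (( help == 0 )) && [ -z "$pattern" ]; then
        echo "ERROR: must specify PATTERN"
        return 1
    fi
    return 0
}
```

After ''parse_args "$@"'' returns, the rest of the script can branch on the flag variables.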
Also to keep in mind:
* Your script is not to produce any files as a result of operating.
* If your solution calls for ANY temporary files, they need to be created in (and subsequently removed from) the **/tmp** directory, and be of a relatively unique name autogenerated by the **mktemp(1)** tool.
Some sample outputs follow:
====Only specifying numeric pattern====
In the event only a pattern is provided, search through the PI data for the provided pattern, displaying each match to STDOUT.
For example:
lab46:~/src/unix/spf0$ cat pi.120.out | ./pigrep '26'
26 26
lab46:~/src/unix/spf0$
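A minimal sketch of this default behavior, assuming a **grep(1)** that supports the **-o** option (GNU and BSD versions do). The function name ''find_matches'' is made up for this example.

```shell
#!/bin/bash
# Sketch: list every occurrence of PATTERN in the digits arriving on STDIN.
find_matches() {
    # grep -o prints each (non-overlapping) match on its own line;
    # paste then joins those lines with single spaces.
    grep -o -- "$1" | paste -s -d ' ' -
}
```

For example, ''cat pi.120.out | find_matches 26'' should print the same ''26 26'' as the sample run.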
====one result per line====
Using the **byline** option, instead of displaying results horizontally, they'll be displayed vertically (one result per line) to STDOUT.
lab46:~/src/unix/spf0$ cat pi.120.out | ./pigrep '26' byline
26
26
lab46:~/src/unix/spf0$
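Since the only difference between the two layouts is the separator, one possible approach (a sketch, with the hypothetical function name ''show_results'') is to pick the delimiter and let **paste(1)** do the rest:

```shell
#!/bin/bash
# Sketch: print results (one per line on STDIN) in the requested layout.
show_results() {
    # Default layout is space-separated; byline keeps one value per line.
    local sep=' '
    [ "$1" = byline ] && sep='\n'
    paste -s -d "$sep" -
}
```

Here ''show_results byline'' passes the lines through unchanged, while ''show_results'' with no argument folds them onto one line.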
====compute offsets from start of digits====
Instead of displaying the numeric matches (which, aside from counting, are of dubious value), the **offset** argument will display how many digits from the start of the PI digits each match resides.
lab46:~/src/unix/spf0$ cat pi.120.out | ./pigrep '26' offset
7 22
lab46:~/src/unix/spf0$
With an example like this, we should easily be able to verify its correctness:
lab46:~/src/unix/spf0$ cat pi.120.out
314159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664
lab46:~/src/unix/spf0$ cat pi.120.out | cut -c7-8,22-23
2626
lab46:~/src/unix/spf0$
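One way to obtain such offsets is sketched below. It relies on GNU **grep(1)** behavior: with **-o** and **-b** together, each match is prefixed with its 0-based byte offset. Other grep implementations may differ, and the function name ''match_offsets'' is hypothetical.

```shell
#!/bin/bash
# Sketch: report where each match starts, counting the first digit as 1.
match_offsets() {
    local zero_based
    # grep -ob emits lines such as "6:26"; cut keeps the offset, and the
    # loop converts it from 0-based to 1-based.
    grep -o -b -- "$1" | cut -d: -f1 |
        while read -r zero_based; do
            echo $((zero_based + 1))
        done
}
```

The offsets come out one per line, so they can be formatted the same way as ordinary matches.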
====mixing options (offset, byline)====
And we can mix options (in any order):
lab46:~/src/unix/spf0$ cat pi.120.out | ./pigrep byline '26' offset
7
22
lab46:~/src/unix/spf0$
====Using atend with offset====
The **atend** option, issued by itself, has no impact on operations:
lab46:~/src/unix/spf0$ cat pi.120.out | ./pigrep atend '26'
26 26
lab46:~/src/unix/spf0$
So we need to combine it with the **offset** option to make an impact:
lab46:~/src/unix/spf0$ cat pi.120.out | ./pigrep atend offset '26'
8 23
lab46:~/src/unix/spf0$
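The arithmetic behind **atend** is small: a match's last digit sits (length - 1) places after its first. As a sketch, assuming input lines of the form ''OFFSET:MATCH'' with a 0-based offset (the format GNU ''grep -o -b'' produces), a hypothetical ''end_offsets'' filter could look like this:

```shell
#!/bin/bash
# Sketch: convert "OFFSET:MATCH" lines (0-based offsets) into the 1-based
# position of each match's last digit.
end_offsets() {
    local zero_based match
    while IFS=: read -r zero_based match; do
        # 1-based start is zero_based + 1; adding (length - 1) reaches
        # the last digit, which simplifies to zero_based + length.
        echo $((zero_based + ${#match}))
    done
}
```

Feeding it ''6:26'' and ''21:26'' gives 8 and 23, in line with the sample output above.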
====dropping the leading 3====
With the **drop3** option, we merely exclude the leading 3 of pi from our calculations. This should result in everything being "off by one" from previous outputs (with any combination of arguments):
lab46:~/src/unix/spf0$ cat pi.120.out | ./pigrep atend offset '26' drop3
7 22
lab46:~/src/unix/spf0$
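One simple way to get this effect (a sketch, not the only option) is to remove the leading 3 from the input before any searching happens, so that every later calculation shifts by one on its own. The function name ''drop_leading_three'' is made up for this example.

```shell
#!/bin/bash
# Sketch: strip the leading 3 from the digits arriving on STDIN.
drop_leading_three() {
    # Only a 3 at the very start of the line is removed.
    sed 's/^3//'
}
```

Adjusting the computed offsets afterwards instead would work too; filtering the input first just keeps the rest of the logic unchanged.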
====Using MAX to limit processing====
Sometimes, a pattern may produce an untenable quantity of results. We may wish to restrict it, and can do so with the aid of setting the **MAX** variable.
Take this request, which produces numerous results:
lab46:~/src/unix/spf0$ cat pi.120.out | ./pigrep '2'
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
lab46:~/src/unix/spf0$
We can cut that down to a smaller quantity by setting MAX accordingly (let's say, limit to 4 matches):
lab46:~/src/unix/spf0$ cat pi.120.out | MAX=4 ./pigrep '2'
2 2 2 2
lab46:~/src/unix/spf0$
You could also **export** MAX, mitigating the need to manually specify it for each run.
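A sketch of how the cap might be applied to a stream of results (one per line). The function name ''limit_results'' is hypothetical, and rejecting values with leading zeros is a simplification made here, not a stated requirement.

```shell
#!/bin/bash
# Sketch: pass through at most MAX results (one per line on STDIN).
limit_results() {
    # Only honor MAX when it is a positive decimal integer.
    if [[ "${MAX:-}" =~ ^[1-9][0-9]*$ ]]; then
        head -n "$MAX"
    else
        # MAX unset or invalid: no cap.
        cat
    fi
}
```

Because the check reads the environment, both ''MAX=4 ./pigrep '2''' and an exported MAX would be picked up the same way.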
=====Specifications=====
Evaluation will be based on correctness of values as well as on formatting/spacing.
Your output should match the sample runs above in both layout and spacing. Additional requirements:
* groupwork! This project is specifically designed to be performed in groups.
* I want to see groups ranging from a minimum of 2 people to a maximum of 4 people. Any fewer or any more are not allowed.
* furthermore, groups of 2 or 3 cannot have overall class ranks (rankings as of today: 03/14/2018) within 3 ranks of each other (ie ranks 8 and 9, or 8 and 10, or 8 and 11 are not valid combinations for 2-person groups).
* ranks 1 and 2 are specifically and uniquely exempt from this restriction (ie they can be part of the same group, if they were to choose to be a group of 2). Otherwise, same rank spacing still applies.
* groups of 4 cannot have rank subsets that span 3 consecutive ranks (ie ranks 6, 7, 8, and 11)
* EVERY group member needs to be identified in your scripts (in a comment listing everyone).
* EVERY group member needs to be familiar with the end products
* EVERY group member needs to submit a copy of the scripts (they should all be identical- I will check)
* in the event you feel other group members have not lived up to their obligations, you may alter your scripts accordingly (be it not giving them the final product, or enhancing functionality to bring it in compliance, or leaving comments indicating who wasn't pulling their own weight).
* EVERY group member needs to do an approximately equal portion of the work. Slackers should be called out (I'll be keeping an eye out as well) and they will lose credit.
* This project is designed to be done DURING class time. It may offer up some useful insights into other projects (that you're doing on your own).
* You need to check your arguments to ensure they are present and valid.
* all mentioned arguments implementing their indicated functionality (atend, byline, drop3, offset, help)
* help takes priority over other options
* atend has no appreciable impact unless offset is also specified
* any invalid arguments (nonsensical value in place of starting value, invalid base specification, etc.) should be silently dropped/ignored.
* if lacking any bases to display, silently exit
* your script needs to commence with a proper **shabang** to run using bash; your script needs to end with an "**exit 0**" at the very end
* comments and indentation are required and necessary
* comments should explain how or why you are doing something
* indentation should be consistent throughout the script (no mixing of different indentation units; no mixing of tabs and spaces)
* indentation is to be no less than 3 on-screen spaces (I recommend tabstops of 4).
* continuing with our shell scripting, your scripts will need to employ in a core/central way (note that both scripts may not each need all of these, but across both scripts, you should make sure that each of these concepts is utilized):
* variables
* command-line argument parsing and usage
* command-line pipelines
* command expansions
* regular expressions
* conditional/selection structures
* loops
* your logic needs to:
* flow (one thing leads into the next, as best as possible)
* make sense within the given context
* avoid redundancy
* be understood by you, and everyone in your group (no grabbing snippets that seem to "work" from the internet)
* if you gain inspiration from some external resource, please cite it
* comments are a great way of demonstrating understanding (if you explain the why and how effectively, and it isn't in violation of other aspects, I'll know you are in control of things)
To be sure, I'll be checking to make sure your solution follows the spirit of what this project is about (that you implement functional, flowing logic utilizing the tools and concepts we've learned, in an application that helps demonstrate your comprehension). Don't try to weasel your way out of this or cut corners. This is an opportunity to further solidify your proficiency with everything.
=====Spirit of project=====
The spirit of the project embodies many aspects we've been focusing on throughout the semester:
* recognizing patterns to employ effective solutions in problem solving
* utilizing concepts and tools covered
* demonstrating comprehension of concepts, tools, and problems
* employing concepts in knowledgeable and thoughtful manner
* following instructions
* implementing to specifications
* utilizing creativity
* being able to control solution via consistent, clear, and organized presentation
* approximately equal involvement and impact of all group members on the brainstorming and development of the finished scripts.
Basically: I want your solution to be the result of an honest, genuine brainstorming process where you have figured out a path to solving the problem, you have dabbled and experimented and figured things out, and you can command the concepts and tools with a fluency enabling you to pull off such a feat. Your solution should demonstrate the real learning that took place and experience gained.
Cutting corners, avoiding work, skimping on functionality, and cheating (by getting others to do work for you or by finding pre-packaged "answers" on the internet) all violate the spirit of the project, for they undermine your ability to pull off the task on your own.
=====Identifying shortcomings=====
I would also like it if you provided an inventory of what functionality is lacking or out of spec when you submit the project. The better you can describe your deviations from stated requirements, the more forgiving I may be during evaluation (versus trying to hide the shortcomings and hoping I do not discover them).
The more fluent you are in describing your shortcomings on accomplishing the project (ie "I didn't know how to do this" is far from fluent), the better I will be able to gauge your understanding on a particular aspect.
This can be in the form of comments in your script, or even a separate file submitted at time of submission (if a file, be sure to make mention of it in your script so I can be sure to look for it).
=====Submission=====
By successfully performing this project, you should have a fully functioning set of scripts by the names **pigen** and **pigrep**, which are all you need to submit for project completion (no steps file, as the scripts you wrote ARE your "steps" file).
To submit this project to me using the **submit** tool, run the following command at your lab46 prompt:
$ submit unix spf0 pigen pigrep
Submitting unix project "spf0":
-> pigen(OK)
-> pigrep(OK)
SUCCESSFULLY SUBMITTED
You should get some sort of confirmation indicating successful submission if all went according to plan. If not, check for typos and/or locational mismatches.
I'll be looking for the following:
26:spf0:final tally of results (26/26)
*:spf0:scripts effectively utilize variables in operations [2/2]
*:spf0:scripts effectively utilize command-line arguments [2/2]
*:spf0:scripts effectively utilize command expansions [2/2]
*:spf0:scripts effectively utilize regular expressions [2/2]
*:spf0:scripts effectively utilize selection structures [2/2]
*:spf0:scripts effectively utilize looping structures [2/2]
*:spf0:scripts are proper bash scripts with shabang and exit [2/2]
*:spf0:pigrep displays values in proper orientation [2/2]
*:spf0:pigrep accurately displays values as requested [2/2]
*:spf0:scripts properly manage input violations [2/2]
*:spf0:pigen operates according to specifications [2/2]
*:spf0:pigrep operates according to specifications [2/2]
*:spf0:script logic is organized and easy to read [2/2]
Additionally:
* Solutions not abiding by spirit of project will be subject to a 25% overall deduction
* Solutions not utilizing descriptive why and how comments will be subject to a 25% overall deduction
* Solutions not utilizing indentation to promote scope and clarity will be subject to a 25% overall deduction
* Solutions not done in a valid group will be subject to a 25% overall deduction