Corning Community College

CSCS1730 UNIX/Linux Fundamentals

Project: SCRIPTING PI FUN (spf0)

Errata

any bugfixes or project updates will be posted here

Toolbox

In addition to the tools you're already familiar with, it is recommended you check out the following tools for possible application in this project (you may not need to use them in your solution, but they definitely offer up functionality that some solutions can make use of):

basename(1)
bc(1)
cut(1)
diff(1)
grep(1)
mktemp(1)
sed(1)
tr(1)
wc(1)

Objective

To create two scripts: one that calculates digits of PI, and one that searches those digits for substring patterns.

Background

PI is an important mathematical constant. It is used in various applications, and we often find ourselves making use of it.

But what about generating it? That's part of what this project seeks to have us explore.

Once we can successfully generate PI out to a variable number of digits, we will also write a script that lets us search those digits for numeric substring patterns.

Calculating PI

There are various methods for calculating PI, which take varying amounts of processing power and time to perform, and have different levels of accuracy.

We will want to come up with a process that is accurate, yet perhaps also relatively simple to express.

You may want to check out the method using arctan to accomplish this:

lab46:~/src/unix/spf0$ echo "4*a(1)" | bc -lq
3.14159265358979323844

Here we see that, with its math library loaded (the -l option), bc(1) defaults to a scale of 20 digits after the decimal point.

With that, if we check a database of PI digits (which I have conveniently stored on lab46, in the /usr/local/etc/pi.1000000 file), we can compare our results against the verified, correct results (note that the database omits the period '.' separating the 3 from the 14…, be sure to compensate for this):

lab46:~/src/unix/spf0$ cat /usr/local/etc/pi.1000000 | cut -c1-21
314159265358979323846

It would seem that every digit but the last is correct (bc(1) ends in 44, the database in 46). In that case, we'd want to strip that last digit off and retain only the valid digits in our explorations.
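One way to act on this is to ask bc(1) for more digits than we need and then keep only the leading ones. The sketch below is a starting point, not the required approach: the function name calc_pi_digits and the margin of 10 spare digits are assumptions made for illustration.

```shell
# calc_pi_digits N: print the first N digits of PI (leading 3 included,
# no period). The name and the guard margin are illustrative assumptions.
calc_pi_digits() {
    local count="$1"
    local guard=10

    # scale counts digits AFTER the period, so count + guard leaves
    # spare digits beyond the ones we keep.
    echo "scale=$((count + guard)); 4*a(1)" | bc -lq |
        tr -d '.\\\n' |      # drop the period and bc's backslash-newline wraps
        cut -c1-"$count"     # keep only the leading digits we trust
}
```

For example, calc_pi_digits 20 should print 31415926535897932384, which agrees with the first 20 characters of the database excerpt shown above. The tr(1) step matters once the result grows past one line, because bc(1) wraps long numbers with a backslash and a newline.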

Process

It is your task to write two scripts: one will be responsible for generating a file containing calculated digits of PI (and you must perform the calculations… no skimping on this), and the other will use that produced data to search through the digits of PI in order to find occurrences of patterns. Further details follow:

script1: pigen

You are to write a script, called and submitted by the name pigen (no extension, although it is to be a bash script, with proper shabang, comments, and solution).

By default, if you run the script with no arguments, it should calculate the first 100 digits of PI and store them in a file named pi.100.out in the current directory (lacking the period '.' that separates the 3 from the rest of the digits):

lab46:~/src/unix/spf0$ ./pigen
lab46:~/src/unix/spf0$ ls
pi.100.out  pigen
lab46:~/src/unix/spf0$ cat pi.100.out
3141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117067
lab46:~/src/unix/spf0$

If you provide a command-line argument, it must be a valid numeric value between 1 and 2000; the script will calculate that many digits of PI and save them in a file named pi.#.out, where # is the number of digits requested.
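The argument check might be sketched as follows. The function name digit_count is hypothetical, and returning a non-zero status on bad input is an assumption, since the text above does not say how pigen should react to an invalid value:

```shell
# digit_count [ARG]: print the number of digits to generate.
# No argument means the default of 100; otherwise ARG must be a whole
# number from 1 to 2000. Failing with status 1 is an assumed reaction.
digit_count() {
    local arg="${1:-100}"

    # Regular expression: 1 to 4 digits, no leading zero, nothing else.
    if ! printf '%s\n' "$arg" | grep -Eq '^[1-9][0-9]{0,3}$'; then
        return 1
    fi
    if [ "$arg" -gt 2000 ]; then
        return 1
    fi
    echo "$arg"
}
```

The result can then name the output file, as in outfile="pi.$(digit_count "$1").out". Rejecting a leading zero up front also keeps a value such as 08 from being misread as octal if it later lands in bash arithmetic.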

You also need to verify your results against the PI digit database on lab46 (do not copy this file… set a variable to its path for easy access), and display the message “MISMATCH” if your calculated/processed results do NOT match the validated PI computation out to the indicated number of digits.
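A minimal sketch of that comparison follows. The names PIDB and verify_digits are illustrative, and the optional third argument exists only so the check can be tried against a small stand-in file:

```shell
# PIDB holds the path of the reference digits named in this project.
PIDB=/usr/local/etc/pi.1000000

# verify_digits DIGITS COUNT [DBFILE]: print MISMATCH when DIGITS differs
# from the first COUNT characters of the reference file.
verify_digits() {
    local digits="$1" count="$2" db="${3:-$PIDB}"
    local expected

    # The reference has no period, so its first COUNT characters line
    # up directly with our period-free digits.
    expected=$(cut -c1-"$count" "$db")
    if [ "$digits" != "$expected" ]; then
        echo "MISMATCH"
    fi
}
```

Keeping the path in a variable means the script reads the database in place rather than working from a copy.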

Additionally:

Your script is to only produce a pi.#.out file in the current directory.
- ANY temporary files need to be created in (and subsequently removed from) the /tmp directory, of a relatively unique name autogenerated by the mktemp(1) tool.
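Temporary file handling could look something like this. The function name, the "pigen." prefix, and the habit of reporting the file name on STDERR are all assumptions made for illustration; the part worth noticing is the pairing of mktemp(1) with a trap that removes the file:

```shell
# strip_period_via_tmp VALUE: print VALUE without its period, staging it
# in a scratch file under /tmp. Name and prefix are illustrative only.
strip_period_via_tmp() (
    # The body runs in a subshell, so the trap below stays local to it.
    tmp=$(mktemp /tmp/pigen.XXXXXX) || exit 1
    trap 'rm -f "$tmp"' EXIT      # remove the file however we leave

    printf '%s\n' "$1" > "$tmp"
    tr -d '.' < "$tmp"
    echo "$tmp" >&2               # report the name, for checking only
)
```

In a full script the trap would more likely sit near the top of pigen itself, so that the scratch file is removed even when the script exits early.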

script2: pigrep

Your second script will be called pigrep (no extension, although it is to be a bash script, with proper shabang, comments, and solution), and will perform searches on a pigen-generated pi.#.out file for specified patterns (numerical substrings) within the generated digits.

By default, if you run the script by itself, it should generate the following error (and exit with a non-zero value):

lab46:~/src/unix/spf0$ ./pigrep
ERROR: must specify PATTERN
lab46:~/src/unix/spf0$

It will have online usage information, displayed when the 'help' option is provided anywhere on the pigrep command-line:

lab46:~/src/unix/spf0$ ./pigrep help

 pigrep - search available pi digits via regex for matches;
          must be part of pipeline (send PI digits in via STDIN)

   usage: pigrep [OPTION...] PATTERN

    note: if MAX variable is set, cap processing at that value

 options:

  atend - calculations are based on last digit of match
 byline - output one value per line (default is space-separated)
  drop3 - do not include leading 3 of pi (*3*14) in processing
 offset - determine offsets of matches (from start of digits)
   help - display this help and exit

lab46:~/src/unix/spf0$

Your script should also take the following arguments (they may appear in any order on the command-line):

required argument:
- PATTERN: a string containing the numeric pattern you are looking for.
  - for example: '76'
optional arguments:
- atend: by default, offset calculations are based on the first (left-most) digit of the pattern; with this option, compute the offset based on the last (right-most) digit of the pattern
  - atend is meant to be used in conjunction with offset, by itself it does not alter default processing
- byline: by default, matches are space-separated. With this option, display one match per line
- drop3: do not include the leading 3 of PI in processing
- offset: determine offsets of matches from start of digits
- help: display usage information

You are also to check for the existence of a MAX variable; if it is set to a valid, positive, non-zero decimal (base 10) number, your script is to cap the number of results it processes and outputs at that value.

Numerical arguments are to be given as valid, positive non-zero decimal (base 10) values.

Any of these arguments, when validly provided, should adjust the script's processing and output accordingly.
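The argument handling described above might be sketched with a loop and a case statement. Everything named here (parse_args and the flag variables) is hypothetical. The sketch also assumes the pattern is made up of digits only, that the last pattern given wins, and that the error message belongs on STDERR, none of which is spelled out above:

```shell
# parse_args ARG...: classify pigrep-style arguments given in any order.
# Sets PATTERN plus the flags ATEND, BYLINE, DROP3, OFFSET and HELP
# (illustrative names). Returns 1 when no pattern was supplied.
parse_args() {
    PATTERN="" ATEND=0 BYLINE=0 DROP3=0 OFFSET=0 HELP=0
    local arg

    for arg in "$@"; do
        case "$arg" in
            atend)  ATEND=1  ;;
            byline) BYLINE=1 ;;
            drop3)  DROP3=1  ;;
            offset) OFFSET=1 ;;
            help)   HELP=1   ;;
            *)
                # Keep only all-digit words as the pattern; anything
                # else is silently ignored.
                if printf '%s\n' "$arg" | grep -Eq '^[0-9]+$'; then
                    PATTERN="$arg"
                fi
                ;;
        esac
    done

    # help takes priority, even over a missing pattern
    if [ "$HELP" -eq 1 ]; then
        return 0
    fi
    if [ -z "$PATTERN" ]; then
        echo "ERROR: must specify PATTERN" >&2
        return 1
    fi
}
```

A caller could then run parse_args "$@" || exit 1 and branch on the flags afterward, which keeps the decision making in one place.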

Also to keep in mind:

Your script is not to produce any files as a result of operating.
- If your solution calls for ANY temporary files, they need to be created in (and subsequently removed from) the /tmp directory, and be of a relatively unique name autogenerated by the mktemp(1) tool.

Some sample outputs follow:

Only specifying numeric pattern

In the event only a pattern is provided, search through the PI data for the provided pattern, displaying each match to STDOUT.

For example:

lab46:~/src/unix/spf0$ cat pi.120.out | ./pigrep '26'
26 26 
lab46:~/src/unix/spf0$

one result per line

Using the byline option, instead of displaying results horizontally, they'll be displayed vertically (one result per line) to STDOUT.

lab46:~/src/unix/spf0$ cat pi.120.out | ./pigrep '26' byline
26 
26 
lab46:~/src/unix/spf0$
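One possible way to produce both orientations is to let grep(1) list the matches and hand the formatting to a small loop. The function names are hypothetical, and the space printed after every value simply mirrors the samples as they appear here:

```shell
# find_matches PATTERN: read digits on STDIN and print every match of
# PATTERN, one per line (grep -o reports non-overlapping matches only).
find_matches() {
    grep -o -e "$1"
}

# show_results [byline]: read values on STDIN, one per line, and print
# them on a single line, or one per line when "byline" is given.
show_results() {
    local value count=0
    while read -r value; do
        if [ "${1:-}" = "byline" ]; then
            printf '%s \n' "$value"
        else
            printf '%s ' "$value"
        fi
        count=$((count + 1))
    done
    # Finish the single output line, but only if something was printed.
    if [ "${1:-}" != "byline" ] && [ "$count" -gt 0 ]; then
        echo
    fi
    return 0
}
```

Used as cat pi.120.out | find_matches 26 | show_results, this should reproduce the "26 26 " line from the first sample. What to print when nothing matches is left open here, as the samples do not cover it.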

compute offsets from start of digits

Instead of displaying the numeric matches (which, aside from counting, are of dubious value), the offset argument displays how many digits from the start of the PI digits each match resides.

lab46:~/src/unix/spf0$ cat pi.120.out | ./pigrep '26' offset
7 22 
lab46:~/src/unix/spf0$

With an example like this, we should easily be able to verify its correctness:

lab46:~/src/unix/spf0$ cat pi.120.out
314159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214808651328230664
lab46:~/src/unix/spf0$ cat pi.120.out | cut -c7-8,22-23
2626
lab46:~/src/unix/spf0$
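The offsets themselves can come straight from grep(1). This sketch assumes GNU grep, where combining -o with -b reports the 0-based byte offset of each match; the function name match_offsets is illustrative:

```shell
# match_offsets PATTERN: read digits on STDIN, print the 1-based position
# of the first digit of each match, one per line.
match_offsets() {
    local start
    # grep -o -b prints lines such as "6:26" (0-based offset, then the
    # match); keep the number and add 1 to count from the first digit.
    grep -o -b -e "$1" | cut -d: -f1 |
        while read -r start; do
            echo $((start + 1))
        done
}
```

Fed the digits of pi.120.out with the pattern 26, this should print 7 and 22, the same positions the cut(1) check above confirms.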

mixing options (offset, byline)

And we can mix options (in any order):

lab46:~/src/unix/spf0$ cat pi.120.out | ./pigrep byline '26' offset 
7 
22 
lab46:~/src/unix/spf0$

Using atend with offset

The atend option, issued by itself, has no impact on operations:

lab46:~/src/unix/spf0$ cat pi.120.out | ./pigrep atend '26'        
26 26 
lab46:~/src/unix/spf0$

So we need to combine it with the offset option to make an impact:

lab46:~/src/unix/spf0$ cat pi.120.out | ./pigrep atend offset '26' 
8 23 
lab46:~/src/unix/spf0$
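The arithmetic behind atend is small: the 0-based start of a match plus the length of the match gives the 1-based position of its last digit. A sketch, again assuming GNU grep and a hypothetical function name:

```shell
# end_offsets PATTERN: read digits on STDIN, print the 1-based position
# of the LAST digit of each match, one per line.
end_offsets() {
    local start match
    # Each grep line looks like "6:26"; split it at the colon.
    grep -o -b -e "$1" |
        while IFS=: read -r start match; do
            # 0-based start + match length = 1-based last digit
            # (6 + 2 = 8 for the first '26').
            echo $((start + ${#match}))
        done
}
```

Measuring the matched text rather than the pattern keeps the result right even if the pattern is a regular expression that matches strings of differing lengths.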

dropping the leading 3

With the drop3 option, we merely exclude the leading 3 of pi from our calculations. This should result in everything being “off by one” from previous outputs (with any combination of arguments):

lab46:~/src/unix/spf0$ cat pi.120.out | ./pigrep atend offset '26' drop3
7 22 
lab46:~/src/unix/spf0$
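One simple way to get this behavior is to remove the leading 3 before any searching happens, so that every later position shifts down by one on its own. The function names below are illustrative, and the offset helper assumes GNU grep:

```shell
# drop_leading_3: copy STDIN to STDOUT without the leading 3 of PI.
drop_leading_3() {
    sed '1s/^3//'
}

# end_offsets PATTERN: 1-based position of the last digit of each match
# (0-based start reported by grep -b, plus the length of the match).
end_offsets() {
    local start match
    grep -o -b -e "$1" |
        while IFS=: read -r start match; do
            echo $((start + ${#match}))
        done
}
```

With the digits of pi.120.out, drop_leading_3 followed by end_offsets 26 should yield 7 and 22, matching the sample above. A side effect of this choice is that a pattern beginning at the leading 3 (such as 31) no longer matches there, which seems in keeping with excluding that digit from processing.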

Using MAX to limit processing

Sometimes, a pattern may produce an untenable quantity of results. We may wish to restrict it, and can do so with the aid of setting the MAX variable.

Take this request, which produces numerous results:

lab46:~/src/unix/spf0$ cat pi.120.out | ./pigrep '2'                    
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 
lab46:~/src/unix/spf0$

We can cut that down to a smaller quantity by setting MAX accordingly (let's say, limit to 4 matches):

lab46:~/src/unix/spf0$ cat pi.120.out | MAX=4 ./pigrep '2'
2 2 2 2 
lab46:~/src/unix/spf0$

You could also export MAX, removing the need to manually specify it for each run.
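Honoring MAX could be as simple as validating it with a regular expression and passing the matches through head(1). The function name cap_results is hypothetical, and treating a value with a leading zero (such as 04) as invalid is an assumption:

```shell
# cap_results: pass through at most MAX lines of STDIN when MAX holds a
# positive decimal number; otherwise pass everything through unchanged.
cap_results() {
    if printf '%s\n' "${MAX:-}" | grep -Eq '^[1-9][0-9]*$'; then
        head -n "$MAX"
    else
        cat
    fi
}
```

Placed between the matching step and the output step of a pipeline, this limits the results without the rest of the script needing to know whether MAX was set.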

Specifications

Evaluation will be based on correctness of values as well as on formatting/spacing.

You'll notice that everything lines up and is positioned similarly:

groupwork! This project is specifically designed to be performed in groups.
- I want to see groups ranging from a minimum of 2 people to a maximum of 4 people. Groups smaller or larger than that are not allowed.
- EVERY group member needs to be identified in your scripts (in a comment listing everyone).
- EVERY group member needs to be familiar with the end products
- EVERY group member needs to submit a copy of the scripts (they should all be identical; I will check)
  - in the event you feel other group members have not lived up to their obligations, you may alter your scripts accordingly (be it not giving them the final product, or enhancing functionality to bring it in compliance, or leaving comments indicating who wasn't pulling their own weight).
- EVERY group member needs to do an approximately equal portion of the work. Slackers should be called out (I'll be keeping an eye out as well) and they will lose credit.
- This project is designed to be done DURING class time. It may offer up some useful insights into other projects (that you're doing on your own).
You need to check your arguments to ensure they are present and valid.
- all mentioned arguments implementing their indicated functionality (atend, byline, drop3, offset, help)
  - help takes priority over other options
  - atend has no appreciable impact unless offset is also specified
- any invalid arguments should be silently dropped/ignored.
your script needs to commence with a proper shabang to run using bash; your script needs to end with an “exit 0” at the very end
comments and indentation are required and necessary
- comments should explain how or why you are doing something
- indentation should be consistent throughout the script (no mixing of different indentation units; no mixing of tabs and spaces)
- indentation is to be no less than 3 on-screen spaces (I recommend tabstops of 4).
continuing with our shell scripting, your scripts will need to employ the following in a core/central way (note that each script may not need all of these, but across both scripts, you should make sure that each of these concepts is utilized):
- variables
- command-line argument parsing and usage
- command-line pipelines
- command expansions
- regular expressions
- conditional/selection structures
- loops
your logic needs to:
- flow (one thing leads into the next, as best as possible)
- make sense within the given context
- avoid redundancy
- be understood by you, and everyone in your group (no grabbing snippets that seem to “work” from the internet)
  - if you gain inspiration from some external resource, please cite it
  - comments are a great way of demonstrating understanding (if you explain the why and how effectively, and it isn't in violation of other aspects, I'll know you are in control of things)

To be sure, I'll be checking to make sure your solution follows the spirit of what this project is about (that you implement functional, flowing logic utilizing the tools and concepts we've learned, in an application that helps demonstrate your comprehension). Don't try to weasel your way out of this or cut corners. This is an opportunity to further solidify your proficiency with everything.

Spirit of project

The spirit of the project embodies many aspects we've been focusing on throughout the semester:

recognizing patterns to employ effective solutions in problem solving
utilizing concepts and tools covered
demonstrating comprehension of concepts, tools, and problems
employing concepts in knowledgeable and thoughtful manner
following instructions
implementing to specifications
utilizing creativity
being able to control solution via consistent, clear, and organized presentation
approximately equal involvement and impact of all group members on the brainstorming and development of the finished scripts.

Basically: I want your solution to be the result of an honest, genuine brainstorming process where you have figured out a path to solving the problem, you have dabbled and experimented and figured things out, and you can command the concepts and tools with a fluency enabling you to pull off such a feat. Your solution should demonstrate the real learning that took place and experience gained.

Cutting corners, avoiding work, skimping on functionality, or cheating (by getting others to do work for you or finding pre-packaged “answers” on the internet) violates the spirit of the project, for it undermines any demonstration of your ability to pull off the task on your own.

Identifying shortcomings

I would also like it if you provided an inventory of what functionality is lacking or out of spec when you submit the project. The better you can describe your deviations from stated requirements, the more forgiving I may be during evaluation (versus trying to hide the shortcomings and hoping I do not discover them).

The more fluent you are in describing your shortcomings in accomplishing the project (i.e., “I didn't know how to do this” is far from fluent), the better I will be able to gauge your understanding of a particular aspect.

This can be in the form of comments in your script, or even a separate file submitted at time of submission (if a file, be sure to make mention of it in your script so I can be sure to look for it).

Submission

By successfully performing this project, you should have a fully functioning set of scripts by the names pigen and pigrep, which are all you need to submit for project completion (no steps file, as the scripts you wrote ARE your “steps” file).

To submit this project to me using the submit tool, run the following command at your lab46 prompt:

$ submit unix spf0 pigen pigrep
Submitting unix project "spf0":
    -> pigen(OK)
    -> pigrep(OK)

SUCCESSFULLY SUBMITTED

You should get some sort of confirmation indicating successful submission if all went according to plan. If not, check for typos and/or locational mismatches.

I'll be looking for the following:

52:spf0:final tally of results (52/52)
*:spf0:scripts effectively utilize variables in operations [4/4]
*:spf0:scripts effectively utilize command-line arguments [4/4]
*:spf0:scripts effectively utilize command expansions [4/4]
*:spf0:scripts effectively utilize regular expressions [4/4]
*:spf0:scripts effectively utilize selection structures [4/4]
*:spf0:scripts effectively utilize looping structures [4/4]
*:spf0:scripts are proper bash scripts with shabang and exit [4/4]
*:spf0:pigrep displays values in proper orientation [4/4]
*:spf0:pigrep accurately displays values as requested [4/4]
*:spf0:scripts properly manage input violations [4/4]
*:spf0:pigen operates according to specifications [4/4]
*:spf0:pigrep operates according to specifications [4/4]
*:spf0:script logic is organized and easy to read [4/4]