\\
Corning Community College
\\
UNIX/Linux Fundamentals
\\
\\
Lab 0x9: Introduction to Pattern Matching
\\
\\
~~TOC~~
======Objective======
To explore patterns, their creation, and identifying both included and excluded members thereof.
======Background======
As defined by dictionary.com, a pattern is:
A consistent, characteristic form, style, or method, as a composite of traits or features characteristic of an individual or group.
We deal with patterns everyday. As subtle as the dotted yellow lines on the road to creating secret ciphers, the ability to create and detect patterns is an important analytical skill that will greatly help you in almost any field of study (very much so for Computer Science and related technical fields).
Through your traditional educations you've all been exposed to simple pattern matching.
For example, take the following pattern:
. . ., -8, -7, -6, -5, . . .
Given this much resolution to the pattern, we can make some assumptions as to its characteristics. If we wanted to generate the next 3 numbers in the sequence, it wouldn't be that far fetched to say that: **-4**, **-3**, **-2**
are the next three valid numbers in this pattern. Why? Because we've looked at the existing values and have determined a relationship between them. **-8** is one away from **-7**, **-7** is one away from **-6**, and so on. The rate of change between each number is constant.
Analyzing a pattern in this way can greatly help in creating continuations.
^ 1. ^|Continue the following patterns (at least 4 additional iterations), list their rates of change, and any other observations (if it is a known pattern, identify what it is):|
| ^ a.|2, 4, 6, 8, 10, 12, . . .|
|:::^ b.|What is the rate of change between the pattern elements?|
|:::^ c.|Does the pattern conform to any recognizable named pattern? If so, then which one(s)?|
|:::^ d.|2, 4, 8, 16, 32, . . .|
|:::^ e.|What is the rate of change between the pattern elements?|
|:::^ f.|Does the pattern conform to any recognizable named pattern? If so, then which one(s)?|
|:::^ g.|0, 12, 24, 36, . . .|
|:::^ h.|What is the rate of change between the pattern elements?|
|:::^ i.|Does the pattern conform to any recognizable named pattern? If so, then which one(s)?|
|:::^ j.|Z, Y, X, W, V, . . .|
|:::^ k.|What is the rate of change between the pattern elements?|
|:::^ l.|Does the pattern conform to any recognizable named pattern? If so, then which one(s)?|
Not so hard, right? Your mind has been trained to pick up simple patterns very easily. So instead of just analyzing existing patterns, we're also going to exercise the brain a bit and make our own patterns.
^ 2. ^|For each of the following pattern descriptions, generate 6 sequential elements, and list their rate of change (to the best of your ability):|
| ^ a.|Odd numbers|
|:::^ b.|What is the rate of change between each consecutive odd number?|
|:::^ c.|Natural numbers|
|:::^ d.|What is the rate of change between each consecutive natural number?|
|:::^ e.|Prime numbers|
|:::^ f.|What is the rate of change between each consecutive prime number?|
|:::^ g.|Your own pattern (be sure to identify it)|
|:::^ h.|What is the rate of change between each consecutive number in your pattern?|
=====Exclusive members of a pattern=====
So far we've been studying valid members of a pattern.
For example, in the pattern containing all even numbers, 4 is a valid number for that pattern, but a number like 37, or the letter G, however, is not.
Computer Scientists and Mathematicians take this whole pattern business a lot farther. In fact, they refer to such things as "languages", and have a special syntax for defining a language, and proving whether or not a given pattern or sequence is a member of that language. Going more into depth on such topics is more suitable for a "Theory of Computation" course, so we will just focus on the stuff useful for us in UNIX.
So, back to locating non-members of a pattern:
^ 3. ^|Given the following patterns, generate 3 non-members, and explain why they do not belong:|
| ^ a.|3, 6, 9, 12, 15, . . .|
|:::^ b.|56, 48, 40, 32, . . .|
|:::^ c.|a, e, i, o, u|
|:::^ d.|bob, wow, mom, radar, . . .|
Patterns can be finite or infinite in nature, all depending upon what you're after. Examples of infinite patterns include the positive even numbers. Finite patterns are things like the lowercase vowels of the alphabet.
Elements of a pattern can be of a single character or digit, or can be an entire string or substring.
Sometimes a pattern is much deeper than it appears on the surface- where you must understand the meaning (or semantics) or significance of its members in order to isolate non-members.
^ 4. ^|Given the following patterns, find the element(s) that doesn't fit, and explain why. Come up with at least two answers for each:|
| ^ a.|duck, duck, duck, goose; What doesn't fit, and why?|
|:::^ b.|duck, duck, duck, goose; What doesn't fit, and why?|
|:::^ c.|how, now, brown, cow; What doesn't fit, and why?|
|:::^ d.|how, now, brown, cow; What doesn't fit, and why?|
|:::^ e.|cow, monkey, snake, cat, human; What doesn't fit, and why?|
|:::^ f.|cow, monkey, snake, cat, human; What doesn't fit, and why?|
|:::^ g.|mercury, venus, earth, mars, jupiter; What doesn't fit, and why?|
|:::^ h.|mercury, venus, earth, mars, jupiter; What doesn't fit, and why?|
Once again the ability to create patterns comes into play- so that we can create patterns that will match only the items we want to match, leaving out the rest.
For example, take the following items:
* milk
* grape juice
* bread
* orange juice
Looking at these 4 items, we can come up with some criteria. If we want to find a likely exclusion, we can group the items by liquid vs. solid. In this case, bread would be excluded because it is the only solid.
So if we wanted to create a pattern to exclude bread, our pattern would be of liquids.
Alternatively, if we wanted to refine our match to include, say, only the juices, we would only select grape juice and orange juice, leaving the rest out.
So where does all this come into play with computers? Well, file processing can benefit greatly from pattern matching.
For example, if we had a directory filled with several thousand files, and we wanted to copy out all the files that ended with **.txt**, we learned that we can do the following:
lab46:~$ cp *.txt somewhere_else/
Using the ***** wildcard, we are able to establish a pattern where we match all files that have 0 or more of any character __and__ end with: **.txt**
This way, we need not care about what the files are named, how many characters, or anything else, as long as it ends in **.txt** it will match.
The file wildcards can be very handy, especially when doing pattern matching.
======More Pattern Practice======
Let's take a look at some more patterns to get more practice.
With any of these patterns, there is often more than one solution. Try and come up with multiple solutions, trying to look at the problem from many different perspectives. That will only help in developing your ability to better think "outside the box" and to think critically.
^ 5. ^|What pattern am I describing?|
| ^ a.|doh, ray, me, fah, so, lah, tee, doh; What is the pattern?|
|:::^ b.|doh, ray, me, fah, so, lah, tee, doh; What is the pattern?|
|:::^ c.|See figure 1 below; What is the pattern?|
|:::^ d.|See figure 1 below; What is the pattern?|
|:::^ e.|See figure 2 below; What is the pattern?|
|:::^ f.|See figure 2 below; What is the pattern?|
|:::^ g.|See figure 3 below; What is the pattern?|
|:::^ h.|See figure 3 below; What is the pattern?|
|:::^ i.|See figure 4 below; What is the pattern?|
|:::^ j.|See figure 4 below; What is the pattern?|
|:::^ k.|Spring, Summer, Autumn, Winter; What is the pattern?|
|:::^ l.|Spring, Summer, Autumn, Winter; What is the pattern?|
__Figure 1:__ What is the pattern?
B A G A B B B\\
A A A B D D\\
B A G A B B B\\
B A A B A G\\
__Figure 2:__ What is the pattern?
Puff, the magic dragon, lived by the sea\\
And frolicked in the autumn mist in a land called Honalee.\\
Puff, the magic dragon, lived by the sea\\
And frolicked in the autumn mist in a land called Honalee.\\
__Figure 3:__ What is the pattern?
31 29 31 30 31 30\\
31 31 30 31 30 31\\
__Figure 4:__ What is the pattern?
1. Great Pyramid\\
2. Hanging Gardens\\
3. Statue of Zeus\\
4. Temple of Artemis\\
5. Tomb of King Mausolus\\
6. Colossos\\
7. Pharos\\
=====Pattern exclusion=====
^ 6. ^|Of the presented patterns, which item doesn't belong and why? (Come up with three different responses for each):|
| ^ a.|May, June, July, August; Why?|
|:::^ b.|May, June, July, August; Why?|
|:::^ c.|May, June, July, August; Why?|
|:::^ d.|New York, Rhode Island, New Jersey, New Mexico; Why?|
|:::^ e.|New York, Rhode Island, New Jersey, New Mexico; Why?|
|:::^ f.|New York, Rhode Island, New Jersey, New Mexico; Why?|
|:::^ g.|Maple, Oak, Elm, Pine; Why?|
|:::^ h.|Maple, Oak, Elm, Pine; Why?|
|:::^ i.|Maple, Oak, Elm, Pine; Why?|
|:::^ j.|pineapple, licorice, orange, cherry; Why?|
|:::^ k.|pineapple, licorice, orange, cherry; Why?|
|:::^ l.|pineapple, licorice, orange, cherry; Why?|
|:::^ m.|Kirk, Khan, Kira, Spock; Why?|
|:::^ n.|Kirk, Khan, Kira, Spock; Why?|
|:::^ o.|Kirk, Khan, Kira, Spock; Why?|
|:::^ p.|sun, beach, sky, stars; Why?|
|:::^ q.|sun, beach, sky, stars; Why?|
|:::^ r.|sun, beach, sky, stars; Why?|
=====Continuing patterns=====
^ 7. ^|Given the following patterns, what comes next, and why?:|
| ^ a.|Triangle, Circle, Square, Triangle, Circle, Square, _________________|
|:::^ b.|Why does your chosen selection come next?|
|:::^ c.|30 days has september, april, june, and _______________|
|:::^ d.|Why does your chosen selection come next?|
|:::^ e.|alpha, beta, gamma, ______________|
|:::^ f.|Why does your chosen selection come next?|
|:::^ g.|1/2, 2/4, 4/8, 8/16, ...|
|:::^ h.|Why does your chosen selection come next?|
In the **patterns/** subdirectory of the UNIX Public Directory are several test files. Change into this directory to work on the following question.
^ 8. ^|Using **ls**(**1**), list files that match the following patterns:|
| ^ a.|Only files that begin with a lowercase vowel.|
|:::^ b.|All files that end with a **.s**|
|:::^ c.|All files that do not start with a vowel, yet end with a .s|
|:::^ d.|Only 3 character filenames that don't end with a consonant.|
|:::^ e.|Explain your strategy and reasoning for constructing an effective pattern for each part of this question.|
Patterns are an important concept that we need to get a good hold over, in order to really start to work impressive feats of UNIX wizardry. Take a stab at these, and see where we end up.
======Conclusions======
All questions in this assignment require an action or response. Please organize your responses into an easily readable format and submit the final results to your instructor.
Your assignment is expected to be performed and submitted in a clear and organized fashion- messy or unorganized assignments may have points deducted. Be sure to adhere to the submission policy.
The successful results of the following actions will be considered for evaluation:
* your responses to questions submitted at the following form:
http://lab46.corning-cc.edu/haas/content/unix/submit.php?lab9
\\
* the response from the form (received via e-mail) saved as **lab9.txt** to your **~/src/unix/** directory
* addition/commit of **~/src/unix/lab9.txt** into your repository (CS 0x0 sets you up to do this).
As always, the class mailing list and class IRC channel are available for assistance, but not answers.