Corning Community College


UNIX/Linux Fundamentals



Case Study 0x9: Fun with grep

Objective

To explore the grep family of pattern-matching utilities.

Reading

The REGULAR EXPRESSIONS section of the grep(1) manual page.

Background

The grep(1) utility has been used in prior assignments in the context of “searching for” a pattern in a file.

grep wasn't made up out of the blue. It has its roots in the regex functionality of the ed/ex/vi family of editors, where the command g/re/p globally searches for a regular expression and prints the matching lines. (regex, or regexp, is short for regular expression.) grep is effectively an acronym:

GREP = Globally [search for] Regular Expression [and] Print

grep can be used by itself on the command line or in conjunction with other commands (especially with pipes). You always need to supply your regular expression or pattern to grep.

There are a few different greps in existence. Aside from the original grep, there are also egrep and fgrep.
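
As a minimal sketch of both usages, the commands below first search a file directly and then filter another command's output through a pipe. The file name fruits.txt and its contents are made up for illustration and are not part of the assignment.

```shell
# Create a small sample file in a scratch directory (hypothetical data).
tmpdir=$(mktemp -d)
printf '%s\n' 'apple pie' 'banana bread' 'apple tart' 'cherry cake' > "$tmpdir/fruits.txt"

# grep by itself: the pattern comes first, then the file to search.
direct=$(grep 'apple' "$tmpdir/fruits.txt")
echo "$direct"

# grep in a pipeline: it filters whatever arrives on standard input.
piped=$(ls "$tmpdir" | grep 'txt')
echo "$piped"

# Remove the scratch directory again.
rm -rf "$tmpdir"
```

In both cases the pattern is the first non-option argument; only the source of the input differs.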

variant   description
grep      the original grep; accepts basic regex in search patterns
egrep     accepts extended regex metacharacters in search patterns (equivalent to grep -E)
fgrep     accepts no regex, only literal strings (equivalent to grep -F); also called fast grep or fixed-string grep
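
The difference between the three is easiest to see by giving the same pattern to each. The sketch below uses grep -E and grep -F, the POSIX spellings of egrep and fgrep, and counts matching lines with -c. The file variants.txt and its three lines are invented for this illustration.

```shell
# Three sample lines chosen so that each variant behaves differently.
tmpdir=$(mktemp -d)
printf '%s\n' 'a.c' 'abc' 'a|b' > "$tmpdir/variants.txt"

# Basic regex: '.' matches any single character, so 'a.c' and 'abc' both match.
basic_dot=$(grep -c 'a.c' "$tmpdir/variants.txt" || true)

# Fixed strings: '.' is just a period, so only the literal line 'a.c' matches.
fixed_dot=$(grep -F -c 'a.c' "$tmpdir/variants.txt" || true)

# Basic regex: an unescaped '|' is an ordinary character, so only 'a|b' matches.
basic_bar=$(grep -c 'a|b' "$tmpdir/variants.txt" || true)

# Extended regex: '|' means OR, so any line containing 'a' or 'b' matches.
extended_bar=$(grep -E -c 'a|b' "$tmpdir/variants.txt" || true)

echo "basic a.c: $basic_dot, fixed a.c: $fixed_dot"
echo "basic a|b: $basic_bar, extended a|b: $extended_bar"

rm -rf "$tmpdir"
```

The `|| true` guards only keep the script going if a count happens to be zero, since grep exits with a non-zero status when nothing matches.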

So what is the difference? It all depends on what you want to do. For most cases, grep will be suitable.

However, sometimes we wish to add a little more capability to our regex patterns. The “e” in egrep stands for “Extended Regular Expressions”, and it adds some metacharacters to those available in basic regular expressions, including:

metacharacter   description
( )             group patterns together
|               logical OR; can be used to condense many patterns into one
+               match 1 or more of the preceding item
?               match 0 or 1 of the preceding item
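
Each of these metacharacters can be tried on a small word list. The following is a sketch under assumed data: words.txt and its contents are hypothetical, grep -E stands in for egrep, and the ^ and $ anchors tie each pattern to the whole line so the counts are easy to predict.

```shell
# Hypothetical word list, one word per line.
tmpdir=$(mktemp -d)
printf '%s\n' color colour colouur g go goo gooo abab > "$tmpdir/words.txt"

# '?' : the 'u' may appear zero or one time (color, colour).
optional=$(grep -E -c '^colou?r$' "$tmpdir/words.txt" || true)

# '+' : the 'o' must appear one or more times (go, goo, gooo; not g).
one_or_more=$(grep -E -c '^go+$' "$tmpdir/words.txt" || true)

# '( )' with '+' : the group 'ab' repeats as a unit (abab).
grouped=$(grep -E -c '^(ab)+$' "$tmpdir/words.txt" || true)

# '|' inside '( )' : the line is either 'color' or 'go'.
either=$(grep -E -c '^(color|go)$' "$tmpdir/words.txt" || true)

echo "?: $optional  +: $one_or_more  (): $grouped  |: $either"

rm -rf "$tmpdir"
```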

The big advantage of egrep is that we can combine patterns with an OR, which lets a single pattern cover a wide range of situations. For example:

egrep '(Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day' somefile.text

This matches any day of the week. By using parentheses to group the alternatives and | to OR them together, we have a single pattern that locates every item of this form.
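
To see the weekday pattern in action, the sketch below runs it against a small made-up file, schedule.txt. The stem for Saturday is written as Satur so that stem plus "day" spells the full name, and grep -E is used as the equivalent of egrep. The -o option, which prints only the matched text, is a common extension in GNU and BSD grep rather than something POSIX requires.

```shell
# Hypothetical schedule; the third line contains "day" but no weekday name.
tmpdir=$(mktemp -d)
printf '%s\n' \
    'Meeting on Monday' \
    'Lab due Saturday' \
    'No class today' \
    'Review on Wednesday and Friday' > "$tmpdir/schedule.txt"

pattern='(Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day'

# Count the lines that mention at least one weekday.
line_count=$(grep -E -c "$pattern" "$tmpdir/schedule.txt")

# Print only the matched weekday names, one per output line.
days=$(grep -E -o "$pattern" "$tmpdir/schedule.txt")

echo "lines with a weekday: $line_count"
echo "$days"

rm -rf "$tmpdir"
```

The line containing "today" is skipped because "to" is not one of the alternatives inside the parentheses.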

Exercise

Copy the pelopwar.txt file from the grep/ subdirectory of the UNIX Public Directory into your home directory.

1. Using the grep(1) utility, show me how you get the following:
  a. How would you find all occurrences of the substring “coast”?
  b. How many matches?
  c. How would you find all unique occurrences of “coast” as a word (i.e. not as part of a word)?
  d. How many matches?
  e. How would you find all occurrences of the word “Dorian” that are the last word on the line?
  f. How would you find all lines that begin with the word “Again”?
2. Using egrep(1), show me how you get the following:
  a. All instances of “Athens” or “Athenians” that occur at the beginning of the line.
  b. How would you accomplish this with just one call to egrep(1) using extended regexps?
  c. All instances of “Corinth” or “Corinthians” that occur at the end of the line.
  d. How would you accomplish this with just one call to egrep(1)?
3. Using fgrep(1), show me how you get the following:
  a. Try the same regexp that you used in #2a.
  b. Did you get any results?
  c. Explain the outcome.
4. Create and show me the command-lines to:
  a. Display the first 4 lines of the last(1) command's output that begin with any of your initials. Include your output too.
  b. Using sort(1), alphabetize the output of the lastlog(8) command and use grep(1) to search for all lines that begin with a lowercase 'e' through 'g', then print out the first 4 lines of the resulting output.

NOTE: For #4b, the head(1) utility can be used to help format the output with respect to some of the given specifications.

Conclusions

This assignment has activities that you should tend to: document and summarize the knowledge you gained on your Opus.

As always, the class mailing list and class IRC channel are available for assistance, but not answers.

haas/fall2020/unix/cs/cs9.txt · Last modified: 2013/10/29 16:59 by 127.0.0.1