Lab 4 (with answers)
Your Name:
Warm Up
- Create a subdirectory called "lab4" under your home directory.
- Copy the file soap.txt into this directory. This file is available from here.
The WRONG way to save this file into your lab4 directory is to select the text with your mouse (or the "select all" option), copy it, and paste it into a file in, say, emacs. The RIGHT way to save this file is to right-click on the link and use the "save link as" option.
If you do it the first way, the copy-paste might destroy the line formatting (e.g., one line could be broken into multiple lines), and then your answers to the questions below will be incorrect.
- Before attempting the problems, make sure to set the environment variable LC_ALL to the value C. As explained in class, in the bash shell, you do this by: export LC_ALL=C. In the tcsh shell, you do this by: setenv LC_ALL C. Find out what shell you have and type the appropriate command.
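A small sketch of why the locale setting matters, in bash syntax. In some locales a range such as [a-z] can follow the locale's collation order rather than plain ASCII order; with LC_ALL=C it matches only the 26 ASCII lowercase letters. The two sample lines are made up for illustration and are not taken from soap.txt.

```shell
# bash syntax; in tcsh the equivalent is: setenv LC_ALL C
export LC_ALL=C

# Two made-up sample lines: one starts with a lowercase letter, one does not.
# In the C locale only "apple" begins with a character in [a-z].
lower_count=$(printf 'apple\nBanana\n' | grep -E -c '^[a-z]')
echo "$lower_count"    # prints 1
```

Here grep -E is the same as egrep, and -c counts matching lines directly.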
Questions To Answer
- (3 points). Create a dumpster/ directory in your home directory. Switch to tcsh (C-shell) and alias the "rm" command to a command that, instead of erasing your file, moves it to the dumpster directory you've just created.
HINT: you may find the .cshrc.example file discussed in class useful.
There is a link to it at the very beginning of Lecture-4 on our class site. Do NOT "install"/run the whole thing; just copy and paste the relevant alias setting onto your command line. Once you are done, call your TA.
alias rm /bin/mv \!:1 $HOME/dumpster
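For comparison only, here is a rough bash sketch of the same idea. The lab itself asks for the tcsh alias above. Bash aliases cannot refer to their arguments, so a function is used instead. The function name trash and the scratch directory demo_home are hypothetical, chosen so the sketch does not touch your real home directory.

```shell
# Hypothetical scratch area standing in for $HOME.
demo_home=$(mktemp -d)
mkdir "$demo_home/dumpster"

# Hypothetical helper: move the named files into the dumpster
# instead of deleting them.
trash() {
    mv -- "$@" "$demo_home/dumpster"/
}

# Create a throwaway file and "remove" it.
touch "$demo_home/notes.txt"
trash "$demo_home/notes.txt"
ls "$demo_home/dumpster"    # lists notes.txt
```

Unlike \!:1 in the tcsh alias, which passes only the first argument, "$@" passes every file named on the command line.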
Now let's practice pattern matching. Using egrep, find out the following information about soap.txt. For each of the questions below, write down your egrep command and also the answer.
- (1 point) Does the string "happy" appear in the file?
egrep "happy" soap.txt | wc -l
1 (so yes)
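As an aside, grep can answer a yes/no question without wc. A minimal sketch on two made-up lines (not soap.txt): -c counts matching lines, and -q prints nothing but sets the exit status.

```shell
export LC_ALL=C

# Made-up sample text for illustration.
sample=$(printf 'a happy day\nno joy here\n')

# -c reports the number of matching lines, like piping to wc -l.
happy_lines=$(printf '%s\n' "$sample" | grep -E -c "happy")
echo "$happy_lines"    # prints 1

# -q is silent; the exit status says whether any line matched.
if printf '%s\n' "$sample" | grep -E -q "happy"; then
    found=yes
else
    found=no
fi
echo "$found"    # prints yes
```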
- (1 point) How many lines contain at least one occurrence of the string "soap" ?
egrep "soap" soap.txt | wc -l
26
- (3 points) How many lines contain multiple occurrences of the string "soap" ?
egrep "soap.*soap" soap.txt | wc -l
1
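To see why "soap.*soap" finds lines with multiple occurrences, here is a sketch on three made-up lines (not soap.txt). The .* matches any run of characters, so the whole pattern needs two separate copies of "soap" on the same line.

```shell
export LC_ALL=C

# Made-up sample: one line with two occurrences, one with a single
# occurrence, one with none.
sample=$(printf 'soap and more soap\njust one soap\nno match here\n')

at_least_one=$(printf '%s\n' "$sample" | grep -E -c "soap")
multiple=$(printf '%s\n' "$sample" | grep -E -c "soap.*soap")

echo "$at_least_one"    # prints 2
echo "$multiple"        # prints 1
```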
- (2 points) How many lines do not begin with a lowercase letter?
egrep "^[^a-z]" soap.txt | wc -l
83
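One subtlety worth checking on a small sample: "^[^a-z]" requires the line to have a first character, so empty lines are not counted, while inverting the match with -v counts them too. The sketch below uses four made-up lines (not soap.txt); which reading the question intends is a judgment call.

```shell
export LC_ALL=C

# Made-up sample: lowercase start, uppercase start, empty line, digit start.
sample=$(printf 'apple\nBanana\n\n9 lives\n')

# Lines whose first character exists and is not a lowercase letter.
first_char_not_lower=$(printf '%s\n' "$sample" | grep -E -c "^[^a-z]")

# Lines that do NOT start with a lowercase letter, empty lines included.
not_starting_lower=$(printf '%s\n' "$sample" | grep -E -v -c "^[a-z]")

echo "$first_char_not_lower"    # prints 2
echo "$not_starting_lower"      # prints 3
```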
- (3 points) How many lines contain a number? A number is a run of one or more consecutive digits delimited by "white space" characters (e.g., a space, tab, newline, etc.) on either side. So, the line "My name is 234T" does not contain a number. But "I have 3 oranges" does. Run man egrep; you may find the character class [:space:] useful. Note that it must be written inside a bracket expression, as in [[:space:]].
egrep "[[:space:]]+[0-9]+[[:space:]]+" soap.txt | wc -l
35
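The pattern can be checked against the two example sentences from the question. The sketch below also adds a third, made-up line where the number sits at the very start of the line: the pattern above does not match it, because grep never sees the newline as a character. The variant with (^|...) and (...|$) is one possible way to cover line boundaries. It is an assumption about how to read the definition, not the graded answer.

```shell
export LC_ALL=C

# First two lines come from the question; the third is made up.
sample=$(printf 'My name is 234T\nI have 3 oranges\n42 is at the start\n')

# Pattern from the answer above: whitespace required on both sides.
strict=$(printf '%s\n' "$sample" | grep -E -c "[[:space:]]+[0-9]+[[:space:]]+")

# Variant that also accepts start-of-line and end-of-line as delimiters.
with_edges=$(printf '%s\n' "$sample" | grep -E -c "(^|[[:space:]])[0-9]+([[:space:]]|$)")

echo "$strict"        # prints 1
echo "$with_edges"    # prints 2
```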
- (3 points) How many lines contain a two-digit number?
egrep "[[:space:]]+[0-9][0-9][[:space:]]+" soap.txt | wc -l
9
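A quick check on three made-up lines (not soap.txt) that the pattern accepts exactly two digits. The interval form [0-9]{2} is an equivalent way to write [0-9][0-9].

```shell
export LC_ALL=C

# Made-up sample: a two-digit, a three-digit, and a one-digit number.
sample=$(printf 'I am 42 today\nIt cost 123 dollars\nOnly 7 left\n')

two_digit=$(printf '%s\n' "$sample" | grep -E -c "[[:space:]]+[0-9][0-9][[:space:]]+")
two_digit_interval=$(printf '%s\n' "$sample" | grep -E -c "[[:space:]]+[0-9]{2}[[:space:]]+")

echo "$two_digit"             # prints 1
echo "$two_digit_interval"    # prints 1
```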
- (2 points) How many lines contain a nine-letter word? Similar to the previous two problems, a word is a succession of (uppercase, lowercase, or mixed-case) letters delimited by whitespace on either side.
egrep "[[:space:]][A-Za-z]{9}[[:space:]]" soap.txt | wc -l
17
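The whitespace on both sides is what keeps {9} from matching inside a longer word. A sketch on three made-up lines (not soap.txt): "beautiful" has nine letters, "extraordinary" has thirteen.

```shell
export LC_ALL=C

# Made-up sample lines for illustration.
sample=$(printf 'a beautiful day\nan extraordinary day\nso short\n')

# With delimiters: only a word of exactly nine letters matches.
exact_nine=$(printf '%s\n' "$sample" | grep -E -c "[[:space:]][A-Za-z]{9}[[:space:]]")

# Without delimiters: any run of nine or more letters matches.
nine_or_more=$(printf '%s\n' "$sample" | grep -E -c "[A-Za-z]{9}")

echo "$exact_nine"      # prints 1
echo "$nine_or_more"    # prints 2
```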