Design and use regular expressions
In this assignment you will write a series of regular expressions. We will be working with the word list found on most Linux installations. Mine is found at
/usr/share/dict/words. If you do not have this file, you can install it with
sudo apt-get install wamerican. Every regular expression can be tested by running the following command:
egrep "EXPRESSION" /usr/share/dict/words. For each question below, write down the EXPRESSION that you used to find the answer in a .txt file. There may be more than one correct answer for a particular question. I will also have you pipe the above command to the
wc command to make sure you have the required number of lines. You should NOT use any other pipes. All results should come from a single regular expression unless otherwise indicated.
- Here is problem one, which I will walk you through. To search for all words that begin with the string “foo” and end with an “s”, I would type the command:
egrep "^foo.*s$" /usr/share/dict/words
I would check the output to see whether I am getting the desired results. Once I am sure that I am, I would copy that command to the assignment file that I am going to submit and then re-run the command like this:
egrep "^foo.*s$" /usr/share/dict/words | wc
And I would get this output:
51 51 542
I would then copy that output to my assignment file.
So my assignment file entry for question 1 now looks like this:
1. egrep "^foo.*s$" /usr/share/dict/words => 51 51 542
Record your answers to the following problems in the same format.
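The record-keeping steps in the walk-through can also be scripted. The following is only a sketch under stated assumptions: it uses a small made-up file, sample_words.txt, in place of /usr/share/dict/words so that the counts are predictable, the variable names are illustrative, and it calls grep -E, which accepts the same expressions as egrep.

```shell
# Hypothetical sample list standing in for /usr/share/dict/words.
wordfile=sample_words.txt
printf '%s\n' foo foods fools foot bars > "$wordfile"

expr='^foo.*s$'

# wc prints lines, words and bytes; "set --" splits them into $1 $2 $3
# and drops the padding that wc adds around the numbers.
set -- $(grep -E "$expr" "$wordfile" | wc)

# Build the entry in the same format as the assignment file.
entry="1. egrep \"$expr\" $wordfile => $1 $2 $3"
echo "$entry"
# prints: 1. egrep "^foo.*s$" sample_words.txt => 2 2 12

rm -f "$wordfile"
```

Only foods and fools match here, which gives 2 lines, 2 words and 12 bytes. Against the real word list the numbers will depend on the dictionary installed on your system.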
- Search for all words that begin with the string “foo” and that end with an “s” (as shown above)
- Look for all words that contain the string “ijk”.
- Find all words that contain the string “pilot” or “gically”. This should match copilot and pilots as well as logically. HINT: wc should return 31 lines.
- Find all words that have 4 successive vowels.
- Find all words that have exactly 20 characters.
- Find all words that end with ‘ux’.
- Find all words that start with a ‘b’ or an ‘s’ and have a ‘zz’ somewhere later on in the word.
- Try this expression:
^(.).\1$. The ^ matches the beginning of the line. The first . matches any character, and the ( ) around it capture that character as a sub-pattern. The next . matches any character. The \1 refers to whatever the ( ) captured, so the same character must appear again. The $ matches the end of the line. Now find all five-letter words whose last 2 letters are the first 2 letters in reverse order (a palindrome).
radar is an example.
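To see the back-reference at work before turning to the dictionary, the explained expression ^(.).\1$ can be run on a few hand-picked words. This is a minimal sketch: the sample words are made up for illustration, and grep -E is used in place of egrep. POSIX does not guarantee back-references in extended expressions, so the example assumes GNU grep, as found on most Linux systems.

```shell
# Made-up sample words, one per line, piped in instead of the dictionary file.
matches=$(printf '%s\n' aba abc dad abba eye | grep -E '^(.).\1$')

# Only three-character words whose first and last characters agree survive.
echo "$matches"
```

Here abc is rejected because \1 must repeat the captured a, and abba is rejected because the expression allows exactly three characters between ^ and $.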
- Do the same thing as in number 8, but find all words that start and end with the same 3 letters (the last 3 letters are the first 3 in reverse order). The word could be 6 or 7 characters long. Hint: I found only 2 words in the list that met these criteria.
- Find all words that contain the vowels a, e, i, o, u in that order, with each vowel appearing only once in the word. I found only 3.
- Display all words that have 3 consecutive double-letter pairs (like “bookkeeper”, which has oo, kk, ee).
- Find all words that have an apostrophe but do not end in the letter ‘s’.
- Find all words that consist of only 2 lowercase letters.
Check off procedure:
Upload your txt file.
Last Updated 12/22/2020