Overall, most students did quite well. Comments are in your svn directories.
mean | 54.4 |
---|---|
standard deviation | 6.24 |
-
Print out the entries (orthography only) from the celex.txt file which were taken from GOOGLE. Hint: You will need to use a pipe. (6 points)
grep GOOGLE celex.txt | cut -f2 -d '\'
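The pipe matters because grep selects whole lines and cut then trims each surviving line to one field. Here is a minimal sketch on made-up data; the field layout (id\orthography\frequency\source) and the sample words are assumptions for illustration, not the actual contents of celex.txt.

```shell
# Hypothetical sample lines in a celex-like layout: id\orthography\frequency\source
sample='1\abandon\120\CELEX
2\blog\300\GOOGLE
3\email\450\GOOGLE'

# grep keeps only the lines mentioning GOOGLE;
# cut then extracts field 2 (the orthography), using a backslash as the delimiter
google_words=$(printf '%s\n' "$sample" | grep GOOGLE | cut -f2 -d '\')
echo "$google_words"
```

With this sample, only the two entries whose source is GOOGLE are printed, one word per line.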
-
Print out the 50 most frequent words from the celex.txt file which were taken from GOOGLE. Hint: You will need to combine the answers from the last 2 questions. (9 points)
grep GOOGLE celex.txt | cut -f2,3 -d '\' | sort -t '\' -k 2,2rn | head -n 50
OR
grep GOOGLE celex.txt | sort -t '\' -k 3,3rn | head -n 50 | cut -f 2,3 -d '\'
-
Use unix commands to count the number of entries (not definitions) in the Devil’s Dictionary that begin with a vowel. Your output should be a single number. (7 points)
grep -Ec '^[AEIOU][A-Z-]*,' devilsDictionary.txt
227
-
Use unix commands to calculate the average number of letters per word for each entry (not the definitions) in the Devil’s Dictionary. The output should simply be a number. HINT: You will need to use subshells and bc. (10 points)
entries=`grep -E '^[A-Z]+,' devilsDictionary.txt | cut -f1 -d ',' | wc -l`
# wc -c also counts the newline after each entry, so strip newlines before counting letters
letters=`grep -E '^[A-Z]+,' devilsDictionary.txt | cut -f1 -d ',' | tr -d '\n' | wc -c`
echo "$letters/$entries" | bc -l
OR, in one fell swoop
echo "`grep -E '^[A-Z]+,' devilsDictionary.txt | cut -f1 -d ',' | tr -d '\n' | wc -c`/`grep -E '^[A-Z]+,' devilsDictionary.txt | cut -f1 -d ',' | wc -l`" | bc -l
-
Count the number of adjectives, nouns, and verbs in the Devil’s Dictionary. (10 points)
noun=`grep -cE '^[A-Z]+, n\.' devilsDictionary.txt`
verb=`grep -cE '^[A-Z]+, v\.' devilsDictionary.txt`
adj=`grep -cE '^[A-Z]+, adj\.' devilsDictionary.txt`
-
Print out all the entries (not the definitions) which are not adjectives, nouns, or verbs. HINT: use grep more than once. (10 points)
grep -E '^[A-Z]+, ' devilsDictionary.txt |grep -vE '^[A-Z]+, (v|n|adj)\.' | cut -f1 -d '.'
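The two-grep pattern can be tried on a few made-up lines: the first grep selects entry lines, and the second, with -v, throws away the ones tagged n., v., or adj. The sample entries and their placeholder definitions are invented for illustration and are not taken from devilsDictionary.txt.

```shell
# Hypothetical entry lines in the "WORD, pos. definition" layout
sample='ABDOMEN, n. placeholder definition
ABASE, v.t. placeholder definition
ABROAD, adv. placeholder definition
ABSURD, adj. placeholder definition
OUCH, interj. placeholder definition
this line is definition text, not an entry'

# 1st grep: keep lines that look like entries (uppercase word, comma, space)
# 2nd grep: -v inverts the match, dropping nouns, verbs, and adjectives
# cut: keep everything before the first period (the word and its tag)
others=$(printf '%s\n' "$sample" \
  | grep -E '^[A-Z]+, ' \
  | grep -vE '^[A-Z]+, (v|n|adj)\.' \
  | cut -f1 -d '.')
echo "$others"
```

Note that `v\.` also excludes tags such as v.t., since the pattern only requires a period right after the v.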
-
Write a unix pipeline which will print the number of words in the celex.txt file that contain a q not followed by a u (look only at the orthography of each entry). (8 points)
cut -f2 -d '\' celex.txt |grep -Eic 'q[^u]'
EVEN BETTER
cut -f2 -d '\' celex.txt |grep -Eic 'q([^u]|$)'
-
Extra credit
Write a unix pipeline which will print the total number of points in this assignment. Don’t include the points for the extra credit (3 extra points) (Hint: use dc)
echo "`grep -oE '[0-9]+ points' hmwk2.solution |cut -d ' ' -f1` ++++++ p"|dc
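The dc answer needs exactly one fewer + than there are point values, so it has to be edited whenever a question is added. As a cross-check, the same extraction can be summed with plain shell arithmetic instead of dc. This is only a sketch: the text below is a made-up stand-in, not the contents of hmwk2.solution.

```shell
# Hypothetical stand-in text for a solution file
text='First question. (6 points)
Second question. (9 points)
Bonus question (3 extra points)'

# grep -o prints only the matched "N points" strings, one per line;
# "3 extra points" does not match because "extra" sits between the number and "points"
total=0
for n in $(printf '%s\n' "$text" | grep -oE '[0-9]+ points' | cut -d ' ' -f1); do
  total=$((total + n))   # accumulate without hard-coding the number of additions
done
echo "$total"
```

For this stand-in text the loop adds 6 and 9 and skips the extra-credit value.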