Friday, May 24, 2019

More Awk

It took a while, but the following extracts fields 2, 4, 5, 9, 10, and 11, puts | in as a delimiter, and then removes the time of the incident from field 2 (the first field of the output).
awk -F'|' '{print $2, "|", $4, "|", $5, "|", $9, "|", $10, "|", $11}' USATodayList.csv | sed -e 's/ [0-9]*:[0-9]*[AP]M//'
Now in field 4 of the output, I want to delete all words that are not proper nouns: if a word starts with an uppercase letter, print it; otherwise drop it.

I can find examples of everything except what I want: delete words matching [a-z0-9]* in field 4.  I am guessing it takes some BEGIN {} construct that matches [A-Z] and prints the matching string and nothing else.  Help!
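
One possible approach, as a rough sketch rather than a definitive answer: a BEGIN {} block runs only once, before any input is read, so the filtering probably belongs in an ordinary per-line action instead. The idea is to split field 4 into words, keep the ones that start with [A-Z], and put the result back into the field. The function name keep_capitalized is made up for illustration, and the sketch assumes the input is the " | "-separated output of the command above.

```shell
# Hypothetical helper (the name is an assumption, not from any tool):
# reads "|"-delimited lines on stdin and, in field 4 only, keeps the
# words that begin with an uppercase ASCII letter.
keep_capitalized() {
  awk -F'|' -v OFS='|' '{
    # Splitting on " " breaks on runs of whitespace and ignores
    # leading and trailing blanks.
    n = split($4, words, " ")
    kept = ""
    for (i = 1; i <= n; i++)
      if (words[i] ~ /^[A-Z]/)
        kept = kept " " words[i]
    # Restore the padding around the delimiter; assigning to $4 makes
    # awk rebuild the line using OFS.
    $4 = kept " "
    print
  }'
}

# Intended use, appended to the pipeline above:
#   awk -F'|' '{...}' USATodayList.csv | sed -e '...' | keep_capitalized
```

This keeps every capitalized word, including the first word of a sentence, so it only approximates "proper nouns". Lines with no capitalized words in field 4 end up with an empty field.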
