Saturday, June 8, 2019

Awk: Why Did I Not Think of This?

Awk on a comma-separated file is a problem because commas can appear inside quotes.  Excel allows you to export tab-separated files as well.  So:

awk -F"\t" '{print $33}' MassMurderCurrent.txt


will show the Articles column.  With this, I can separate article title and date.
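A minimal sketch of why the tab-separated export helps, assuming a made-up three-column row (the real file has at least 33 columns, and its contents are not shown here). The helper name print_column is hypothetical. Commas inside a field are left alone because awk only splits on tabs:

```shell
# Hypothetical helper: print one column of tab-separated text read from stdin.
# $1 is the 1-based column number.
print_column() {
  awk -F'\t' -v col="$1" '{ print $col }'
}

# Made-up sample row; the third field contains commas and quotes.
printf 'Smith, John\t1999\t"Title, with comma", Aug. 1, 1999\n' | print_column 3
# prints: "Title, with comma", Aug. 1, 1999
```

Passing the column number with -v keeps the awk program in single quotes, so the shell never touches the $ in it.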

What am I doing wrong?  I used to do amazing things with regex (like fixing 30,000 records with the wrong incarceration date: there were patterns to how Corrections Officers screwed up mm/dd/yy).  I am trying to replace every expression of the form ' \d,' with ' 0\d,'.  (One-digit day number to two-digit day number.)

sed -e 's/ {\d,/0$1}/g'

Does cygwin sed even work?  This seems so simple:

echo "Aug. 1," | sed -e 's/Aug. [0-9],/Aug. 0$1,/g'

outputs Aug. 1.  Where is the 0 before $1?  And \1 instead of $1 does not work either.  Is cygwin using an obsolete regex?

Figured it out.

echo "Aug. 1," | sed -e 's/\(Aug\.\) \([[:digit:]]\),/\1 0\2,/g'

Grouping.
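A sketch of the same idea for any month, not just Aug., assuming the dates always look like "Mon. D, YYYY" with a space before the day and a comma after it. sed's default syntax is POSIX basic regular expressions, so the Perl-style \d and $1 are not available: the digit class is written [0-9] (or [[:digit:]]), groups are written \( \), and the replacement refers to them as \1. The function name pad_day is made up for illustration:

```shell
# Hypothetical helper: pad one-digit day numbers with a leading zero.
# Matches a space, exactly one digit, then a comma, and reinserts the digit as \1.
pad_day() {
  sed -e 's/ \([0-9]\),/ 0\1,/g'
}

echo "Aug. 1, 2019" | pad_day
# prints: Aug. 01, 2019

echo "Aug. 12, 2019" | pad_day
# prints: Aug. 12, 2019
```

Two-digit days are untouched because the pattern needs a space directly before the digit and a comma directly after it.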

Only works under Linux, not cygwin.  If there were an easy way to share files between my Linux emulator and Windows, I would do this all under Linux.  Unfortunately, none of the instructions for installing VMWare Tools work, so sharing files means using FileZilla.  Maybe that is next.

But when I transfer my transform script over, it shows as being uploaded and present in the virtual Linux directory, but not in the Linux shell in that directory.  It may be time for my afternoon nap, when nothing works at even the most trivial level.

Nap helped not at all.  I will write a C program to extract the needed information.

5 comments:

  1. Keep an eye on the time you're spending on this. NAS enclosures can be had on eBay for $60, just add HDD. Have both machines attach to your NAS and you're done.

  2. Good idea. I already have an NAS attached. I am going to try and see if the virtual Linux can see it in the morning.

  3. Many wireless routers also support plug-in of a drive and sharing it as a network disk.

  4. Virtual Linux cannot see it. The VMWare Tools need to be installed, and no directions that I can find actually work.

  5. Have you tried Windows Subsystem for Linux under Windows 10? I've had better luck with those utilities than I've had with cygwin recently. It's Canonical's Linux, so it's pretty good at least for terminal applications/utilities, and it goes straight to your Windows files with no hassle.
