Thursday, June 13, 2019

SED Sucks, But AWK does not work as expected

$ echo '"Horrid Barbarity," Cramer [N.Y.] Weekly' | sed 's|\"\([A-z0-9 ,\.\']*\"\) \( [A-z0-9\[\]]*\)|\1$,$\2|g' |more

I am missing something. It should match a quote, followed by any number of letters, digits, brackets, commas, and periods, ending with a quote, as field 1, and all letters, numbers, blanks, and brackets as field 2.
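A few things in that command probably explain the trouble. A single quote cannot be escaped inside a single-quoted shell string, so the \' ends the sed script early. The second group begins with a literal blank, so it needs two spaces after the closing quote. Inside a bracket expression a backslash is an ordinary character, so \[\] does not add brackets to the set, and A-z is a range that takes in some punctuation as well as letters. What follows is only a minimal sketch, assuming field 1 is everything between the first pair of double quotes and field 2 is the rest of the line; split_title is a made-up name used for illustration.

```shell
# Hypothetical helper: put the $,$ delimiter between the quoted title and the rest.
# Assumes one record per line, each starting with a double-quoted title.
split_title() {
  # \("[^"]*"\)  field 1: a quote, anything that is not a quote, a closing quote
  # \(.*\)       field 2: everything after the single blank that follows
  sed 's|^\("[^"]*"\) \(.*\)$|\1$,$\2|'
}

echo '"Horrid Barbarity," Cramer [N.Y.] Weekly' | split_title
# prints: "Horrid Barbarity,"$,$Cramer [N.Y.] Weekly
```

Using [^"]* for the title sidesteps having to list every character that might appear between the quotes, apostrophes included.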

I have become so frustrated that I tried to use the filter in Excel to extract only the lines without an entry in the URLs column. It does not work. I may just give up on trying to get the missing URLs.
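The same extraction might be done outside Excel with AWK. This is a rough sketch under assumptions that may not hold for the real data: the sheet has been saved as tab-separated text, the details are in field 1, and the URL is in field 2. The name missing_urls and the sample rows are invented for illustration.

```shell
# Hypothetical layout: tab-separated text, details in field 1, URL in field 2.
missing_urls() {
  # Print only the lines whose second field is empty or absent.
  awk -F'\t' '$2 == ""'
}

# Invented sample rows: only "Title B" and "Title C" lack a URL.
printf 'Title A\thttp://example.com/a\nTitle B\t\nTitle C\n' | missing_urls
```

A URL cell that holds only blanks would not count as empty here; the test would have to be loosened for that case.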

2 comments:

Rick C said...

An idea for Excel: Details in column A, URL in column B. Grab the column separator between A and B and drag it to the left until you've only got a little bit of it viewing; that should hide the excess text. Then click on the column B heading and sort it ascending. Answer Yes to the next dialog--it's going to ask if you want to reorder the other columns to keep each row together, and that's the default behavior. Then all the rows with no URL will be at the top. I do that all the time. Alternatively try my suggestion in your last post.

JLW III said...

Why not ask the man who wrote the book?

https://www.cs.princeton.edu/~bwk/