
Tuesday, March 30, 2021


I am still hoping to hear back from Soda PDF; their customer service is apparently pretty backlogged at the moment.  In the meantime, I am evaluating FoxIt and Nitro Pro.  They are similarly priced ($125 vs. $128), but FoxIt is Chinese and Nitro Pro is Australian; putting money into an ally rather than an enemy or puppetmaster is preferable.

Both are chugging along OCRing the 1841 Maine session laws, in parallel.  I was not expecting a clear performance advantage of one over the other, although FoxIt seems to be moving a bit faster.  The real test will be how accurately they do it.  How they handle the accursed margin notes on session laws will be of interest.

So far, Nitro Pro looks simpler to use.  Still waiting for FoxIt's Word output to open.  The input was the entire 1841 Maine session laws.  The Word file FoxIt produced was unreadable by Word.

Nitro Pro also failed to produce a readable Word file.

NAPS2 (free) only produces a PDF containing text.  But you can copy the text from the PDF viewer:

Chapter 169. RESOLVE in relation to the Military road. - Resolved, That the sum of twenty-five hundred dollars be and the same is hereby appropriated, for the repair of the military road and the bridges upon the same, from the Penobscot river to Houlton ; and the Governor and Council are hereby authorized to appoint an Agent to superintend the expending of the same, at a compensation not exceeding two dollars per day, upon such bridges and such portions of said road, as he shall think best for the interest of the State.

I had to do a little editing of the marginalia that are always on session laws.  Curiously, it will not do it twice, unless I am doing something wrong.  Third time, it worked.  Mystery.

1 comment:

  1. Are you aware of any OCR software where I can use something like the snipping tool to just highlight what I want converted without having to upload an image or document separately? Currently I'm snipping an image of the text and running it through the OCR program.
