Wednesday, July 18, 2012

Any PDF Experts Out There?

Here is a detailed report by someone who purports to be a PDF expert and who consulted for the Maricopa County Sheriff's Department's team on this. I know enough about this subject to know that I don't know enough to tell whether she is right, wrong, or overly sure of herself.

I mentioned last year that one of my students was showing me Obama's birth certificate and all the layers when you open it in Illustrator--and I saw something that she missed: different parts of the birth certificate appear to have been scanned at different resolutions:
Diagonal lines, when scanned, produce a jagged set of pixels. This is an artifact of how sharply the line is drawn and of the scan resolution (dots per inch). All things being equal, two lines drawn at the same angle should produce similar levels of jaggedness. Yet when I looked at the mother's maiden name, "Dunham," at 800% in Adobe Acrobat, I noticed that the diagonals of the "D" are very noticeably different in their pixelation from the diagonal lines of the letters in the rest of the name.
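To make the resolution point concrete, here is a minimal sketch (a hypothetical nearest-pixel rasterizer, not a model of any actual scanner) showing that the jagged edge of a rasterized diagonal strays farther from the true line at low resolution than at high:

```python
import numpy as np

# A hypothetical nearest-pixel rasterizer (not a real scanner model):
# for each pixel column, ink the pixel row closest to the true line
# y = slope * x.
def rasterize(slope, n):
    x = np.arange(n)
    return np.round(slope * x).astype(int)

# Worst-case gap, in page units, between the true line and the jagged
# rasterized edge; halving the pixel size halves the staircase height.
def max_error(slope, n, page_width=1.0):
    x = np.arange(n)
    return np.abs(rasterize(slope, n) - slope * x).max() * (page_width / n)

lo_res = max_error(0.4, 100)   # a coarse scan of the diagonal
hi_res = max_error(0.4, 800)   # the same diagonal at 8x the resolution
# lo_res comes out about 8x larger: the coarse scan's jaggies are taller.
```

Same angle, same line, but the lower-resolution scan produces the visibly coarser staircase.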
The attendant's name (the doctor who delivered the child with the halo over his head) has the same jagged, low-resolution characteristics as the "D" in Dunham--but not the "unham."  (This was 600%.) 


As I said last year, this alone does not prove fraud, but it is most curious.

Now, it doesn't really matter whether the birth certificate is a fraud.  Once the mainstream media have decided that an idea is nonsense, you have to give up and work instead on persuading Americans that they should not re-elect a guy who took a bad economy and made it worse.  But I am curious to know if this report is correct or not, mostly out of intellectual curiosity.

7 comments:

Anthony said...

It looks like the 'D' didn't come out very solidly, and she either picked up a different pen or shook out the pen, or pressed more firmly, and the 'unham' came out much more solidly. You can see how the 'D' fades from the beginning of the stroke to the end, and in your second image, you can see the same thing happening in the registrar's signature.

When the contrast is lower, the JPEG algorithm will create more artifacts around writing, which you can also see in the registrar's signature.

If it's a forgery, it's a paper forgery; there doesn't seem to be anything unusual going on in the scanning process.

Clayton said...

That's a good explanation for the jagged D. She might have picked up a ballpoint pen that was dried out, realized that it wasn't working well, and switched to a different pen.

I would love to hear an evaluation of the claims made about the multiple layers, color variations, and the oddity of scanning a paper document that produced so many layers.

hga said...

I am a document imaging expert from the 1991-97 period. While that was before MRC (Mixed Raster Content) became important (before we could afford to store color images, even compressed), I read up on it and satisfied myself that the "official" green-background copy is consistent with it: specifically, with combining a microfilm or microfiche grayscale scan with a random green safety-paper background, then having an MRC system compress the result. (The press also got one with a blue background; I'm reasonably confident that at this point the only "original" that exists is in a microformat.)
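For readers unfamiliar with MRC, the layered recomposition it performs can be sketched roughly like this (a toy example with made-up layer colors and patch size, not the actual segmentation or codecs):

```python
import numpy as np

# Toy MRC recomposition: a 1-bit mask picks, pixel by pixel, between a
# smooth background layer (the green safety paper) and a foreground
# layer (the dark ink). Each layer can then be compressed with a codec
# suited to it, which is why such a PDF decomposes into several layers.
h, w = 4, 8                                        # made-up patch size
background = np.full((h, w, 3), [180, 220, 180])   # made-up flat green
foreground = np.full((h, w, 3), [40, 40, 40])      # made-up dark ink
mask = np.zeros((h, w), dtype=bool)
mask[1:3, 2:6] = True                              # where the "text" sits

# Composite: ink where the mask is set, safety paper everywhere else.
page = np.where(mask[..., None], foreground, background)
```

The point is that the final page is assembled from independently stored layers, so an MRC-produced PDF showing multiple layers in Illustrator is expected, not suspicious.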

I didn't read any of the report except for the part that's supposed to debunk the idea that it's MRC compressed, and it's entirely unconvincing.

If I feel like it, or if you really want to know (just ask, you know your minions' email addresses :-), I'll look at the rest of it, but if it's at the quality of the MRC "debunking" it isn't worth much.

Jim said...

There are several YouTube videos that explore Obummer's birth certificate, including the issue of "unham". Some explain pretty clearly how that could not have been done by a pen, regardless of the type. This is the first of a series that looks pretty accurate: http://www.youtube.com/watch?v=7s9StxsFllY.

Art Deco said...

I would put this in the file marked "David Lifton". This mess grows more and more convoluted, I would wager because people have the time and energy to make it that way.

PhaseMargin said...

I'll chime in and say hga is right. Anything in the image processing field that uses compression uses a DFT to throw away high frequency information that your eye can't process anyway. It's why JPGs are reasonable sizes and why movies don't take a hundred disks. It's also why you can get 5 digital TV channels or more in one old analog channel.

But the use of JPGs is lossy (you throw away a lot of information) and it intentionally makes things more jagged on diagonals. As to the "variable resolution", there are some quality metrics on how much high frequency to maintain, and that tends to yield what looks like variable resolution upon reconstruction.

I guess the one-paragraph summary is this: a JPEG is an array of 8x8 blocks of pixels upon which DFTs are run, with the resultant frequency information filtered and stored. You can choose the quality and resolution stored (anything from perfect to useless) depending on how much storage you have. Artifacts become more pronounced with higher compression.
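For the curious, that summary can be sketched in code (assuming NumPy; a single uniform quantization step `q` stands in for JPEG's real per-frequency quantization tables, and the transform shown is the DCT that baseline JPEG actually specifies):

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix (the transform baseline JPEG uses)."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0, :] /= np.sqrt(2)
    return C * np.sqrt(2 / n)

C = dct_matrix()

# A toy 8x8 block containing a hard diagonal edge (ink vs. paper).
block = np.where(np.add.outer(np.arange(8), np.arange(8)) < 8, 255.0, 0.0)

q = 50.0                            # one uniform quantization step; real
                                    # JPEG uses a per-frequency table
coeffs = C @ (block - 128) @ C.T    # level-shift, then 2-D DCT
quantized = np.round(coeffs / q)    # the lossy step: detail rounded away
restored = C.T @ (quantized * q) @ C + 128

# Reconstruction is close but not exact; the leftover error clusters
# along the edge, which is exactly where ringing artifacts show up.
error = np.abs(block - restored).max()
```

Without the rounding step the roundtrip is exact; with it, high-frequency information is discarded and the sharp edge comes back slightly wrong.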

hga said...

A couple of quibbles on PhaseMargin's comment: JPEG at least uses the discrete cosine transform, "a Fourier-related transform similar to the discrete Fourier transform (DFT), but using only real numbers." I've never had the opportunity to learn either transform in depth, but the DCT, and the JPEG method in general, do what he described. E.g., Step 2: "The resolution of the chroma data is reduced, usually by a factor of 2. This reflects the fact that the eye is less sensitive to fine color details than to fine brightness details."
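That Step 2 can be illustrated in a few lines (toy numbers: the BT.601 conversion constants are the standard ones, but the 4x4 checkerboard patch is invented for the example):

```python
import numpy as np

# A toy 4x4 RGB patch: a red/white checkerboard, i.e. fine *color* detail.
rgb = np.zeros((4, 4, 3))
rgb[..., 0] = 255                 # red channel on everywhere
rgb[::2, ::2, 1:] = 255           # these pixels become white
rgb[1::2, 1::2, 1:] = 255

# RGB -> YCbCr using the standard BT.601 constants (what JFIF/JPEG uses).
r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
y  = 0.299 * r + 0.587 * g + 0.114 * b
cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b

# "Reduce chroma resolution by a factor of 2": average each 2x2 chroma
# block, then replicate it back up (4:2:0-style subsampling).
def sub2(c):
    small = c.reshape(2, 2, 2, 2).mean(axis=(1, 3))
    return np.repeat(np.repeat(small, 2, axis=0), 2, axis=1)

cb2, cr2 = sub2(cb), sub2(cr)
# Brightness (y) is untouched; the fine red/white color detail in cb/cr
# is blurred away, which the eye mostly does not notice.
```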

Analog VCRs do the same, and for that matter the original sin of Never Twice the Same Color (NTSC, the US analog broadcast standard) was simply adding a chroma signal to the existing B&W luminance signal to retain backwards compatibility with B&W TV sets.

The other quibble is that different methods are used for pure B&W compression. When I left the field, CCITT Group IV compression was the best method in use. It's pretty cool, a bit MPEG-like, in that it combines run-length encoding with encoding of line-by-line changes, since those tend to be minor. E.g., think of encoding the middle of a C from top to bottom: each scan line will vary only a little from the previous one as the black of the C moves right to left and then reverses.
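The line-by-line idea can be sketched as follows (an illustration of the principle only, not the actual T.6/Group IV code words):

```python
# Store, for each scan line, its black/white transition positions as
# small offsets from the nearest transition on the line above, instead
# of storing the line itself.

def transitions(line):
    """Column indices where the pixel value changes (run boundaries)."""
    return [i for i in range(1, len(line)) if line[i] != line[i - 1]]

def encode(image):
    prev = [0] * len(image[0])   # Group IV assumes an all-white line above
    out = []
    for line in image:
        a, b = transitions(line), transitions(prev)
        out.append([t - min(b, key=lambda x: abs(x - t)) if b else t
                    for t in a])
        prev = line
    return out

# The middle of a "C": a 3-pixel black stroke drifting one column per line.
image = [[0] * i + [1] * 3 + [0] * (7 - i) for i in range(5)]
codes = encode(image)
# Every line after the first encodes as tiny offsets (here, -2..+1),
# which is why shapes like this compress so well.
```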

I've not studied gray scale compression so I don't know what methods are used for it.