Making text difficult for OCR?

Dr. Evil drevil at sidereal.kz
Thu Aug 9 11:14:20 PDT 2001



Decoy writes:
> Some ideas: Start with a highly ornate script font. Anti-alias. Try a font
> with lots of gaps and other topology breaking features. Pluck out a decent
> perceptual model from one of the better image compressors and try doing
> maximum modifications beneath a given perceptual error bound. Low contrast,
> with information encoded in the hue channel.

Thank you, excellent suggestions.  I put in the anti-aliasing thing and
that is clearly something which will piss off an OCR device.  Here's
an example of my current attempt at something which would be
nigh-impossible to OCR reliably:

http://www.sidereal.kz/~drevil/anti-ocr.png

What do you think?  It's got some shear, some rotate, some spread, and
some swirl.  And anti-aliasing, which was a great suggestion.  I'm
going to experiment with adding some dotted lines, too.
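
To make the shear and swirl idea concrete, here is a minimal Python
sketch.  It is not the code behind anti-ocr.png; the names warp and
bilinear, and the shear and swirl parameters, are made up for this
illustration.  It assumes the image is a plain list of rows of
grayscale values between 0.0 (background) and 1.0 (ink).  Sampling
with bilinear interpolation also smears hard edges into intermediate
grays, which is a crude stand-in for anti-aliasing.

```python
import math

def bilinear(img, x, y):
    """Sample img at fractional (x, y); anything outside the image is 0.0."""
    h, w = len(img), len(img[0])
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0

    def px(ix, iy):
        return img[iy][ix] if 0 <= ix < w and 0 <= iy < h else 0.0

    top = px(x0, y0) * (1 - fx) + px(x0 + 1, y0) * fx
    bot = px(x0, y0 + 1) * (1 - fx) + px(x0 + 1, y0 + 1) * fx
    return top * (1 - fy) + bot * fy

def warp(img, shear=0.3, swirl=1.5):
    """Apply a swirl and then a horizontal shear by inverse mapping:
    for each output pixel, work out where it comes from in the source."""
    h, w = len(img), len(img[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    radius = max(cx, cy) or 1.0
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = x - cx, y - cy
            # Swirl: rotation angle (radians) fades to zero at the edge.
            r = math.hypot(dx, dy)
            theta = swirl * max(0.0, 1.0 - r / radius)
            sx = dx * math.cos(theta) - dy * math.sin(theta)
            sy = dx * math.sin(theta) + dy * math.cos(theta)
            # Shear: shift x in proportion to y.
            sx += shear * sy
            out[y][x] = bilinear(img, sx + cx, sy + cy)
    return out
```

Calling warp(img, shear=0.0, swirl=0.0) gives back the input
unchanged, so the two knobs can be turned up independently to see how
much distortion a human reader still tolerates.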

Ken Brown writes:
> Write by hand. Change your handwriting style part-way through.

As you may be aware, Doctors have notoriously bad handwriting, but
alas, it must be machine-generated.

> If the colours were chosen carefully, off-the-shelf OCR would also be
> confused because the contrast of the edges would be colour, not
> intensity. OTOH it would be circumventable by enhancing colour contrast
> then switching to monochrome & enhancing edges. Anything you can do in

Right, screwing around with color contrast may not be effective at
all, because those things are very easy to take out by just editing
the color map, and suddenly it's sharp and clear.
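
As a rough illustration of why this is weak, here is a small Python
sketch using only the standard colorsys module.  The function names
and the example colours are assumptions made up for this sketch, and
pixels are assumed to be (r, g, b) tuples with components between 0.0
and 1.0.  Two colours can have almost the same luminance, so an
intensity-only reader sees nearly nothing, yet one pass that remaps
pixels by hue turns the image into clean black on white.

```python
import colorsys

def luminance(rgb):
    """Approximate perceived brightness (Rec. 601 weights)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b

def remap_by_hue(pixels, text_hue, tolerance=0.05):
    """Return 0 (black) for pixels whose hue is near text_hue, else 1 (white)."""
    out = []
    for r, g, b in pixels:
        h, _, _ = colorsys.rgb_to_hsv(r, g, b)
        d = abs(h - text_hue)
        d = min(d, 1.0 - d)  # hue is circular, so 0.98 is close to 0.0
        out.append(0 if d <= tolerance else 1)
    return out

# Hypothetical colours: reddish "ink" on a greenish background of
# nearly equal luminance.
ink = (0.6, 0.2, 0.2)
paper = (0.2, 0.4, 0.2)
bitmap = remap_by_hue([ink, paper, ink], text_hue=0.0)
```

Here luminance(ink) and luminance(paper) differ by well under 0.01,
but bitmap separates ink from paper completely.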

> 10 minutes with Paint Shop Pro is probably doable by the Men In Black
> sooner or later. 

Ah, but the Men in Black are not the threat in this case.  Au
contraire!  This is part of an anti-money-laundering thing.

> As an afterthought, the experts in this must be the people who print
> banknotes. Real ones, I mean, not your boring US green ones that are
> all the same size and colour so foreigners can't tell them apart and
> you have to employ millions of Secret Service agents to stop
> forgers. I bet the Bank of England go on about it on their website.

I'll check there.  They are mostly concerned with a different problem,
which is duplication.  I want to make stuff that humans can read, and
machines can't.  It's all going to be PNGs, so machines can duplicate
it no problem.





More information about the cypherpunks-legacy mailing list