Tuesday, October 19, 2010

A Computer Science Problem

I've run into a mildly interesting problem that some of my readers may be able to solve.  (The rest of you will get about two sentences in and go, "Huh?"  That's fine!)

Here's the problem: a universe of numbers 0-99999, of which only about 6500 are used at any time.  I need a function that maps that sparse universe into a dense universe, and vice versa.  (Yes, this sounds like a hash table problem: maybe, maybe not, keep reading.)  One function converts the sparse universe number into its dense universe counterpart; the other function converts the dense universe counterpart to its sparse universe counterpart. 

If this were a one-time use, or if the numbers actually used in the sparse universe did not change, this would be an obvious hash table solution.  The problem is that while the total size of the dense universe does not change appreciably, which numbers are used in the sparse universe does change with some regularity, as some numbers are released, and others, previously released, commit some new crime, and come back into use.  There is no practical way to store the translation, and it is important that the conversion from sparse to dense be consistent, and that we do not reuse numbers in the dense universe.

Now, it is entirely possible that there is no such solution.  It strikes me as I write this that the requirement that we not reuse numbers in the dense universe means that it will almost certainly have to be as large as sparse universe.  But I would be curious to see if anyone has an elegant solution to this.

The core problem is that there is not much space for a bar code containing the offender's five digit number.  Narrow bar codes require the prisoners get close enough to the prisoner to hold the bar code up to the reader--and some prisoners really don't like correctional officers that close.  (And some prisoners, of course, you do not want to get that close to, either.)


  1. Yeah, the "no re-use" requirement makes it pretty awkward from an algorithmic perspective, especially if you want just a function that maps them, rather than a backing database.

    I can't think of a way to prevent reuse (except by the original target) other than a database, and indeed one with an external key to identify the original holder.

    Since you have to have that lying around anyway, having the database do the mapping to the small ID sounds like the obvious thing.

    (Of course, if we are really talking about prisoners and the like, why not use something you can read remotely, like RFID?

    Or an optical barcode scanner on a camera with a good zoom...

    The other thing about barcodes is that there are barcode schemes that aren't base 10, rather a lot of them.

    At my job, we generate Code-39 barcodes for scanner lookup, and Code-39 allows A-Z, 0-9, and some punctuation.

    I believe it also gets you denser representation in terms of data-per-linear-area then UPC.

    And all of this, of course, ignores the newer 2D barcode methods, which pack even more information into less space.

    As is so often true, perhaps the solution is physical, not logical.)

  2. We use Code-39 extended. 2D barcode methods? I think I saw some mention of this. I'll take a look.

    Unfortunately, we are constrained by budgets and existing badge sizes. Unfortunately, the Treasure of Sierra Madre solution ("Badges? Badges? We don't need no stinking badges!") isn't avaiable.

  3. How about a bitmap, compressed with run-length encoding? You're sure to have many long runs of zero bits.

  4. That's only 12 kilobytes of data. How small does it need to be?

  5. The difficulty here is that we can't very easily save the mapping. As it happens, I think we have solved the problem in another way.

  6. RFID chip. The worse the prisoner the more you turn up the power in the "reader". They don't want to wear a band, stick it under their skin.

  7. if you know when the old numbers are retired, and have a lookup table as to where they are in the dense universe, you can simply overwrite the old dead number with the next new number in the list when it is assigned. this sounds like a simple database sort problem to me. Anything is possible if you have enough computing power and storage available....=b

    But remember this is from a knuckle dragging mechanical designer who's pinnacle of computer programming was playing with quickbasic 4.5....

  8. Of course you could save the state a lot of money and just shoot the bastards instead...

  9. Uh, no. There are some really, really frightening, evil people in our prisons. But the best words to describe many of them are: sad; tragic; pitiful. There are a lot of prisoners who could probably do okay in a world without alcohol, meth, and other intoxicants. But they can't avoid doing incredibly stupid things when loaded.