So this is a little different from the usual fare here, but my colleague Tom Lee at the Sunlight Foundation has been thinking about using basic cryptographic concepts to convince governments to publish more unique identifiers in their datasets -- even when the identifiers they have in their *databases* are sensitive (like SSNs).
In some ways the problem of anonymizing unique data is easier here than elsewhere, because in some gov't contexts, personal identifiability isn't the problem -- the *intent* is to publish personally identifiable, connectable information, like for campaign donors and lobbyists. So the Mosaic Effect (re-identifying people by combining datasets, as happened when the Netflix data was de-anonymized) is less of a concern. Depends on the problem, though.
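To give a flavor of what "basic cryptographic concepts" could look like here, below is a minimal sketch. It is my own illustration and not necessarily what Tom proposes. It assumes the agency holds a secret key and publishes a keyed hash (HMAC-SHA256) of each SSN in place of the SSN itself. The function name `public_id` and the key are made up for the example. The key matters: SSNs have only nine digits, so a plain unkeyed hash could be reversed by hashing every possible number.

```python
import hashlib
import hmac

DIGITS = "0123456789"

def public_id(ssn: str, secret_key: bytes) -> str:
    """Derive a stable, publishable identifier from a sensitive one.

    Hypothetical helper: the same SSN always maps to the same output,
    so records stay connectable without exposing the SSN.
    """
    # Strip dashes and spaces so "000-00-0001" and "000000001" match.
    normalized = "".join(ch for ch in ssn if ch in DIGITS)
    digest = hmac.new(secret_key, normalized.encode("ascii"), hashlib.sha256)
    return digest.hexdigest()

# Assumption: this key stays private to the publishing agency.
key = b"example-key-kept-private-by-the-agency"

id_a = public_id("000-00-0001", key)  # obviously fake SSNs
id_b = public_id("000000001", key)    # same person, different formatting
id_c = public_id("000-00-0002", key)  # a different person

print(id_a == id_b)  # same identifier across datasets
print(id_a == id_c)  # different people stay distinct
```

The tradeoff is that only someone holding the key can compute the identifier, so identifiers line up across datasets only when the same key produced them.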
After talking about it on a couple of lists, Tom blogged it up: