This post was written more than four years ago. The world changes fast, and the information, conclusions, or attributions may or may not still be accurate. Check the sources and links, and email me if you have any questions.

As much as I’m a fan of government datasets being opened up for the public, researchers, journalists, and innovators to use, there’s also the issue of individual privacy. Sometimes, unfortunately, transparency and privacy are mutually exclusive — but I think there’s a better way for us to think about individual privacy in the information that government collects about citizens through everyday government administration of services.

Government has to collect information on individuals to do its job: the issuance of driver’s licenses, the court systems, consumer protection, tollway or transit passes, and real property ownership are all areas where we expect government to collect and store information about us, and citizens are generally okay with that. But the laws that protect that information don’t always keep up with the technology government uses to run those programs. Even with strong laws, there’s the potential for snooping government employees to improperly access and abuse citizens’ personal data.

I wrote about a perfect example last year: Automated License Plate Recognition (ALPR) systems, which law enforcement agencies use to automatically read license plate data. In Minnesota, police started using those systems before there was any legislation guiding proper use and especially before there was any legislation protecting the data captured and stored by those systems. The result was that the data — absent protections under state statutes — was public information and accessible to anyone and everyone.

If you’ve ever been arrested, you’ll quickly find that your mugshot ends up on the first page of Google results for your name. If you want it gone, prepare to pay up. Unscrupulous mugshot websites file government data requests with jails across the country to obtain mugshots and arrest records, which are classified as public data under the law, then post those images and optimize the pages so they rank for your name in search engines. It’s a dark business model preying on public records.

There have been many situations where government programs move faster than the legislation needed to protect the data those programs generate. Government-run tollway and mass transit programs might not have laws protecting the logs of everywhere you go. Municipal programs that require you to obtain a $25 pet license to use an off-leash park might be exposing your name, address, or even bank account information to anyone who wants it, inside or outside of government.

Citizens should be most concerned about the gaps in open government laws when large corporations snatch up that data, index it, and use it to build creditworthiness profiles about you or to market to you.

Open government laws generally create an assumption that all government data is public unless otherwise classified.  I think a better way to consider privacy inside the realm of open government laws is to create an additional assumption that any data government generates about individuals on an involuntary basis is private… with exceptions, to be sure.

If you contribute to a political campaign, get convicted of a violent felony, or hold a real estate license, these are all voluntary activities, and you can make a conscious decision about whether you’re okay with data being opened up about you in that context. But if you’re a victim of a crime, a government agency is collecting information about you that you never chose to hand over. Right now, crime victim information is public in most states, with certain exceptions.

Here are a few more examples:

  • Being born is involuntary, but your birth certificate enters the public record immediately upon your birth. So much for ‘what’s your mother’s maiden name and date of birth’ security questions.
  • Serving as a public official is voluntary, so expect your name and financial conflicts of interest to be publicly accessible.
  • Being a government employee is voluntary and paid with taxpayer dollars, so expect your salary, benefits, and negative information in your HR file to be public information.
  • Voting is voluntary, but isn’t exercising a constitutionally protected right in some respect involuntary? Current laws make public your voter registration data, voting history, and in some states your party preference.

Even with the involuntary-data-is-private framework, there are plenty of grey areas:

  • Mugshots: Committing a crime is arguably voluntary, but a mugshot is an involuntary generation of data as part of that arrest. So, perhaps the arrest and criminal charges are public and the mugshot is private — with exceptions for police conducting lineups and manhunts. And to throw a wrench in it, what about wrongful arrests and convictions? Current laws make all arrest records and mugshots public, even if you were wrongfully arrested.
  • Civil lawsuits: Being a plaintiff is voluntary, and being a defendant is (arguably) involuntary. Should there be a system where a defendant’s name doesn’t become public until/unless there’s an adverse verdict? I’d argue this deserves an exception and that all court data should be public — you know, justice.
  • Insurance complaints: It’s a voluntary act to file a complaint with an agency that oversees insurance providers, but if that’s the only reasonable recourse a consumer has, it’s also involuntary in a way. Making complaints public works against the goal of an honest insurance system, as most consumers wouldn’t want their information shared and may decide not to complain at all.
  • Real property: Buying a home is voluntary, but paying taxes on that property is not. Does that mean that property tax records should be private, but tract ownership data should be public? Also, getting a mortgage is voluntary, but it’s not a function of government in and of itself; your mortgage company filing your mortgage documents with the government (which they do) is involuntary. Current laws make all of this public.
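
To make the idea concrete, the framework can be pictured as a small default-classification rule: explicit exceptions come first, then involuntarily generated data about an individual is presumed private, and everything else stays public. This is only a rough sketch of the reasoning above, not a reading of any statute; the `Record` type, the field names, and the single entry in `EXCEPTIONS` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical explicit classifications that override both presumptions.
# The one entry mirrors the argument above that court data should stay public.
EXCEPTIONS = {
    "civil_lawsuit_defendant": "public",
}


@dataclass(frozen=True)
class Record:
    kind: str               # hypothetical label, e.g. "campaign_contribution"
    about_individual: bool  # does the record identify a person?
    voluntary: bool         # did the person choose the activity that created it?


def classify(record: Record) -> str:
    """Return "public" or "private" under the proposed presumptions."""
    # 1. An explicit exception always wins.
    if record.kind in EXCEPTIONS:
        return EXCEPTIONS[record.kind]
    # 2. Data generated involuntarily about an individual is presumed private.
    if record.about_individual and not record.voluntary:
        return "private"
    # 3. Everything else keeps the usual open-government presumption.
    return "public"


print(classify(Record("crime_victim_report", True, False)))      # private
print(classify(Record("campaign_contribution", True, True)))     # public
print(classify(Record("civil_lawsuit_defendant", True, False)))  # public
```

The hard part, as the grey areas show, is not the rule itself but deciding what counts as voluntary and which exceptions belong on the list.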

Existing legal frameworks are terrible when it comes to context. In Minnesota, if you want to know what happened in a traffic accident, it’s private data in the context of the official crash record report, but it’s public in the context of police response data. Your driver’s license number is private data, but officers will read that number right over the radio. Not only can it be intercepted, it’s also public data.

If laws were enacted to lock down a bunch of personal information on citizens, the work that researchers and journalists do to uncover trends and big-picture statistical information would surely be hampered. For that situation, I think healthcare has it right: you have an absolute right to privacy, but in the interests of advancing the medical sciences and building better care and payment models, your data might be deidentified or summarized in a big anonymous file.
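
As a rough illustration of what summarized data could look like, the sketch below collapses individual records into counts per group and leaves out any group too small to hide an individual. It is a simplified example under stated assumptions, not a description of how healthcare rules or any state statute define deidentification; the `summarize` function, the `MIN_GROUP_SIZE` threshold of 5, and the sample records are all made up for illustration.

```python
from collections import Counter

# Hypothetical threshold: groups smaller than this are left out of the
# summary so that a single person cannot be picked out of a tiny group.
MIN_GROUP_SIZE = 5


def summarize(records, group_field, min_group_size=MIN_GROUP_SIZE):
    """Count records per group and drop every individual-level field."""
    counts = Counter(record[group_field] for record in records)
    return {group: n for group, n in counts.items() if n >= min_group_size}


# Made-up sample records: six people in area "A", one person in area "B".
records = [
    {"name": f"Person {i}", "area": "A", "plate": f"ABC{i:03d}"} for i in range(6)
] + [{"name": "Person X", "area": "B", "plate": "XYZ999"}]

print(summarize(records, "area"))  # {'A': 6}
```

Names and plate numbers never reach the output, and the lone record in area "B" is withheld rather than published as a count of one.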

Luckily, at least in a few states, this ‘summary data’ has special protections that guarantee it will be public. So if the big-picture data stays open either way, why not make a better effort to preserve individual privacy? The problem is only going to get worse.