This post was written more than four years ago. The world changes fast, and the information, conclusions, or attributions may or may not still be accurate. Check the sources and links, and email me if you have any questions.

It seems everyone loves open and accessible government data when the application of that data matches our values and interests, but the moment it crosses our own personal boundaries, we’re quick to call it a violation of privacy. That’s exactly what happened when the Minnesota State Patrol announced that it would live-tweet the names of drivers arrested for drunk driving during a special enforcement event.

But before I get into that, let me provide a couple other examples of selectively inconvenient open data.

Example 1: Licensing data
Let’s say that you’re an insurance salesperson and you have a child. You think it’s great that you can go online to verify a school teacher’s licensure status and any complaints and filings because it ensures the quality of your child’s education. But as an insurance salesperson, you’re probably licensed by the same government entity that licenses teachers, and it’s all accessible online to ensure your clients’ quality of financial representation. How do you feel about your clients and competitors being able to look up your licensure status and complaints? Does it change anything if that government data is easily viewable in Google with a simple search of your name?

Example 2: Sex offender data
As another example, let’s say you’re still that parent and your child just turned 18. A sex offender recently moved into your neighborhood. You think it’s great that the police are so proactive in spreading around their mugshot and personal information so your family can watch out. Then your child gets his 17-year-old partner pregnant through consensual sex, her parents find out, and your 18-year-old son gets charged with statutory rape and child molestation because you live in a state without a close-in-age exception. Your child’s mugshot and personal information get spread to your neighbors and he’s forever classified as a sex offender. How do you feel about that? Does it change your opinion if the government actively promotes that data by optimizing a sex offender registry page for search engines, or by flyering a neighborhood with posters? Does it make it not so bad if all of the court documents are easily accessible and searchable, and they reveal the specific set of circumstances so people can see that it’s just a technicality based on a bad law?

These examples present an issue of values — we value having convenient and accurate information in order to have awareness and make decisions. That information and data is even more important when there might be wrongdoing by someone in a perceived position of power, and the availability of the full set of circumstances is important so that individuals can take in all of the data and draw their own conclusions.

So back to drunk drivers.

Arrest data in Minnesota is already public under the law, and publishing information about individuals arrested for crimes is nothing new. Journalists publish police blotters, mugshots, criminal complaints, and allegations every single day. Regardless of the presumption of innocence, people whose arrests are published by an online news source may be tried in the comment threads, and the real punishment may be the permanence and visibility of that information in Google. This abuts the issue of the online publication and search accessibility of mugshots, which I’ve written about before.

Public data vs. published data

So, there’s clearly some difference between data that is technically public under transparency laws and data that is actively published by a news agency or government, with the latter perceived as “more public.” But when government agencies publish some data and not all of it, they’re pushing an agenda and taking an active role in curating a story.

In this case, the Minnesota Department of Public Safety’s agenda is to curb drunk driving. I love that agenda, but what I don’t love is that only some information is selectively accessible.

In many counties and municipalities, to actually get police reports and arrest data, you have to submit a request under the state’s open government law(s). In Minnesota, state law gives police an extraordinary amount of time to respond to that request, especially if you aren’t the subject of the data (e.g. if you aren’t the victim or suspect). And when you get the data back from that law enforcement agency, it’s more than likely not going to be in a consumable digital format. At best, you’ll get a PDF and at worst, you’ll get printed pages, partly due to ineptitude and partly out of an intent to stop digitization and consumption of the data.

If law enforcement agencies made machine-readable JSON files of individuals being arrested, along with all of the relevant charge and bail data, some amazing tools could be created by journalists and community activists. Law enforcement could focus on upholding the law, and journalists and developers could build:

  • An app for concerned citizens that sends you an iPhone alert when someone is arrested on your block
  • An app for news media to find celebrities and public officials’ names in arrest reports
  • An app for researchers to do large-scale data analysis of crime trends and arrest discrepancies
  • or… the ability to tweet out every single arrest for no reason at all.
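
To make the first idea on that list concrete, here is a rough sketch in Python of how a block-level alert might work if such a feed existed. Everything in it is an assumption for illustration: the feed, the field names (`arrest_block`, `charges`, `bail_usd`, `booked_at`), and the sample records are made up and do not reflect any agency’s real schema or data.

```python
import json
from dataclasses import dataclass

# Hypothetical feed contents. The field names and records are invented
# for illustration; no real agency publishes this schema.
SAMPLE_FEED = """
[
  {"name": "Jane Doe", "arrest_block": "1200 block of Example Ave",
   "charges": ["DWI"], "bail_usd": 0, "booked_at": "T+02:15"},
  {"name": "John Roe", "arrest_block": "300 block of Sample St",
   "charges": ["Trespassing"], "bail_usd": 500, "booked_at": "T+03:40"}
]
"""


@dataclass
class Arrest:
    name: str
    arrest_block: str
    charges: list
    bail_usd: int
    booked_at: str


def parse_feed(raw: str) -> list:
    """Turn the raw JSON text into a list of Arrest records."""
    return [Arrest(**record) for record in json.loads(raw)]


def arrests_on_block(arrests: list, block: str) -> list:
    """Return arrests whose location matches the block, ignoring case."""
    wanted = block.strip().lower()
    return [a for a in arrests if a.arrest_block.strip().lower() == wanted]


arrests = parse_feed(SAMPLE_FEED)
for arrest in arrests_on_block(arrests, "1200 block of example ave"):
    # The alert deliberately leaves out the arrestee's name.
    charges = ", ".join(arrest.charges)
    print(f"Arrest on {arrest.arrest_block} at {arrest.booked_at}: {charges}")
```

Note that the alert keys on where the arrest happened and never prints a name. The same dozen lines could just as easily print names instead, which is exactly the double edge of machine-readable data.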

If government data is open and universally accessible, the door is open to innovation… but it’s also open to potentially painful scenarios.  If you’ve ever been in a car crash, you’ve surely received piles and piles of solicitations from ambulance chasers — that is, massage therapists, chiropractors, and law firms looking to make a buck from insurance payouts.

But here’s the thing: in the large counties in the Twin Cities, all arrests are already processed in online systems and searchable. The data isn’t machine-readable, but it’s already public: your full name, date of birth, charges, bail, home address, and processing timestamps. The only thing that’s being added in this story is the publication of that data by a law enforcement agency, which makes this a matter of public perception:

“I would like to see a list of all the cops that have avoided DUI arrest because they are filthy pigs”





I’m in strong agreement with:


It’s partially that, and it’s partially that arrests are such a hot topic. Is there a government agency tweeting out every single government employee pay raise, complaint against police officers, citizen complaint against a utility company, or building code violation? Each of those is public data.

I share the concern about arrest data unfairly haunting people for years, and I think part of the solution could be a reform of open government laws that does two things: requires government agencies to release data through live APIs and machine-readable formats, and has state legislatures impose restrictions that disallow certain uses of government data. I think most people would agree with the premise of getting crime alerts on their block, but disagree with the premise of forever ruining someone’s life over certain victimless crimes.

That said, tweeting a DWI isn’t going to ruin someone’s life, but your drunk driving could end someone else’s. Unfortunately, our law enforcement and criminal justice system leaves a lot to be desired. One could be arrested and jailed for three days simply on the reasonable suspicion and probable cause that they might be drunk. I’m no DWI lawyer, but I’ve definitely seen folks jailed for three days on probable cause DWI, and then released without charges after the 72-hour hold expires. It would be a shame for those folks to have their information plastered online if they were arrested simply because an officer didn’t like them.

“Where the photograph is taken of a prisoner who is subsequently discharged upon trial, the photograph and the records ought to be destroyed; because, if the man is innocent, there is no reason why his picture and measurements and pedigree and the crime with which he was charged should be open to the public gaze… The man has committed no crime, for a jury of his peers have said so. In my opinion, it ought to be made a misdemeanor for the Police Department to photograph or measure a man merely charged with a crime. That ought to be done only after he is convicted, and should form part of the records in the Police Department of convicted criminals.”

— Hon. Alfred E. Ommen, New York City Magistrate, c. 1905

If you can’t tell, this is a complex issue — but there is middle ground. Revealing the arrest location instead of an arrestee’s home address would be smart, and not releasing mugshots online and generally having serious mugshot data practices reform would be even better.

I’ll end with the crazy: