A quick guide for journalists: how to spot new domain registrations, recently-issued SSL certificates, and new servers to report on political, business, and government initiatives.

When candidates for public office announce their run, their website tends to already be set up to allow donor cash to start flowing in. When new businesses and products launch, a polished marketing presence is frequently online before their announcement. Companies and candidates merely thinking about something frequently register domain names just to make sure nobody else snatches them up.

What they don’t always think about, however, is that—absent some very careful planning—everything on the internet is tracked and searchable. I thought I’d write a little bit about how journalists can use that data to be the first to break news. Don’t worry, I won’t get too technical.

*

Spotting new domain registrations

Every time a domain name is registered, it’s included in a list that gets distributed among internet operators. This data is frequently used by domainers—people who buy and sell domain names for profit—and by companies wishing to protect their brand. Some vendors provide limited free access to search new registrations, and others offer advanced search capabilities and even registration and change alerts. As you can imagine, this can be pretty useful if you’re a reporter.

DomainTools is the leading company in this industry. They provide a variety of subscription services, including historical registration data that can be used for investigative purposes, but they can get rather expensive.

I like using Domain Punch, which offers free searches of new registrations on common top-level domains (like .com and .org). As an example, I’m in Minneapolis, so I frequently search for phrases like “minnesota” and “mpls”. Searching “mn” can fill the results rather quickly, but “formn” (as in ‘for Minnesota’) is frequently used by political candidates.

Here’s an example: using Domain Punch, I searched “Minnesota” on a given day, and it shows six domains were added. Most of these are likely spam, but a few of them interest me. “minnesotapops.com” could be a new business, and “maryforminnesota.com” sounds like a political candidate.
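If you can get a day's new registrations as a plain list (some vendors offer downloads), the keyword search above is easy to repeat in a few lines of Python. This is only a sketch: the function name and the sample list are my own, and "example-shop.net" is a made-up filler entry.

```python
def match_new_domains(domains, keywords):
    """Map each domain to the keywords found in it (case-insensitive)."""
    hits = {}
    for domain in domains:
        name = domain.strip().lower()
        if not name:
            continue
        matched = [kw for kw in keywords if kw.lower() in name]
        if matched:
            hits[name] = matched
    return hits


# Illustrative input only; a real list would come from a registration data vendor.
new_registrations = [
    "minnesotapops.com",
    "maryforminnesota.com",
    "example-shop.net",
    "maryformn.com",
]

hits = match_new_domains(new_registrations, ["minnesota", "mpls", "formn"])
for name, matched in hits.items():
    print(name, matched)
```

Plain substring matching is deliberately crude. It will surface spam along with the interesting names, so the results still need a human read.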

Sites that have been recently registered usually don’t have any content on them yet, but domain registration data—also known as WHOIS data—is often interesting. DomainTools provides a free WHOIS lookup service, though it’s important to know that this data can be forged to make it look like someone else, and the data can be obscured by using a private proxy registration service.
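For the curious, WHOIS is a very simple protocol: connect to a server on TCP port 43, send the domain name, and read the text that comes back. Here is a rough Python sketch. The helper names, the list of privacy hints, and the sample reply are my own assumptions. Note that whois.verisign-grs.com is the registry server for .com and .net, and it generally returns registrar details rather than the registrant; fuller records usually live on the registrar's own WHOIS server.

```python
import socket

# Words that often appear when a privacy or proxy service is in use (my own guess list).
PRIVACY_HINTS = ("privacy", "redacted", "proxy", "whoisguard")


def whois_query(domain, server="whois.verisign-grs.com", timeout=10):
    """Send a raw WHOIS query over TCP port 43 and return the text reply."""
    with socket.create_connection((server, 43), timeout=timeout) as sock:
        sock.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_whois(text):
    """Collect 'Key: Value' lines into a dict of lists, since keys can repeat."""
    record = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if sep and key and value:
            record.setdefault(key, []).append(value)
    return record


def looks_private(record):
    """Guess whether the registrant fields point at a privacy or proxy service."""
    for key, values in record.items():
        if key.lower().startswith("registrant"):
            if any(hint in v.lower() for v in values for hint in PRIVACY_HINTS):
                return True
    return False


# Made-up reply for illustration; real replies vary a lot between servers.
sample_reply = """Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Creation Date: 2017-03-01T12:00:00Z
Registrant Organization: Privacy Protection Service
Name Server: NS1.EXAMPLE.NET
Name Server: NS2.EXAMPLE.NET
"""

record = parse_whois(sample_reply)
print(record["Creation Date"], looks_private(record))
```

WHOIS output is not standardized, so treat the parser as a starting point, and remember that the fields themselves can be forged or hidden.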

In the case of “minnesotapops.com,” I can see it’s registered to a prominent musician. He could be launching something new, and if that were a subject I was researching or reporting on, I’d likely be one of the first people to find out.

“maryforminnesota.com,” however, is far more interesting to me. WHOIS data shows it’s registered to the mayor of a metro-area city, suggesting that she might be considering a run for governor. As I search some of my other common phrases, that suggestion becomes rather clear:

There’s been no mention of her candidacy in news articles, in public Facebook posts, or on Twitter, so I might be one of the first to know, simply by using domain registration data.

*

Certificate Transparency logs

Another way to find new websites is by using certificate transparency (CT) logs.

When a website obtains a TLS certificate—also known as an SSL certificate, which encrypts data and shows the lock icon in your browser—some certificate authorities issuing those certificates report the information publicly. This is called certificate transparency, and it’s a growing trend because it allows web browsers to have better information to ensure users are on legitimate websites.

These CT logs can be searched on a number of websites, and crt.sh tends to be my favorite because it includes an RSS feed of search results if you’d like to set up alerts. Keep in mind, though, that not all certificates will be included.

As an example, I’ll search “%formn.com” and “%formn.org” – the % is the wildcard symbol.
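The same searches can be scripted. The sketch below builds the crt.sh search address for a pattern, and assumes the site's `output=json` option for machine-readable results. The function names are mine, and the fetch helper is shown but not run here because it needs network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CRT_SH = "https://crt.sh/"


def crtsh_url(pattern, as_json=False):
    """Build a crt.sh search URL; the % wildcard is escaped as %25."""
    params = {"q": pattern}
    if as_json:
        params["output"] = "json"
    return CRT_SH + "?" + urlencode(params)


def fetch_certificates(pattern, timeout=30):
    """Download matching log entries as a list of dicts (requires network access)."""
    with urlopen(crtsh_url(pattern, as_json=True), timeout=timeout) as response:
        return json.load(response)


print(crtsh_url("%formn.com"))
print(crtsh_url("%formn.org", as_json=True))
```

Escaping matters here: a bare % in a web address starts an escape sequence, so the wildcard has to be sent as %25.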

Most of these are just certificate renewals for existing websites, because certificates expire after a set period: often about a year at commercial authorities, or 90 days at Let’s Encrypt. You’ll see a lot of Let’s Encrypt certificates because they’re free.
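One way to separate renewals from genuinely new sites is to look at the earliest certificate logged for each name: if a name's first certificate is recent, the site is probably new. Here is a sketch, assuming entries shaped like crt.sh's JSON output, with a `name_value` field (one or more names, one per line) and a `not_before` issue date. The sample entries and function names are invented for illustration.

```python
from datetime import datetime, timedelta


def first_seen(entries):
    """Return the earliest certificate date recorded for each hostname."""
    earliest = {}
    for entry in entries:
        issued = datetime.fromisoformat(entry["not_before"])
        for name in entry["name_value"].splitlines():
            name = name.strip().lower()
            if name and (name not in earliest or issued < earliest[name]):
                earliest[name] = issued
    return earliest


def newly_seen(entries, today, days=30):
    """Hostnames whose first logged certificate falls within the last `days` days."""
    cutoff = today - timedelta(days=days)
    return sorted(n for n, d in first_seen(entries).items() if d >= cutoff)


# Invented sample data: one brand-new name and one site that simply renewed.
sample_entries = [
    {"name_value": "newnameformn.com\nwww.newnameformn.com", "not_before": "2017-06-20T00:00:00"},
    {"name_value": "oldnameformn.org", "not_before": "2016-01-10T00:00:00"},
    {"name_value": "oldnameformn.org", "not_before": "2017-06-15T00:00:00"},
]

fresh = newly_seen(sample_entries, today=datetime(2017, 7, 1))
print(fresh)
```

This only works as well as the logs do. A name with no earlier entry may still be an old site whose previous certificates were never logged.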

I see “maryformn.com” again in these logs, which suggests to me that not only is this person considering a run for governor, but perhaps actively building a website.

I’ll do another search for “%mpls.com” and “%mpls.org” (the abbreviation for Minneapolis):

There are many businesses in the list, particularly restaurants and bars in Minneapolis. Again, it’s mostly renewals, but it would be a good place to spot a new business coming to town, or a new political candidate.

One thing interests me: “dev” and “stage” URLs for “achievempls.org”. In the web development industry, new websites are frequently built at “development” or “staging” links before being pushed live. In this case, the site has already been published and this is just a renewal. But had I spotted this earlier, I might have been able to get a peek at a new website before the official launch.
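Spotting those pre-launch names can be automated too. The sketch below flags hostnames whose subdomain contains a label such as "dev" or "stage". The label list is my own guess at common conventions, the hostnames are placeholders, and the "last two labels" shortcut is a simplification that fails for domains like example.co.uk.

```python
# Subdomain labels that often mark unreleased sites (an assumed, incomplete list).
PRELAUNCH_LABELS = {"dev", "stage", "staging", "test", "beta", "preview"}


def prelaunch_hosts(hostnames):
    """Return hostnames whose subdomain part contains a pre-launch label."""
    flagged = []
    for host in hostnames:
        labels = host.lower().strip(".").split(".")
        # Simplification: treat the last two labels as the registered domain.
        subdomain = labels[:-2]
        parts = [part for label in subdomain for part in label.split("-")]
        if any(part in PRELAUNCH_LABELS for part in parts):
            flagged.append(host)
    return flagged


# Placeholder hostnames for illustration.
hosts = [
    "example.org",
    "www.example.org",
    "dev.example.org",
    "stage.example.org",
    "new-staging.example.com",
    "*.example.org",
]

flagged = prelaunch_hosts(hosts)
print(flagged)
```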

Searching “%.mn.gov” or “%.stpaul.gov,” for example, is a good way to see new initiatives, new technologies, and the use of previously-unknown vendors within those government agencies. That information can lead to a media inquiry or a public records request.

Searching “%.uber.com,” as another example, shows a number of internal services that Uber uses, and a number of internal projects. This could be used for a competitive advantage for other companies, or by a journalist writing about Uber’s activity.

*

Shodan

Finally, there’s Shodan—the “search engine for the Internet of Things.”

A typical search engine like Google crawls and indexes websites, and websites are generally served from port 80 (HTTP) and port 443 (HTTPS). But there are many more port numbers: 65,535 to be exact. For example, File Transfer Protocol (FTP) runs on port 21, the MySQL database server runs on port 3306, and the Disney game Club Penguin uses port 6113.

Shodan does something called port scanning, which can at times fall into a legal and ethical grey area. It’s not hacking, but rather just initiating a communication attempt to a specific port number to see if anything is there to respond. You can think of it like calling phone numbers sequentially just to see which ones are active. Sometimes, when a service responds on a port number, it divulges “banner information,” which is a server’s way of picking up the phone and saying “Hello, this is ___.”
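To make the phone-call analogy concrete, here is roughly what reading a banner looks like in Python. It is a sketch with function names of my own, and it should only ever be pointed at servers you run or have permission to test. The demonstration at the bottom uses a local socket pair as a stand-in for a real server, so nothing is scanned, and the greeting text is made up.

```python
import socket


def read_banner(sock, max_bytes=1024, timeout=5):
    """Read whatever greeting the service sends first, or '' if it stays silent."""
    sock.settimeout(timeout)
    try:
        data = sock.recv(max_bytes)
    except socket.timeout:
        return ""
    return data.decode("utf-8", errors="replace").strip()


def grab_banner(host, port, timeout=5):
    """Connect to host:port and return its banner (use only with permission)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return read_banner(sock, timeout=timeout)


# Local demonstration: one end of a socket pair plays the part of an FTP server.
server_end, client_end = socket.socketpair()
server_end.sendall(b"220 Example FTP server ready\r\n")
banner = read_banner(client_end)
server_end.close()
client_end.close()
print(banner)
```

Not every service speaks first. Web servers, for example, wait for a request, which is why a scanner has to send something before it gets an answer on those ports.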

Shodan archives all that data and makes it available for full-text searching, along with allowing for advanced searches by IP address range, organization name, domain name, and more.
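Those advanced searches are written as filters in the form name:value, such as `port`, `org`, `hostname`, and `net`. The helper below is my own small sketch for composing such query strings. The queries it prints are illustrations of the syntax, not necessarily the exact searches used in this article, and some filters may require a Shodan account.

```python
def shodan_query(text="", **filters):
    """Compose a Shodan search string from free text and name:value filters."""
    parts = [text] if text else []
    for name, value in filters.items():
        value = str(value)
        if " " in value:  # values containing spaces must be quoted
            value = '"' + value + '"'
        parts.append(f"{name}:{value}")
    return " ".join(parts)


print(shodan_query(port=21, org="University of Minnesota"))
print(shodan_query(hostname=".cook.mn.us"))
print(shodan_query("ftp", net="192.0.2.0/24"))
```

The resulting string can be pasted into Shodan's search box, or passed to the `search` method of the official `shodan` Python package if you have an API key.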

Back to the topic of ethics, though. Port scanning itself is generally accepted as legal, while attempting to actively authenticate to a service without permission is generally thought of as not legal. Of course, there have been prosecutions for port scanning, but those cases tend to be against hackers who had ill intent and caused damage. The great part about Shodan is that the service has already performed the scanning; you’re just searching its results, which are publicly posted on the internet for anyone to use. Whether you use it or not, the port scanning has already been done. I think that resolves any ethical quandaries.

The information in Shodan can reveal what technologies government or businesses are using, the names of new projects or services, and more.

For example, I searched for all FTP (File Transfer Protocol) servers at the University of Minnesota, and found 303 results:

Most of the results are for unsecured surveillance cameras and printers—that’s a scary story in itself—but there are some file servers. Shodan shows you the banner information:

This might reveal information useful to formulate a story, to ask further questions, or to launch a freedom of information request.

As another example, I’ll search for all websites in Cook County by matching “.cook.mn.us,” and I find they have a public-facing server that appears to be for issuing permits. Again, if that was something I was interested in, it might be useful to ask more questions or formulate a data request.

(Sorry, Mary.)