Recently, when speaking of a cyber case, I said that if your criminals have got an IQ of 101 or greater, and if they’re not pathologically lazy, they’re going to anonymize their traffic to the email accounts they use to manage the accounts they open at Internet retailers. You’re not going to get – or, more precisely, they’re not going to give you – an IP address leading to their house.
I went on:
“Over the past, say, decade – the period over which the old ‘IP-address-is-a-phone-number’ saw made its way from the exasperated cyber cop who was trying, through clenched teeth, to explain something to his superannuated boss, to a criminal justice system legend as difficult to shake as the one about alligators in the Cleveland sewer system – cyber criminals have recognized that leaving their real IP address on a server is the single most incriminating thing they can do. In that same timeframe, the cost and complexity of obscuring one’s true IP address have both been reduced to, literally, ‘free’ and ‘one click’.”
That’s true enough, but we’ve gotten several questions about this from DAs and cops in several states, so this post will talk a bit about basic operational security (OPSEC) from the standpoint of how criminals (and everyday citizens, actually) attempt to avoid (or make more difficult) being tracked on the Internet.
We’ll come back to this topic several times because it’s got huge ramifications for law enforcement and prosecutors.
And I will say this again: to get this right, we will have to mess up and lose cases, seize things with the best of intent and the most probable available cause, only to discover we were wrong. If we don’t do that, we will stay exactly where we are, which is at the primordial left end of the cyber crime timeline.
I am not by any means suggesting that we can neglect our duty to establish probable cause – we must tirelessly strive to protect the constitutional rights of everyone.
But because we have entered uncharted territory, we need to set some precedent – and setting precedent has never been, and will never be, a walk in the park.
From What Are They Hiding?
To understand what people are trying to hide through their OPSEC, we should probably first have a look at what they’re hiding from.
On the Internet, basic attribution – the term of art which means “whodunnit” – is accomplished through your IP address and your computer information, or environmentals. Both can be forged.
In this Part I, I’ll talk a bit about the IP address. In Part II, I’ll talk about computer environmentals. Then in Part III I’ll talk about methods of covering one’s tracks on the Internet – tools, techniques, tactics and procedures, or the TTTP, if you will, of Internet OPSEC.
Then, when we’re done with the background, we can get to the meat of it: what are some of the ways law enforcement agencies can find to build probable cause without an IP address; what can prosecutors ask officers to produce in terms of probable cause that is within the realm of the possible?
If you haven’t read the basics, go ahead to Wikipedia and read what they have to say on the subject, then come back.
The old saw about an IP address being like an Internet phone number is somewhat correct but also totally wrong. There is a range of reasons why this is true, but the one we in law enforcement are concerned with is understanding what happens at a certain IP address.
Specifically, an IP address can hide a few, hundreds, thousands or tens of thousands of other computers, known or not known, published or non-published, subpoena-able or non-attributable. A “phone number,” on the other hand, even a phone number in Syria, Uganda or Belgium, indicates a level of billing and account security proffered by a managing PTT or private company like AT&T, assuring that everything that happens on that number is managed and auditable. Not that it is cast in stone; see “caller-ID spoofing” for more information, but generally speaking, this is true.
The important thing is that IP addresses are used to route traffic across the chaotic, free-flowing, groovy-ass Internet, and not necessarily to provide attribution, integrity, security or accountability. You can see that on networks when routers and switches do the digital equivalent of yelling, “YO! Where’s Murray?” and the protocol is built in such a way that pretty much anyone can say, “Right here!” and that’s good enough to get the traffic.
Network Address Translation (NAT) means that one device – say, a router – can have a single IP address and then translate the traffic from that single address to the scores or thousands of computers behind it which are not visible from the public Internet.
Let’s go – briefly – back to the “phone number” analogy. NAT is similar to the concept of making a phone call from work: you hand out the number of the main switchboard, and when someone dials that, the operator connects the call from the public phone network to your private extension. The caller doesn’t know – or need to know – whether you are the only person at the company, or whether you’re one of thousands.
Many hosts behind the router have private IP addresses, relevant only to the local network you’re on, but when you want to present yourself to the world, you go through the main “switchboard”, or router, and present a single “main number”.
Let me pull you away, now, from the phone number thing: NAT is dissimilar to the concept of making a phone call from work by the very dynamic nature of the IP itself: there’s no capability of billing for or attribution to any specific host behind the NAT device, and no “phone company” can tie back any host behind the NAT. The entity owning and operating the device controls its knowledge of what traffic went where, and if they don’t keep records, or won’t share them, you’re SOL.
This might sound confusing, but you already do this yourself at home – if you have a WiFi hotspot or a network, you present one IP address to the world, which is translated at the box into which you plug your cable, which in turn doles out packets to the boxes that need it. Your kid’s multi-player game traffic, your wife’s business reports and academic research and your streaming Rocky & Bullwinkle cartoons are thus all routed to the right computer on your home network.
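To make the switchboard idea concrete, here is a minimal sketch of the bookkeeping a NAT device does. It is a toy model, not how any particular router is implemented: the class name `NatRouter`, the port numbers and the addresses are all assumptions chosen for illustration (the public address comes from a range reserved for documentation).

```python
import ipaddress


class NatRouter:
    """Toy model of a NAT device: one public address, many private hosts."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.table = {}  # public port -> (private ip, private port)

    def outbound(self, private_ip, private_port):
        """Rewrite an outgoing connection and remember the mapping."""
        if not ipaddress.ip_address(private_ip).is_private:
            raise ValueError("expected a private (inside) address")
        public_port = self.next_port
        self.next_port += 1
        self.table[public_port] = (private_ip, private_port)
        # This (address, port) pair is all the outside world ever sees.
        return (self.public_ip, public_port)

    def inbound(self, public_port):
        """Map a reply back to the inside host, if the mapping is still known."""
        return self.table.get(public_port)


router = NatRouter("203.0.113.5")                  # hypothetical public address
game = router.outbound("192.168.1.10", 51000)      # the kid's game console
reports = router.outbound("192.168.1.11", 51000)   # the business laptop
print(game, reports)             # same public address, different ports
print(router.inbound(game[1]))   # the router can still tell them apart

router.table.clear()             # no records kept, or records not shared
print(router.inbound(game[1]))   # None: the mapping is gone
```

The last two lines are the part that matters for an investigation: the only place the inside host is recorded is the translation table on the device itself, and once that is gone, nothing outside the network can rebuild it.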
Proxy servers on the Internet work pretty much the same way – if you take your computer and tell it that the way to get to the public Internet is through an anonymizing proxy, your browser doesn’t care, your home router doesn’t care – in fact nothing stops you from claiming to be coming to the public Internet from that proxy server, because you are.
In the diagram to the right, the proxy server stands between Charles (who is requesting information from Jonas) and Jonas (who responds). To Jonas, the request comes not from Charles, but from the Proxy; in fact, Jonas does not know that Charles exists.
One would typically encrypt the traffic from one’s home computer network to the proxy server – which can be located in a foreign country – so no one can intercept it en-route. Then, from the proxy server, you make your request to get to the carder forum or the retail shop website on which you place fraudulent orders. To the carder forum or what have you, the only person asking for the information is an anonymous proxy, behind which all is a black, mysterious hole.
Put another anonymizing proxy in the stream – that is, go from your connection to a first anonymizing proxy, thence to another anonymizing proxy, and thence to the public Internet – and you’ve made your web-surfing very slow, but finding you very difficult indeed.
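A rough simulation shows why chaining proxies makes the trail so cold. This sketch assumes each proxy simply substitutes its own address for the sender’s and passes nothing else along; the function name `relay`, the proxy names and all addresses are hypothetical, drawn from ranges reserved for documentation.

```python
def relay(request, proxies):
    """Pass a request through each proxy in order.

    Returns what the destination would log, plus what each proxy saw.
    Assumes every proxy replaces the source address with its own and
    does not forward the original client address.
    """
    seen_by = []  # (proxy name, source address that proxy observed)
    source = request["source"]
    for name, address in proxies:
        seen_by.append((name, source))
        source = address  # the next hop sees only this proxy
    logged = {"logged_source": source, "path": request["path"]}
    return logged, seen_by


charles = {"source": "198.51.100.23", "path": "/index.htm"}
proxies = [("proxy-1", "203.0.113.80"), ("proxy-2", "192.0.2.44")]

logged, seen_by = relay(charles, proxies)
print(logged["logged_source"])  # 192.0.2.44, the second proxy
print(seen_by)                  # only proxy-1 ever saw Charles's address
```

In this model, getting from the destination’s log back to Charles means obtaining records from proxy-2 first, then from proxy-1, each of which may sit in a different country and may keep no logs at all.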
The lesson here: just because it says the traffic is coming in to a site from Ulan Bator or Ouagadougou doesn’t mean it originated there.
Noting the Declared IP Address
The IP address from which someone hits a given resource – a “resource” in this context being a service on the Internet, like a server, an email account, a website, etc. – is listed in the request for the resource. It is then logged with information about what the browser or email client is asking for.
For example, if we take a look at a log entry for a request to my personal website today, we see this:
22.214.171.124 - - [11/Jun/2012:07:57:31 -0400] "GET /articles/technology/index.htm?a=1810 HTTP/1.1" 200 12893 nickselby.com "http://www.google.co.in/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0CFgQFjAB&url=http%3A%2F%2Fnickselby.com%2Farticles%2Ftechnology%2Findex.htm%3Fa%3D1810&ei=Jt3VT72IFYqjiAfgjMGSAw&usg=AFQjCNEBs1SNKnNfzfMDdL8LkYQERZL_Ww&sig2=gVEWnwWHRzeIgk9VQVJhWQ" "Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.19 (KHTML, like Gecko) Ubuntu/12.04 Chromium/18.0.1025.151 Chrome/18.0.1025.151 Safari/535.19" "-"
The first thing we see there is the IP address.
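If you have more than a handful of these entries, pulling the fields out by eye gets old fast. Here is a minimal sketch of a parser for a line laid out like the one above. The regular expression is my assumption about that layout (address, two placeholder fields, timestamp, request, status, size, site name, referrer, browser string), not a general-purpose log parser, and the sample address is a hypothetical one from a documentation range.

```python
import re

# Typographic characters that blogs and word processors substitute for
# the plain quotes and hyphens a real log file contains.
PLAIN = str.maketrans({"\u201c": '"', "\u201d": '"', "\u2033": '"', "\u2013": "-"})

LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) (?P<vhost>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not fit."""
    match = LOG_PATTERN.match(line.translate(PLAIN))
    return match.groupdict() if match else None


sample = (
    '203.0.113.7 - - [11/Jun/2012:07:57:31 -0400] '
    '"GET /articles/technology/index.htm?a=1810 HTTP/1.1" 200 12893 nickselby.com '
    '"http://www.google.co.in/url?sa=t" "Mozilla/5.0 (X11; Linux i686)" "-"'
)

entry = parse_line(sample)
print(entry["ip"], entry["time"], entry["path"])
```

Keep in mind that every field after the address is simply what the client claimed about itself; the parser makes the claims easier to read, not more trustworthy.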
With just that address, there are many open source tools that help bring some context and information, not all of which can be trusted but all of which help to build a profile of the address that you can use to analyze the request. The most basic information about a website, including the name, address and contact details that were claimed at the time the domain was registered, makes up the WHOIS record for that domain or address.
Note, there is nothing preventing anyone from registering with totally false information, though there are many clues contained within WHOIS records that the attentive analyst can make use of. For example, if someone has used similar information in other WHOIS records, you can tie two or more WHOIS records together. There’s more on this on Wikipedia.
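Tying records together by reused details is easy to sketch. The example below assumes you have already collected WHOIS records into simple dictionaries; the domain names, field names, phone numbers and email addresses are all made up for illustration.

```python
from collections import defaultdict


def shared_details(records, fields=("email", "phone")):
    """Find registration details that appear in more than one WHOIS record.

    `records` maps a domain (or address block) to a dict of claimed details.
    Returns {(field, value): [names that share it]}.
    """
    index = defaultdict(set)
    for name, record in records.items():
        for field in fields:
            value = record.get(field, "").strip().lower()
            if value:
                index[(field, value)].add(name)
    return {key: sorted(names) for key, names in index.items() if len(names) > 1}


# Hypothetical records; every value here is invented.
records = {
    "example.com": {"email": "admin@example.com", "phone": "+1-555-0100"},
    "example.net": {"email": "someone@example.net", "phone": "+1-555-0199"},
    "example.org": {"email": "other@example.org", "phone": "+1-555-0100"},
}

print(shared_details(records))
```

A shared phone number or mailbox proves nothing by itself, since false details can be copied as easily as invented, but it tells you which records are worth reading side by side.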
[Update: shortly after this article ran, the FBI and the DEA apparently got all hand-wringy about the change from IPv4 addresses – which have this kind of WHOIS data – to the new IPv6 addresses – which don’t. Quoted in an article on CNET about this, an FBI supervisory special agent apparently suggested that “a new law may be necessary if the private sector doesn’t do enough voluntarily” – that is, if there aren’t changes to the way these kinds of records are kept. This will be the topic of a new post. No, not “Why the FBI apparently thinks you can legislate against stupid,” but one on how the American Registry for Internet Numbers (ARIN) and other registries work, what kind of information they have and why, and how you can’t believe anything you read in the WHOIS listing other than the IP address of where the record points. And we’re covering how that may or may not indicate the actual physical location a little bit here, some more in Part III, and some more in the new article. The point is, anyone who acts as if IPv4 WHOIS records are reliable sources of true information is misinformed. Dangerously.]
The basic tool for looking up WHOIS records is called, well, WHOIS. If you have a Linux machine, it’s probably built in to your terminal, but if not you can find it online at places like (and this is not an endorsement) whois.net. Some more information can be found through GEOIP services, which try to map IP addresses to a geographical latitude and longitude, but again, these can be misleading.
While the GEOIP service (such as Quova, GEOIPTOOL, MaxMind or IP2Location) will likely tell you where the IP address ties to a server, it will not give you any information as to whether that server is actually the origination point of the traffic you’re viewing. If it’s in Kiev, Ukraine, that’s great information – unless your actual source is in Oakland, CA, USA and using anonymizing proxies to appear to be coming in from Kiev.
The best way to look up IP information is through Team Cymru’s IP Tools Page. You’ll remember that Team Cymru is an absolutely outstanding source of OSINT – if you haven’t read about them before, read our piece on OSINT tools, or search our search box for “Cymru” for the mentions we’ve given them in the past.
A quick WHOIS lookup on the IP address I mentioned above tells us that the computer asking for that page on my website is supposed to be in India:
nick@zappy:~$ whois 126.96.36.199
% [whois.apnic.net node-2]
% Whois data copyright terms http://www.apnic.net/db/dbcopyright.html

inetnum: 188.8.131.52 - 184.108.40.206
netname: IIIP-568310-Indore
descr: IMPETUS INFOTECH INDIA PVT. LTD
descr: 10 C
descr: Ratlam kothi Geeta Bhavan
descr: SDH room
descr: Indore
descr: Madhya Pradesh
descr: India
descr: Contact Person: Anil Khandelwal
descr: Email: firstname.lastname@example.org
descr: Phone: 9893110872
country: IN
admin-c: NA40-AP
tech-c: NA40-AP
mnt-by: MAINT-IN-BBIL
mnt-irt: IRT-BHARTI-IN
status: ASSIGNED NON-PORTABLE
changed: email@example.com 20120224
source: APNIC

route: 220.127.116.11/24
descr: BHARTI-IN
descr: Bharti Tele-Ventures Limited
descr: Class A ISP in INDIA .
descr: 234 , OKHLA PHASE III ,
descr: NEW DELHI
descr: INDIA
country: IN
origin: AS9498
mnt-by: MAINT-IN-BBIL
changed: firstname.lastname@example.org 20050812
source: APNIC

person: Network Administrator
nic-hdl: NA40-AP
e-mail: email@example.com
address: Bharti Airtel Ltd.
address: ISP Division - Transport Network Group
address: Plot no.16 , Udyog Vihar , Phase -IV , Gurgaon - 122015 , Haryana , INDIA
address: Phase III, New Delhi-110020, INDIA
phone: +91-124-4222222
fax-no: +91-124-4244017
country: IN
mnt-by: MAINT-IN-BBIL
changed: firstname.lastname@example.org 20110307
source: APNIC
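WHOIS output like this is just blocks of “key: value” lines, so it is simple to break apart for comparison. The sketch below assumes that layout, with lines starting with % treated as server comments and blank lines separating the blocks; the sample text is a made-up excerpt with hypothetical values, not the real record above.

```python
def parse_whois(text):
    """Split WHOIS output into blocks, each a list of (key, value) pairs."""
    blocks, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("%"):
            continue  # comment line from the WHOIS server
        if not line:
            if current:  # a blank line closes the current block
                blocks.append(current)
                current = []
            continue
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            current.append((key.strip(), value.strip()))
    if current:
        blocks.append(current)
    return blocks


def values(block, key):
    """All values recorded under one key (keys such as descr repeat)."""
    return [v for k, v in block if k == key]


sample = """\
% [whois.apnic.net node-2]

inetnum: 203.0.113.0 - 203.0.113.255
netname: EXAMPLE-NET
descr: Example Company Pvt. Ltd
descr: Contact Person: A. N. Other
country: IN
source: APNIC

route: 203.0.113.0/24
descr: EXAMPLE-ISP
origin: AS64500
source: APNIC
"""

blocks = parse_whois(sample)
print(len(blocks))                   # 2
print(values(blocks[0], "descr"))    # the claimed registrant details
print(values(blocks[1], "origin"))   # the network announcing the route
```

Splitting on the first colon only is deliberate, since values such as “Contact Person: …” contain colons of their own.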
OK, so we know that that traffic is supposed to have come from India. Do we know anything else? We could ask some of the folks who do nothing but look for IP addresses that are tied to criminal activity, so let’s go over and ask WatchGuard Reputation Authority what they think of 18.104.22.168. We find that it’s in New Delhi, that it’s got a neutral listing – neither good nor bad – and that WatchGuard hasn’t seen any spam or viruses or other bad stuff.
Which is not an indication of anything other than what it is – that that IP comes back to that geographical location and that we don’t have any further context. Well, actually we do – we have the name and purported contact details of the registrant of that address range, and some other contextual information, like the name Bharti Tele-Ventures Limited, and some other stuff. More on that later.
The funny thing, though, is that the requested resource – in this case “Article 1810” – was referred to my site by google.co.in (you can see that in the log, and it’s consistent with the traffic coming in from India), and the search was for – wait for it – how to set up a proxy server using free tools.
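The referrer in a log entry is itself a URL, and you can take it apart with Python’s standard library rather than squinting at percent signs. The sketch below uses a trimmed copy of the referrer from the log entry above (I dropped the tracking fields to keep it short), so treat it as an illustration of the technique rather than a full analysis of that entry.

```python
from urllib.parse import urlsplit, parse_qs

# Trimmed copy of the referrer field; the tracking parameters are omitted.
referrer = (
    "http://www.google.co.in/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2"
    "&url=http%3A%2F%2Fnickselby.com%2Farticles%2Ftechnology%2Findex.htm%3Fa%3D1810"
)

parts = urlsplit(referrer)
# keep_blank_values=True keeps fields that are present but empty.
params = parse_qs(parts.query, keep_blank_values=True)

print(parts.hostname)        # the site that sent the visitor
print(params["url"][0])      # the page they were sent to, decoded
print(repr(params["q"][0]))  # the query field, exactly as passed along
```

The hostname gives you the referring site, and the `url` field decodes to the page that was requested. The `q` field is where search terms show up when the search engine passes them along. Like everything else in the request, the referrer is supplied by the client and can be forged.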
The article it got on my website, “Setting Up Squid, then Using It,” was written by me in 2007 and covers setting up proxy servers.
So we can see that the topic is of current interest to people.
If you look at the rest of the logfile above, you can see that there is some basic information claimed about what kind of browser and computer was making the request.
Parsing that, and capturing more and more information about computers visiting a site, is the subject of Part II.
Thanks to Mike Kearn for an early read and pre-edit!
 Although, you know, that would be a good post.