It has been over six months since Edward Snowden’s unprecedented NSA leaks, and we are still a long way from being able to assess the damage. Worldwide trust in United States tech companies has undoubtedly been shaken. Cisco Systems blamed a ten percent revenue drop on fallout from the leaks. Microsoft is offering foreign customers the option to have their data stored outside of the United States. And Silicon Valley stalwarts from Apple to Google to Yahoo have spent considerable resources defending themselves as each new embarrassing revelation becomes public. The trickle-down effect is even touching the small niche of digital forensics. Personal privacy has been central to the Snowden debate, and users today are more educated than ever about how their information is stored and transmitted. Web services companies are taking notice, and we have already seen some very useful artifacts disappear. I expect the trend to continue and would like to share a few examples.
On October 1, 2013, version 30 of Google Chrome was released. Absent from this release was one of the most distinctive browser artifacts available: History Index files. Prior to version 30, Chrome not only stored browser history, cache, and cookies but also recorded a full-text index of each visited page. Since page content can change, this was a wonderful forensic artifact for proving what existed on a given page when a user viewed it. Chrome version 30 not only stopped recording this information, it also deleted any existing History Index files from the user’s profile.
Secure Sockets Layer – Encryption as the Default
The Snowden revelations have provided the final push that the SSL movement needed to get over any remaining objections. It is now clear that the aggregation of even the most trivial metadata can prove embarrassing, or worse. On September 23, 2013, Google rolled out SSL encryption by default for all searches. This feature has been available as an option since 2010 and was already the default for Google account holders, but now every Google search is conducted via HTTPS. How will this affect host-based forensic examinations? Luckily, HTTPS connections and search terms are still recorded in browser history. It is a myth that browsers do not cache HTTPS content. In 2011, Firefox became the last of the major browsers to support the opt-out disk caching scheme using “Cache-Control: no-store” and “max-age” for content expiration. Google uses these options sparingly, and search information is still recorded on the host. Surprisingly, our biggest forensic losses with the switch to SSL are the valuable keywords collected via Google Analytics cookies.
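To make the opt-out caching scheme concrete, here is a minimal sketch of how an examiner might check whether a captured HTTPS response asked the browser not to store it. The function names are hypothetical, and the logic is deliberately simplified: it looks only at the “no-store” directive and ignores the other factors a real browser weighs when deciding what reaches the disk cache.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a dict of lowercase directives."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        # Directives without an argument (e.g. no-store) map to None.
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def forbids_disk_storage(cache_control_value):
    """Return True if the response opts out of storage via no-store."""
    return "no-store" in parse_cache_control(cache_control_value)
```

A response sent with “private, max-age=3600” would not be flagged by this check, which is consistent with HTTPS content still turning up in the cache unless the site explicitly opts out.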
Secure Sockets Layer and Google Analytics
Google Analytics is in use on over 15 million websites, and each one of those sites stores Google Analytics cookies on the computers used to visit them. The __utmz cookie for a domain stores the last visit time, total number of visits, how the site was discovered, and any keywords or search terms used. It is the last bit of information that disappears when SSL is in use: Google Analytics will not record search engine query information if the search was conducted via SSL. If you want this information, you had better hope that the user was using Bing*. Otherwise you will only see “not provided” listed in the utmctr field.
* Citing NSA activities, on Nov 18th, 2013, Yahoo announced its intention to encrypt all user traffic by the end of Q1 2014.
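As an illustration of what is lost, the sketch below splits a __utmz value into its parts. It assumes the layout commonly described in forensic write-ups: four dot-separated fields followed by pipe-delimited campaign pairs such as utmcsr, utmcmd, and utmctr. The function name, the output key names, and the sample value in the usage note are hypothetical; verify the layout against your own test data before relying on it.

```python
from datetime import datetime, timezone
from urllib.parse import unquote


def parse_utmz(value):
    """Parse a __utmz cookie value (assumed layout, see lead-in)."""
    # Assumed layout: hash.timestamp.sessions.campaigns.key=val|key=val...
    # maxsplit=4 keeps dots inside the campaign data (e.g. a referrer host).
    domain_hash, timestamp, sessions, campaigns, campaign_data = value.split(".", 4)
    fields = {}
    for pair in campaign_data.split("|"):
        key, _, val = pair.partition("=")
        fields[key] = unquote(val)
    return {
        "domain_hash": domain_hash,
        "timestamp": datetime.fromtimestamp(int(timestamp), tz=timezone.utc),
        "session_count": int(sessions),
        "campaign_count": int(campaigns),
        "source": fields.get("utmcsr"),
        "medium": fields.get("utmcmd"),
        "keyword": fields.get("utmctr"),
    }
```

For a made-up value like “173272373.1380585600.5.2.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided)”, the keyword comes back as “(not provided)”, which is all an examiner gets when the search was conducted over SSL.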
“Globally, 56% of those surveyed by GlobalWebIndex reported that they felt the internet is eroding their personal privacy, with an estimated 415 million people or 28% of the online population using tools to disguise their identity or location.” –TheGuardian
Paradoxically, common knowledge of the NSA’s ability to defeat many forms of encryption is only encouraging more prevalent personal encryption. VPN usage, Off-the-Record (OTR) chat, and full disk encryption usage are exploding. The current version of TrueCrypt is downloaded 1700 more times per day on average than previous versions. Visits to the TAILS (The Amnesic Incognito Live System) anonymity-centric LiveCD download site grew exponentially in the months following the initial NSA leaks. The open source anti-forensics tool BleachBit saw downloads double in the six months following the initial leaks. Kali Linux took full disk encryption a step further with a patch to add an optional “Nuke” password that destroys data on the disk instead of decrypting it. More than ever, first responders need to update their processes to include encryption checks, volatile data collection, and live volume imaging.
What is Next?
It is clear that the fallout from the Snowden leaks is far from over. There has been a fundamental change in how individuals and the companies selling services to those individuals are starting to view personal information. Here are a few more changes that could be on the horizon.
X-Originating-IP has long been an excellent tool for identifying the IP address (and possible location) of the client sending an email. It is an optional field in the message header, and consequently, adoption varies. Gmail and Google Apps rarely include worthwhile information in the field. Microsoft Hotmail replaced it with an obfuscated version, X-EIP, at the end of 2012. Messages sent with Microsoft Office 365 include X-Originating-IP data, but the account administrator can turn it off. Of the large personal webmail providers, only Yahoo continues to regularly include client IP addresses, though they are simply placed in the first “Received:” field. A good guess is that this artifact will also soon disappear.
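While the artifact lasts, pulling it from a raw message is straightforward. The sketch below uses Python’s standard email module to collect an address from X-Originating-IP and from the earliest “Received:” hop. The function name and output keys are hypothetical, only IPv4 is handled, and it assumes the “first” Received field means the earliest hop, which is the bottom-most Received header because each server prepends its own.

```python
import re
from email import message_from_string

# Loose IPv4 pattern; it does not validate octet ranges.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def client_ip_candidates(raw_message):
    """Return candidate client IPs found in the message headers."""
    msg = message_from_string(raw_message)
    found = {}

    xoip = msg.get("X-Originating-IP")
    if xoip:
        match = IPV4.search(xoip)
        if match:
            found["x_originating_ip"] = match.group(0)

    received = msg.get_all("Received") or []
    if received:
        # Headers are prepended per hop, so the last one listed is the earliest.
        match = IPV4.search(received[-1])
        if match:
            found["earliest_received_ip"] = match.group(0)

    return found
```

Treat the result as a lead rather than proof: both headers are set by the sending side and can be absent, obfuscated, or forged.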
The End of Search Parameters?
Even with SSL enabled by default, the browser history and cache dutifully record search parameters entered into your favorite search engine.
While URL parameters and HTTP GET requests have many advantages for a company like Google (web analytics being one), they are not required to get the job done. The Duck Duck Go search engine already has an option to turn off referrer information and hide queries from browser history by using HTTP POST commands. Traffic to Duck Duck Go has more than doubled since the leaks were made public. Will Google or Yahoo be next?
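For now, recovering search terms from a history URL takes only a few lines. This sketch uses Python’s urllib.parse; the host-to-parameter table is a small assumed sample (“q” for Google, Bing, and Duck Duck Go, “p” for Yahoo), and the function name is hypothetical. It also checks the URL fragment, since Google has at times carried the query after “#q=”.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed sample mapping of host substrings to their query parameter.
QUERY_PARAMS = {
    "google.": "q",
    "bing.": "q",
    "search.yahoo.": "p",
    "duckduckgo.": "q",
}


def search_terms(url):
    """Return the search string embedded in a history URL, or None."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    for marker, param in QUERY_PARAMS.items():
        if marker in host:
            # Look in the query string first, then in the fragment.
            values = (parse_qs(parts.query).get(param)
                      or parse_qs(parts.fragment).get(param))
            return values[0] if values else None
    return None
```

A search submitted via HTTP POST leaves nothing in the URL for this function to find, which is exactly why a wider move away from GET parameters would hurt.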
Encrypted Cookie and Web Storage Data?
The Washington Post reported on the wholesale collection of cookie data in December 2013, and in January 2014, The Guardian reported on ‘leaky’ phone applications targeted to collect personal information. While the Guardian story focused on network data collection, private user information is also stored locally. Cookies (up to 4KB of data) and HTML5 Web Storage (up to 10MB per domain) are local storage constructs designed to be a scratchpad for websites. There are few restrictions on the content that can be stored; hence, some sites continue to use them to store sensitive data in plaintext. As developers rush to identify “leaks” in their applications, web storage will inevitably become more encrypted, encoded, and obfuscated.
Like every year, 2014 has plenty of surprises in store for digital forensics. With Windows 9, OS X 10.10, and Android 5.0 on the horizon, I have no doubt that the number of new artifacts discovered will continue to outpace those that are lost. Happy hunting!