E-Discovery, Forensics 101, Methodology

What are ‘gdocs’? Google Drive Data – part 2

Following up on the recent post on Google Drive, which gave a high-level introduction to the product, this post delves deeper into the technical issues relating to the data stored and the best approach to accessing it.

The artefacts discussed in this post are based on Windows 7; however, Apple Mac operating systems retain similar data in plists (property lists).

By default, data from a user’s Google Drive is stored at C:\Users\USERNAME\Google Drive. In addition to this, further artefacts are stored elsewhere on the user’s PC, both in the Windows registry and in the user profile.

If we inspect the following location in the Windows registry, we can learn a lot more about a particular Google Drive setup on a PC, and we can also confirm that Google Drive is indeed installed on that PC, by virtue of this key:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Installer\UserData\S-1-5-18\Products\227C12A7952F67947BAA66855EDFDEFA\InstallProperties

Within this key we can gather a range of information including when Google Drive was first installed, which is a simple date value in the format YYYYMMDD. In addition to this are version numbers and display names.
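
As a minimal sketch of handling that date value, the Python below converts a YYYYMMDD string into a proper date. The function name and the sample value are hypothetical and purely illustrative; I am also assuming the value has already been read or exported from the registry as plain text.

```python
from datetime import date, datetime


def parse_install_date(value: str) -> date:
    """Convert a YYYYMMDD string (for example '20130115') into a date."""
    # strptime raises ValueError if the text is not a valid YYYYMMDD date,
    # which is a useful sanity check on the exported value.
    return datetime.strptime(value.strip(), "%Y%m%d").date()


# Hypothetical sample value, not taken from a real installation.
print(parse_install_date("20130115").isoformat())  # prints 2013-01-15
```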

As you would expect, there is an entry at HKEY_CURRENT_USER\Software\Google\Drive, but there is little stored here.

Staying within the Windows registry, by examining the ‘Run Key’ (HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Run) we can confirm whether Google Drive is set to run automatically on startup, which is the default.

The first registry entry I spoke of in this post (HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Installer\UserData\S-1-5-18\Products\227C12A7952F67947BAA66855EDFDEFA\InstallProperties) contains a long string of characters (GUID) and does not actually mention or refer to Google. In my testing I have found that the GUID 227C12A7952F67947BAA66855EDFDEFA is consistent across all Google Drive installations on Windows; therefore, searching for this GUID should identify the location of Google Drive data in the registry.

Stepping out of the registry, a great deal of data can be found within the profile of a user with Google Drive installed, in addition to the Google Drive files themselves.

If the path C:\Users\USERNAME\AppData\Local\Google\Drive exists, a few SQLite databases and further settings files can be inspected.

First we have a file called ‘pid’. Inside this file is a number, which is the Windows process ID of the Google Drive application. However, the really interesting data is within the SQLite database files here.

The smaller of the two databases is ‘sync_config.db’, which contains, amongst other information, the registered Google Drive account/email address and the location of the Google Drive files, which by default is C:\Users\USERNAME\Google Drive.
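
A first look at either database can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name is hypothetical, it should be pointed at a forensic copy rather than the live file, and it assumes nothing about the table names, simply listing whatever tables the file contains along with their row counts. The file is opened read-only so that the inspection itself does not alter it.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return {table_name: row_count} for a SQLite file opened read-only."""
    # A file: URI with mode=ro asks SQLite to refuse any writes.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        # Table names come from the database itself, so quote them.
        return {name: conn.execute(
                    'SELECT COUNT(*) FROM "{}"'.format(name.replace('"', '""'))
                ).fetchone()[0]
                for name in names}
    finally:
        conn.close()
```

Calling list_tables on a working copy of ‘sync_config.db’ or ‘snapshot.db’ would give a quick overview before any individual table is queried.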

The larger database is ‘snapshot.db’ and contained within it are several tables holding very valuable information. Each file currently stored and not deleted from Google Drive has corresponding entries in the ‘snapshot.db’ database. These entries detail creation and modification dates in unix epoch format (number of seconds elapsed since midnight (UTC) on 1st January 1970).
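
Converting such a value into a readable UTC timestamp is straightforward. The sketch below assumes the value is a whole number of seconds, as described above; the function name and the sample value are hypothetical.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Convert Unix epoch seconds into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(int(seconds), tz=timezone.utc)


# Hypothetical sample value, not taken from a real database.
print(epoch_to_utc(1358244000).isoformat())  # prints 2013-01-15T10:00:00+00:00
```

Keeping the result in UTC avoids accidentally reporting times in the examiner's local time zone.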

These entries also record the file names and the links to the files within Google Drive’s web store; accessing those links requires the account username and password. Other database entries include a file type, which is recorded as a number rather than as the name of the file type. There is also an MD5 hash value, which I believe Google uses to check for differences in the data during a sync.
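
If that belief about the MD5 value is correct, the hash of a locally synchronised file could be compared against the stored value. The sketch below shows one way to compute the MD5 of a file in Python; the function name is hypothetical, and whether the stored value is a plain MD5 of the file content is an assumption I have not confirmed. It would not be expected to match for native Google file types, which are only placeholders locally.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use low for large files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```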

If data is deleted from a user’s Google Drive, some interesting things happen. Deleting a file from the local Google Drive folder (C:\Users\USERNAME\Google Drive) makes no change to the ‘snapshot.db’ content unless the Google Drive application is running. The entry for the deleted file therefore remains in the database until the Google Drive application is next running and synchronised, at which point the entry is deleted from the database. Even then, the information is still potentially retrievable: I have had success recovering these deleted entries from unallocated space.

Deletion of data via the Google Drive web interface is different again. Much as in a Windows operating system (or Mac and others for that matter), when you delete a file it moves to the ‘Bin’ within the Google web interface and remains there until it is either restored or permanently removed.

As soon as a file is deleted and moved to the ‘Bin’, and this action is synchronised with the local installation of Google Drive, the entry for that file is removed from ‘snapshot.db’.

Switching back to the web interface, despite the deleted file being in the ‘Bin’, users can still work on the file via the web interface and can also restore it as a live file. Throughout these actions the revision history is not lost.

Once the file is restored and Google Drive is synchronised with a local Google Drive client again, the entry for that file is added back to ‘snapshot.db’, complete with the original metadata and, importantly, the original creation date.

It is important to highlight that the metadata stored within ‘snapshot.db’ is by far the most accurate and reliable source of metadata for these files.

In contrast, the metadata of the physical files stored in Google Drive accessed by a right-click and properties action is unreliable.

Take the scenario outlined previously, where we have deleted a file and then restored it: the creation date shown in the Windows properties will be the date the file was synchronised back from Google Drive. ‘snapshot.db’, however, shows the true creation date, which is when the file was first created, before the delete and restore actions.

We know that the presence of Google Drive has the potential to assist our eDiscovery and forensic work, provided there is a solid understanding of how it operates and how the data can be captured and interrogated.

What are the practicalities and considerations when working with such data? In my first post I highlighted the need to understand Google Drive data and that native Google file types are little more than placeholders to file content stored within Google Drive servers.

What one also needs to appreciate is the importance of the structured databases generated by Google Drive. It is from this structured data that we can recover a host of information about the files within Google Drive – the true metadata, if you will.

In terms of approach: because Google Drive can be accessed via the Internet, it is essential that in forensic and eDiscovery matters one considers both securing and isolating access to such data immediately. This should include removing network connectivity from computers and/or mobile devices to prevent further synchronisation of the data. If this is not done and an individual deletes data via the Google web interface, those changes will be replicated to the computers and/or mobile devices.

One should also revisit the point that an individual can, in theory, permanently delete data from Google Drive via the web interface. As a result, that data must be secured quickly and, where possible, a legal hold put in place to prevent it being deleted. You cannot afford to wait or ignore this issue, and should try whenever possible to collect data from Google Drive immediately (unless it is not considered to be within scope).

The presence of Google Drive placeholders on computers and/or mobile devices means there is further work to do in terms of data capture, and this work must be performed via the web interface. The Google Drive web interface allows users to download any file, or a collection of files, locally in a variety of different formats.

One should endeavour to download such data in as close to native format as possible, for example a gdoc file as a docx file.

There are several issues to consider when downloading data directly from Google Drive’s web interface, including:

  1. the question of gaining access with the username and password,
  2. jurisdictional considerations,
  3. the potential for loss of metadata.

As a consequence, best practice is to capture and preserve both the Google Drive files AND the Google Drive structured databases, so as to give as full a picture as possible.

Make sure you keep an eye on the Millnet blog for further updates.

Discussion

2 thoughts on “What are ‘gdocs’? Google Drive Data – part 2”

  1. Thanks for the great article. There is a lot of great info here. I did want to tell you that the GUID you refer to in the installation key, HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Installer\UserData\S-1-5-18\Products\227C12A7952F67947BAA66855EDFDEFA\InstallProperties, does not appear to be consistent across all installations. I installed Google Drive on my test system (Windows 7) and found the key at the same path with the exception of the GUID. I did a quick search for “google” to find the right GUID entry on my system and it came right up.

    Posted by Alfonso Salgado | May 23, 2013, 9:34 pm
