Analyzing Exchange and mbox e-mail files using Free and Open Source Software

First published December 2005

Mike Harrington, CFCE EnCE
linuxchimp@gmail.com
Innovative Digital Forensic Solutions, L.L.C.

Mark Lachniet, CISA CISSP
mlachniet@analysts.com
Analysts International

Table of Contents

1. Document Overview
2. LIBPST/LIBDBX
3. Locating Exchange .dbx/.pst Files
3.2 Locating files in the filesystem
3.2.1 Deleted Files
3.2.2 Allocated Files
3.3 Exporting from Exchange
4. Converting .dbx/.pst files
5. Viewing decoded .dbx/.pst files with Thunderbird
6. Converting to HTML with MHONARC
7. Bonus Ideas
7.1 Converting Eudora e-mail
7.2 Converting UNIX e-mail
7.3 Importing mbox into other e-mail clients
7.4 Using uudeview to extract attachments
7.5 Carving for .eml and using eml2mbox for conversion
8. Summary


1. Document Overview

E-mail is everywhere, and the digital forensic examiner is often faced with the task of searching e-mail for evidence of wrongdoing. This paper outlines a simple methodology for using free and open source tools to convert Microsoft Outlook or Outlook Express files into a flat mbox format that can then be manually imported into the Mozilla Thunderbird e-mail client for viewing, or manipulated using other useful scripts. This document is really just a primer for basic e-mail analysis, and is intended to be a living document. If you have any questions, comments or suggestions (including sections that you think should be added!) please contact the authors directly.

The paper is divided into several sections. Section two details installing Libpst and Libdbx, which are used to convert the Outlook and Outlook Express files. Section three deals with finding the .dbx or .pst e-mail files. Section four details converting the found .dbx or .pst files into the flat mbox format using the readdbx or readpst tools that were compiled in section two. The fifth section covers how to import these converted files into the Mozilla Thunderbird e-mail client for viewing. The sixth section discusses how to parse mbox files into threaded HTML documents and extract attachments for easy searching and manipulation. The seventh section discusses other useful tools and tricks that could be of use to the examiner, and the eighth summarizes.

Throughout this paper the examples we will be using are based on my forensic laptop, an AMD64 machine running Gentoo with an x86_64 2.6.12 kernel (see note 1 at the end of the paper). The examples should work exactly the same for x86 based machines or other UNIX-type systems in general.

2. LIBPST/LIBDBX

The readdbx and readpst executables are created from the Libdbx and Libpst source code respectively. You can find the source for both at the following site.

http://sourceforge.net/project/showfiles.php?group_id=18756&release_id=117314

(Of course, using Gentoo one only needs to use the commands 'emerge libdbx' or 'emerge libpst'…;-)

Once you’ve downloaded the source to a download location of your choice (in this case I’ve downloaded the source to ‘/usr/local/forensicapps’) you need to untar and unzip the archives.

chimp forensicapps# tar xvzf libdbx_1.0.3.tgz
chimp forensicapps# tar xvzf libpst_0.3.4.tgz

Then change into the directory for libdbx.

chimp forensicapps# cd libdbx_1.0.3
chimp libdbx_1.0.3# make

You should now have a file called readdbx in this directory. Make sure it’s executable by issuing the following command:

chimp libdbx_1.0.3# chmod +x readdbx

Now move the executable to a directory in your path such as /usr/local/bin.

chimp libdbx_1.0.3# mv ./readdbx /usr/local/bin

Repeat the preceding steps (untarring/unzipping and compiling) for readpst. You will then have a file named readpst that you can make executable by the same method described above. Move this into a directory in your path as well.

That’s it! You can now move onto the next section which details the .dbx and .pst files that you want to convert.
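Before moving on, it can be worth confirming that both converters are actually reachable from your shell. The following is only a minimal sketch: the function name check_tools is our own invention for illustration, and it assumes nothing beyond the binaries being named readdbx and readpst.

```shell
# check_tools: report which of the named programs can be found in PATH.
# Exits with status 0 if all are found, 1 otherwise.
# (check_tools is a hypothetical helper name, not part of libdbx or libpst.)
check_tools() (
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    exit "$missing"
)

# Example:
#   check_tools readdbx readpst
```

If either tool is reported missing, check that the directory you moved it to really is listed in your PATH.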

3. Locating Exchange .dbx/.pst Files

The next required step is to find the e-mail files (mailboxes) that you want to analyze. To do this, you can either find a copy on the client workstation, or export them from an Exchange server.

3.2 Locating files in the filesystem

3.2.1 Deleted Files

First of all, you should determine whether or not there may be copies of e-mail DBX and PST files in the deleted and slack portions of the file system. You may wish to use an automated forensic program such as SMART (http://www.asrdata.com/tools/) to see if it is possible to recover any older, deleted files. SMART can also be used to extract pure unallocated data for you to concentrate on exclusively. Remember deleted files may contain the “smoking gun” you are looking for!

The method we’ll cover here uses the Foremost carving tool (http://foremost.sourceforge.net). Since we can’t assume that everyone is using a distro with a decent package management tool (Gentoo anyone?) let’s grab the source and compile it ourselves (remember to check the md5sum of the download).
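The md5sum check mentioned above might look something like the following sketch. The helper name verify_md5 is hypothetical, the expected hash has to come from the project's download page (we do not reproduce one here), and the md5sum program from GNU coreutils is assumed to be installed.

```shell
# verify_md5 FILE EXPECTED_HASH
# Succeeds only if the MD5 of FILE matches the published hash.
# (verify_md5 is a hypothetical helper name used for illustration.)
verify_md5() (
    file=$1
    # Published hashes are sometimes upper case; normalise to lower case
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    # md5sum prints "<hash>  <filename>"; keep only the first field
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        exit 1
    fi
)

# Example (paste the hash published on the download page):
#   verify_md5 foremost-069.tar.gz <published-md5>
```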

Now with the source downloaded let’s extract it (I’ve downloaded the source to my temp directory).

chimp temp# tar xvzf foremost-069.tar.gz
chimp temp# cd foremost-069
chimp foremost-069# less README
chimp foremost-069# make && make install

The first two commands extract the gzipped tar archive and change into the source directory. Reading the README file will tell you how to compile and install (‘make && make install’). One thing to note is that the foremost.conf file, which contains the header and footer information for the file types you want to carve, needs to be in the directory you run foremost from.

Take a peek inside the foremost.conf file to see how it’s formatted and what types of files are already supported. For our purposes simply open up foremost.conf in a text editor and uncomment (erase the ‘#’ at the beginning of the line) the .dbx (or .mbx, .pst) line. These lines are located in the Microsoft Office section.

chimp bin# nano -w foremost.conf

Now with that done you need to run foremost over your image files. Foremost requires an empty directory to dump files it finds. It also keeps an audit of the files it finds and the offset in the image file where they were found.

What if you have multiple image segments? No worries mate! One of the cool things that foremost can do is create output directories on the fly…so let’s just write a script to take care of our multiple segments.

First make the initial output directory (you could script this as well..;-))

chimp evid# mkdir carvdbx

Now the script:

#!/bin/bash
# Counter used to give each segment its own output directory
x=0
# Loop through every image segment in the image directory
for i in /your/image/dir/*
do
    # Carve with verbose output, writing to a new directory per segment
    foremost -v -o "/your/output/dir$x" "$i"
    # Increment the value of 'x' by one
    x=$((x + 1))
done

With the files carved proceed on…

3.2.2 Allocated Files

On a Windows XP box, .dbx files are most commonly found in the following path.

C:\Documents and Settings\\Local Settings\Application Data\Identities\{GUID}\Microsoft\Outlook Express

Common .dbx files you might see in this location include Inbox.dbx, Sent Items.dbx and Drafts.dbx. There might be others as well. Simply copy these files out to a directory on your forensic drive (in my example my suspect NTFS partition is mounted read only at ‘/mnt/win’). In the command below, $USER and {GUID} stand for the suspect’s Windows user name and identity GUID.

chimp ~# cp /mnt/win/"Documents and Settings"/$USER/"Local Settings"/"Application Data"/Identities/{GUID}/Microsoft/"Outlook Express"/*.dbx /mnt/evidence/e-mail/dbx/

If you want to make sure you’re not missing any .dbx files you can use the find command to locate the .dbx files and copy them over to your forensic directory.

chimp ~# find /mnt/win -type f -name "*.dbx" -print -exec cp '{}' /mnt/evidence/e-mail/dbx \;

Passing the ‘-print’ parameter to the find command gives you a nice output of what is being found and copied over. Omit this to suppress the output.
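One caveat with the find/cp approach is that several Windows users may each have an Inbox.dbx, and a plain cp into a single directory silently overwrites the earlier copies. The sketch below is one possible way around that, not the authors' original method: it prefixes each copy with a counter and logs an MD5 of every source file. The function name collect_files and the manifest.txt file name are hypothetical, md5sum is assumed to be installed, and file names containing newlines are not handled.

```shell
# collect_files SRC_DIR PATTERN DEST_DIR
# Copies every file under SRC_DIR whose name matches PATTERN into DEST_DIR,
# prefixing each copy with a counter so same-named files do not overwrite
# each other, and appends "<md5>  <original path>" to DEST_DIR/manifest.txt.
# (collect_files and manifest.txt are hypothetical names.)
collect_files() (
    src=$1
    pattern=$2
    dest=$3
    mkdir -p "$dest"
    n=0
    find "$src" -type f -name "$pattern" | while IFS= read -r f; do
        n=$((n + 1))
        # -p preserves the original timestamps on the copy
        cp -p "$f" "$dest/$(printf '%04d' "$n")_$(basename "$f")"
        md5sum "$f" >> "$dest/manifest.txt"
    done
)

# Example:
#   collect_files /mnt/win "*.dbx" /mnt/evidence/e-mail/dbx
```

Note that -name is case sensitive, so files named INBOX.DBX would need a second pass with a different pattern.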

The procedure for finding .pst files is exactly the same. The default location on a Windows XP box for .pst files is the following path.

C:\Documents and Settings\$USER\Local Settings\Application Data\Microsoft\Outlook\

Got all that? Good. Now we can progress onto the next section where we detail how to convert our newly found files into a flat mbox format that will be easily imported into the Thunderbird e-mail client.

3.3 Exporting from Exchange

In the event that you don’t have access to a user’s workstation, but do have administrator access to the Exchange server, you may be able to export a user’s data to a PST file using the ExMerge program. To download this file, refer to:

http://www.microsoft.com/downloads/details.aspx?displaylang=en&familyid=429163ec-dcdf-47dc-96da-1c12d67327d5

According to the documentation contained in this download, “You can use the program to extract data from one or more Exchange mailboxes into .pst files”. You may wish to run this program if you have to recover some very old data, perhaps as part of a legal discovery process. For example, if all that exists within an organization are backup tapes, you may have to build up a server, restore from tape, and then use the ExMerge program to extract that user’s old e-mail spool to a PST file for analysis.

4. Converting .dbx/.pst files

OK, so you’ve found your files and copied them over to the forensic directory of your choice. It’s now time to convert those bad boys into a flat mbox format that will be easily imported into the Mozilla Thunderbird e-mail client or parsed with handy tools.

First change into the directory you copied the files into.

chimp ~# cd /mnt/evidence/e-mail/dbx

Now make a directory for your decoded .dbx files.

chimp dbx# mkdir ../decoded

After doing this it’s time to convert the files into our mbox format. We accomplish this with a little for loop in our /mnt/evidence/e-mail/dbx directory.

chimp dbx# for X in *.dbx; do /$pathto/readdbx -f "$X" -o /$pathof/forensic/directory/"$X.$$"; done

Make sure to put the path to your evidence and forensic directory in the above. The ‘.$$’ appends the process ID of the current shell to the file name (not strictly needed, but I put it there to identify the decoded files). Now you should have the decoded files in your forensic directory. If readdbx or readpst reported errors while decoding, check whether the decoded files are empty, and if so double check whether the original files are empty as well.
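Checking for empty output by hand gets tedious with more than a handful of mailboxes. As a rough sketch, something like the following lists the zero-byte files for you; the function name list_empty_outputs is made up for this example.

```shell
# list_empty_outputs DIR
# Prints every zero-byte regular file under DIR. These are the decoded
# files most likely to come from a failed (or genuinely empty) conversion.
# (list_empty_outputs is a hypothetical helper name.)
list_empty_outputs() {
    find "$1" -type f -size 0 -print
}

# Example:
#   list_empty_outputs /mnt/evidence/e-mail/decoded
```

Any file it reports is worth comparing against the size of the original .dbx or .pst file it was decoded from.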

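The procedure for decoding .pst files is similar to the above. The only real change we need to make is to put the output option before the .pst file, as is shown below. Note that readpst treats the -o argument as an output directory, which must already exist, so the loop creates it first.

chimp dbx# for X in *.pst; do mkdir -p /$pathto/forensicdir/"$X.$$" && /$pathto/readpst -o /$pathto/forensicdir/"$X.$$" "$X"; done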

Sweet! Now we are all decoded and ready to move onto other tools.

5. Viewing decoded .dbx/.pst files with Thunderbird

Okay, you successfully decoded the .dbx/.pst files that you are interested in, and now you want to actually view them…so how do we do that? Read on my friend…

This section assumes that you have Mozilla Thunderbird (my e-mail client of choice) installed on your system. It is beyond the scope of this paper to help you install Thunderbird for your particular system but it should be incredibly easy. You should be able to import these decoded files into the e-mail client of your choice (in fact I tested this with Evolution and it works, and obviously the mail client in Mozilla is the same as Thunderbird).

A little side note: a good habit to get into is reinstalling a fresh copy of the OS on your forensic machine for every case you work. This assures that you have no cross contamination of evidence. At the very least, do a fresh install of your e-mail client.

To view the decoded mail files in Thunderbird we need to do a little prep work. Fire up Thunderbird and create a new email account that is going to be used to track your suspect mail.

Enter in a bogus SMTP and POP server etc and name the account in a way that will make it easy for you to organize; something like…”Suspect Mail”. It is also important to uncheck the “Use Global Inbox” and the “Download Messages Now” options.

The account name should show up in Thunderbird with the default complement of sub-folders underneath it.

Then simply copy the decoded file into the “Inbox” directory of your new Thunderbird account.

chimp ~# cp -v /$pathto/decoded/files/inbox /$pathto/thunderbird/mail/inbox

Now fire up Thunderbird and the files you want to view should appear as a “folder” where you copied them. If the converted file was non-empty, the folder you copied it into should contain one or more e-mails.
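A quick sanity check is to compare the number of messages Thunderbird shows with the number of messages in the mbox file itself. Since each message in an mbox file starts with a separator line beginning with “From ” (note the space), a rough count can be had as in the sketch below. The helper name count_mbox_messages is made up for illustration, and the count is only approximate: a body line that happens to start with “From ” and was never escaped will be counted as well.

```shell
# count_mbox_messages FILE
# Counts the lines that begin with "From " (the mbox message separator),
# which approximates the number of messages in FILE.
# (count_mbox_messages is a hypothetical helper name.)
count_mbox_messages() {
    grep -c '^From ' "$1"
}

# Example:
#   count_mbox_messages /mnt/evidence/e-mail/decoded/inbox
```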

Something I have found helpful in organizing my converted and imported suspect mail is to go into the Thunderbird directory and make directories that will delineate it as the suspect’s Inbox, Deleted, etc. mail.

If you are using Evolution you need to select “File” from the menu and then import. From there select auto for the import format and where you want to import the file.

6. Converting to HTML with MHONARC

Once you have your mbox format files, you may want to archive them in an easily searchable format, or strip off attachments in one fell swoop. One handy way of doing this is to use the MHONARC program from: http://www.mhonarc.org/ This is also a very handy way to archive your *own* old e-mail so you can get your hands on old addresses, attachments, etc. without clogging up your e-mail client with gigabytes of data. Just remember to back up your mbox format files every time you upgrade a server or something and you should be fine. I personally have years’ worth of my own mbox files backed up this way, and it’s very handy.

Download and install the package, and read the internal instructions. In particular, you may choose to write a script to do all the conversion and so on. My script looks like the following:

#!/bin/sh -f
#
./MHonArc-2.6.10/mhonarc yourfile.mbx -add -attachmentdir /path/to/attachments \
  -folrefs -idxfname index.html -main -multipg -outdir /path/to/htmlemail -reverse

This script will open up your file ‘yourfile.mbx’ which is your mbox formatted file, and then copy all the attachments to /path/to/attachments and all the e-mails themselves in a threaded format to /path/to/htmlemail.

At this point, you can open up either the threaded or date-sorted HTML index files, or you can grep for interesting information using a command such as

chimp dbx# grep badstuff /path/to/htmlemail/*

to find all e-mails with the word ‘badstuff’ in them. You should be aware of case sensitivity for your particular grep program, and obviously also consider the types of keywords that are likely to match such as p0rn, pr0n, etc. Finding a simple e-mail address, for example to cut out conversations with a particular person, is a piece of cake.
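To deal with both the case sensitivity and the spelling variants in one pass, one option is to keep the keywords in a file, one per line, and let grep do a case-insensitive search for all of them. This is only a sketch: the function name search_keywords and the keyword file are our own inventions, and it assumes a grep that supports recursive searching with -r (GNU and BSD grep both do).

```shell
# search_keywords DIR KEYWORD_FILE
# Lists the files under DIR that contain any line of KEYWORD_FILE, matched
# as a fixed string and ignoring case.
#   -r  search DIR recursively      -i  ignore case
#   -l  print matching file names   -F  treat keywords as fixed strings
#   -f  read the keywords from a file
# (search_keywords is a hypothetical helper name.)
search_keywords() {
    grep -r -i -l -F -f "$2" "$1"
}

# Example:
#   search_keywords /path/to/htmlemail keywords.txt
```

Make sure the keyword file contains no blank lines, since an empty pattern matches every line of every file.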

7. Bonus Ideas

Here are some bonus ideas and tools. Suggest some more!

7.1 Converting Eudora e-mail

There is a nice script to handle Eudora Mail. It is available at the following site http://www.xs4all.nl/~maryniak/eudora2unix/.

7.2 Converting UNIX e-mail

Hey, you say, I’ve got UNIX e-mail, how do I analyze it? Well, luckily for you it’s already in mbox format, so you don’t have to do anything at all. Just look for mail spool files. These are sometimes stored in directories such as /var/spool/mail, /var/mail, etc. You’ll also frequently find mail in temporary and queue directories. For example, if you have a bunch of e-mail that couldn’t get delivered (perhaps you were an open mail relay and wanted to see what kind of Viagra or whatnot you were relaying) you may find messages awaiting delivery in /var/spool/mqueue or a similar directory. Depending on the mail server, queued messages may be stored as separate header and body files rather than as a single mbox file.

7.3 Importing mbox into other e-mail clients

Say you have an mbox file, and you want to import it into a different e-mail client than Thunderbird and this program doesn’t allow importing, but does work as a POP3 client. Fortunately, you can easily do this as long as you have a UNIX mail server to do it with. All you need to do is make an account on the server, copy the mbox file over that user’s e-mail file, usually in /var/mail or /var/spool/mail and then use a POP3 client to download the mail. All the email will be downloaded to your client as if it were brand new.

7.4 Using uudeview to extract attachments

Say you are a glutton for punishment, and you really really want to extract attachments from the ASCII MIME-encoded text in your mbox file. You can do this. First just cut the e-mail out using a text editor, starting from the message’s “From ” separator line (the line beginning with “From” followed by a space) and ending just before the next one, and save it as plain text. Then download uudeview.exe from: http://www.fpx.de/fp/Software/UUDeview/. Then simply run the program on the text file – it will find the MIME-encoded sections, convert them to binary and dump them on the filesystem. You’ll want to suggest this option if the opposing attorney wants to verify your work, since it is very easy to explain, and makes them work a lot harder. This is a very handy way to cut naughty pictures out of e-mail so you can insert them into your report.
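If cutting messages out by hand sounds like too much punishment, the same split can be roughed out with awk. The sketch below is our own illustration rather than part of uudeview: split_mbox is a hypothetical name, and it assumes that every message starts with a line beginning with “From ” (with a trailing space), so an unescaped body line starting the same way would cause a spurious split.

```shell
# split_mbox MBOX OUT_DIR
# Writes each message of MBOX to OUT_DIR/msgNNNN.txt, starting a new file
# at every line that begins with "From " (the mbox message separator).
# (split_mbox is a hypothetical helper name.)
split_mbox() (
    mbox=$1
    out=$2
    mkdir -p "$out"
    awk -v dir="$out" '
        /^From / {
            # Close the previous message so we never hold many files open
            if (file != "") close(file)
            n++
            file = sprintf("%s/msg%04d.txt", dir, n)
        }
        # Skip anything that appears before the first separator line
        file != "" { print > file }
    ' "$mbox"
)

# Example:
#   split_mbox /mnt/evidence/e-mail/decoded/inbox /mnt/evidence/e-mail/split
```

Each resulting text file can then be handed to uudeview one at a time.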

7.5 Carving for .eml and using eml2mbox for conversion

Using the techniques described above (and after figuring out the header/footer) you could have foremost carve out .eml files (an extension used by some e-mail clients, including Outlook Express, for mail) and use the eml2mbox.rb program available at this site http://www.broobles.com/eml2mbox/ to convert them to mbox format. This program needs the Ruby interpreter to be on your system. This should be installed on many Linux distributions by default and is easily obtainable on others (remember emerge?).

The website has all the documentation on how to run the script.

8. Summary

This article showed you various ways to convert mail files to mbox format and parse them using free and open source tools. The article should cover the most common forms of Windows based e-mail clients encountered by the forensic examiner, but is only a basic primer. Beyond the scope of this article is web-based e-mail and more advanced types of e-mail such as Novell Groupwise. We hope the article was informative and helpful to you in your forensic endeavors.

The authors welcome all comments and suggestions.

1. It should be noted that some programs, notably readpst, will not cross-compile correctly in the pure AMD64 Gentoo environment. If the program is compiled in a 32-bit chroot environment (or on an x86 machine) and the proper emulation libraries are installed on the AMD64 box, the binaries will function properly. Obviously the discussion of 32-bit chroot environments is beyond the scope of this article.
