Standard Units in Digital Forensics

by Dr Chris Hargreaves
Lecturer at the Centre for Forensic Computing at Cranfield University in Shrivenham, UK.

One of the earliest lectures in the MIT OpenCourseWare programme in Physics is titled “Units and Dimensional Analysis”. Units of measurement are critical to science, so much so that there is a standard, the SI (Système International d’Unités, or International System of Units), that defines science’s system of units, including, for example, the precise definition of a kilogram. Since units of measurement are so important in science, it seems sensible to consider how they apply to digital forensics.

This does not necessarily mean that there should be standard units of measurement in digital forensics, to report, for example, the position of the start of a file. As discussed later in the article, a single standard is not always appropriate, since it is useful to describe such positions in different ways depending on the context. However, this article argues that reporting some unit of measurement is essential.

Perhaps it is best to begin with a simple example:

“the text string ‘this is evidence’ was located at position 34556”

Since this important evidential artefact has been located, it seems sensible to check that the artefact is actually there. So, we should examine position 34556… but 34556 what? Bytes, sectors, blocks? Let us assume just for a second that the position is expressed in bytes, but what about the number base? If the position in which the string was identified was 86FC, it would be reasonable to assume that this is a hexadecimal offset. However, in this example we have 34556. This could be decimal or hexadecimal. So in order to precisely identify the position of this string, not only does the unit of measurement need to be expressed, but so too does the number base in which it is expressed.
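The number base ambiguity can be illustrated with a short Python sketch. The function names `interpretations` and `describe_offset` are hypothetical and exist only for this illustration. The sketch shows the byte positions that a reader could reasonably derive from a bare value, and one possible way of reporting a position that states both the unit and the base.

```python
def interpretations(value: str) -> dict:
    """Return every byte offset a bare numeric string could plausibly denote."""
    results = {}
    # A string made only of the digits 0-9 is valid in both bases.
    if value.isdecimal():
        results["decimal"] = int(value, 10)
    # Any string of hex digits (0-9, A-F) can be read as hexadecimal.
    results["hexadecimal"] = int(value, 16)
    return results


def describe_offset(offset: int) -> str:
    """Report an offset with its unit and in both number bases."""
    return f"byte offset {offset} (decimal) / 0x{offset:X} (hexadecimal)"


# "34556" is ambiguous: it names two different positions.
ambiguous = interpretations("34556")
# "86FC" can only be hexadecimal, and happens to equal 34556 in decimal.
unambiguous = interpretations("86FC")
report = describe_offset(34556)
```

Here `ambiguous` holds two candidate positions (34556 and 214358 bytes), whereas `report` leaves no room for guessing.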

Furthermore, consider the organisation of a disk. At sector 0 (LBA) we usually have an MBR. The data in this sector provides information about how the disk is partitioned. These partitions may contain file systems, and these file systems store files. Even assuming a position of 34556 bytes (in decimal), the information provided forces us to guess whether this is an offset from the beginning of the disk image, a logical offset from the start of one of the partitions, or perhaps an offset within one of the files. Offsets into files add further complexity, since the file may not be stored contiguously on disk, and a linear offset in a file may actually involve jumping forwards and backwards in the disk image. However, the main point is that to pin down precisely where this search string was found, it is necessary to report the unit of measurement (bytes), how that number is being expressed (as a decimal number) and the position that these bytes are measured from (the start of the disk image). So, that concludes the simple example!
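To show how much the reference point matters, the following Python sketch converts the same number between two of these interpretations. It assumes 512-byte sectors and a partition beginning at sector 2048. Both values are hypothetical choices for illustration and do not come from the example above, and the function names are likewise invented for this sketch.

```python
SECTOR_SIZE = 512              # assumption: bytes per sector
PARTITION_START_SECTOR = 2048  # hypothetical start of the partition (LBA)


def partition_to_disk_offset(offset_in_partition: int,
                             partition_start_sector: int = PARTITION_START_SECTOR,
                             sector_size: int = SECTOR_SIZE) -> int:
    """Convert a byte offset measured from the start of a partition
    into a byte offset measured from the start of the disk image."""
    return partition_start_sector * sector_size + offset_in_partition


def to_sector_and_remainder(byte_offset: int,
                            sector_size: int = SECTOR_SIZE) -> tuple:
    """Express a byte offset as (sector number, byte offset within that sector)."""
    return divmod(byte_offset, sector_size)


# The same "34556" read as an offset from the start of the disk image...
from_disk_start = to_sector_and_remainder(34556)
# ...and read as an offset from the start of the hypothetical partition.
absolute = partition_to_disk_offset(34556)
from_partition_start = to_sector_and_remainder(absolute)
```

Under these assumptions the two readings point at sector 67 and sector 2115 of the disk image respectively, which is why the reference point has to be stated alongside the value.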

Another example where the detail associated with a value is crucial is dates and times:

“File 1 was created at 4 o’clock on the 3rd of February 2011”

The first, and hopefully obvious, question here is whether it was 4am or 4pm. Fortunately, almost all date interpretation tools report times either with an am/pm suffix or, preferably, in a 24 hour format, so it is rare to see times expressed this imprecisely. Further difficulties can arise when dates are expressed numerically, e.g. 04/02/03 can have different meanings in different parts of the world (is this 4th Feb or 2nd Apr?). To consider another date/time example:

“The file was created at 16:00”

The crucial detail that is missing here is the time zone, or in other words, where was it 16:00 when this file was created? To resolve both the ambiguity of days and months and the time zone issue, it is possible to use a standard expression of time according to International Standard ISO 8601, under which a date/time with its offset from UTC can be expressed in a format such as the following (the standard itself separates the date and time with the letter “T”; a space is a common substitution for readability):

2011-02-03 16:00:00+01
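As a minimal sketch of both problems, the Python standard library can be used to show how one numeric date parses to two different days, and how a timestamp carrying its UTC offset avoids the time zone question. The variable names are illustrative only. Note that Python's `isoformat()` separates date and time with “T” and writes the offset as `+01:00`.

```python
from datetime import datetime, timedelta, timezone

# The same numeric string yields two different dates, depending on
# whether the reader assumes day-first or month-first ordering.
day_first = datetime.strptime("04/02/03", "%d/%m/%y").date().isoformat()
month_first = datetime.strptime("04/02/03", "%m/%d/%y").date().isoformat()

# A timestamp that records its offset from UTC (+01:00 here) is unambiguous.
created = datetime(2011, 2, 3, 16, 0, 0, tzinfo=timezone(timedelta(hours=1)))
iso_local = created.isoformat()

# Because the offset is known, the same instant can be restated in UTC.
iso_utc = created.astimezone(timezone.utc).isoformat()
```

Without the offset, the conversion to UTC in the last line would not be possible; a reader would have to guess where it was 16:00.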

As a final example that perhaps digresses slightly from units of measurement, ambiguity can also arise in expressing the ‘organisational’ location of a file. For example:

“The file is located in D:\some_files”

That may very well be the case, and the fact that the file is on the volume mounted as D:\ may well be important. However, volume letters are a construct of the operating system. Where more than one operating system is installed (e.g. a Windows 7 and Windows XP dual-boot system), the letter assigned to a volume (if indeed there is a letter) can be different in each installed OS. As a result, a common method of expressing the location of a file becomes ambiguous in certain circumstances, which is not an ideal situation to have in a piece of scientific writing.

The purpose of the article is not to promote pedantry, but simply to highlight that there are circumstances where reporting a value or providing a piece of data on its own is not necessarily sufficient. In order for a third party to locate the same artefact, or understand a particular interpretation of some data, more information needs to be provided with that value. There are likely to be many more examples of this than those covered in this article, and perhaps through discussion we can share these examples, reduce ambiguity and ultimately improve the quality of writing and reporting in this field.

References

International Organization for Standardization (2004), ISO 8601 FAQ
http://www.iso.org/iso/support/faqs/faqs_widely_used_standards/widely_used_standards_other/date_and_time_format.htm

MIT OpenCourseWare (2011), Units and Dimensional Analysis
http://ocw.mit.edu/courses/physics/8-01sc-physics-i-classical-mechanics-fall-2010/units-and-dimensional-analysis/

NIST (2008), The International System of Units
http://physics.nist.gov/Pubs/SP330/sp330.pdf

Chris Hargreaves is a lecturer at the Centre for Forensic Computing at Cranfield University in Shrivenham, UK. Chris is involved to some extent in all of the Centre’s core activities: Education, Research and Consultancy. Chris’s main focus is research, but he also teaches on several of the modules within Cranfield’s MSc programme including Advanced Forensics, the newly revamped Programming for Practitioners, and also some of the new courses planned for next year. Before taking on a lecturing position, Chris obtained his PhD at Cranfield on the topic of “Assessing the Reliability of Digital Evidence from Live Investigations involving Encryption”.
