File Systems

Interpretation of NTFS Timestamps

Introduction

File and directory timestamps are one of the resources forensic analysts use for determining when something happened, or in what particular order a sequence of events took place. As these timestamps usually are stored in some internal format, additional software is needed to interpret them and translate them into a format an analyst can easily understand. If there are any errors in this step, the result will clearly be less reliable than expected.

My primary purpose in this article is to present a simple design of test data suitable for determining if there are errors or problems in how a particular tool performs these operations. I will also present some test results from applying the tests to different tools.

For the moment, I am concerned only with NTFS file timestamps. NTFS is probably the most common source of timestamps that an analyst will have to deal with, so it is important to ensure that timestamp translation is correct.  Similar tests need to be created and performed for other timestamp formats.

Also, I am ignoring time zone adjustments and daylight saving time: the translation to be examined will cover Coordinated Universal Time (UTC) only.

Background Information

An NTFS file timestamp, according to the documentation of the ‘FILETIME’ data structure in the Windows Software Development Kit, is a “64-bit value representing the number of 100-nanosecond intervals since January 1, 1601 (UTC)”.

Conversion from this internal format to a format more suitable for human interpretation is performed by the Windows system call FileTimeToSystemTime(), which extracts the year, month, day, hour, minutes, seconds and milliseconds from the timestamp data. On other platforms (e.g. Unix), or in software that is intentionally platform-independent (e.g. Perl or Java), other methods of translation are required.

The documentation of FileTimeToSystemTime(), as well as practical tests,  indicate that the FILETIME value to be translated must be 0x7FFFFFFFFFFFFFFF or less.  This corresponds to the time 30828-09-14 02:48:05.4775807.
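As a rough illustration of what a platform-independent translation involves, here is a minimal sketch in Python. It is not the Windows implementation: the function names are made up for this example, and it uses plain integer arithmetic because Python's own datetime type stops at year 9999. It relies on the fact that 1601-01-01 starts a 400-year Gregorian cycle of 146097 days.

```python
TICKS_PER_SECOND = 10_000_000            # one tick is 100 nanoseconds
TICKS_PER_DAY = 86_400 * TICKS_PER_SECOND
MAX_FILETIME = 0x7FFFFFFFFFFFFFFF        # upper limit quoted in the text


def civil_from_days(days):
    """Convert days since 1601-01-01 to (year, month, day), Gregorian calendar."""
    cycles, rest = divmod(days, 146_097)      # whole 400-year cycles
    century = min(rest // 36_524, 3)          # the 4th century has one extra day
    rest -= century * 36_524
    group = rest // 1_461                     # whole 4-year groups
    rest -= group * 1_461
    year_in_group = min(rest // 365, 3)       # the 4th year may have 366 days
    rest -= year_in_group * 365               # rest is now the 0-based day of year
    year = 1601 + 400 * cycles + 100 * century + 4 * group + year_in_group
    # The last year of a 4-year group is a leap year, except when it closes
    # one of the first three centuries of the cycle (1700, 1800, 1900, ...).
    leap = year_in_group == 3 and (group != 24 or century == 3)
    month_lengths = [31, 29 if leap else 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    month = 1
    for length in month_lengths:
        if rest < length:
            break
        rest -= length
        month += 1
    return year, month, rest + 1


def filetime_to_utc(ft):
    """Translate a FILETIME value to 'YYYY-MM-DD hh:mm:ss.fffffff' (UTC)."""
    if not 0 <= ft <= MAX_FILETIME:
        raise ValueError(f"FILETIME out of range: {ft:#018x}")
    days, ticks = divmod(ft, TICKS_PER_DAY)
    seconds, fraction = divmod(ticks, TICKS_PER_SECOND)
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    year, month, day = civil_from_days(days)
    return (f"{year:04d}-{month:02d}-{day:02d} "
            f"{hours:02d}:{minutes:02d}:{secs:02d}.{fraction:07d}")


print(filetime_to_utc(0x7FFFFFFFFFFFFFFF))
```

Out-of-range values raise an error rather than producing a blank or a wrapped date, so a caller cannot mistake them for a valid translation.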

File timestamps are usually determined by the system clock at the time some file activity was performed. It is, though, also possible to set file timestamps to arbitrary values. On Vista and later, the system call SetFileInformationByHandle() can be used; on earlier versions of Windows, NtSetInformationFile() may be used. No special user privileges are required.

These system calls have a similar limitation in that only timestamps less than or equal to 0x7fffffffffffffff will be set.  Additionally, the two timestamp values 0x0 and 0xffffffffffffffff are reserved to modify the operation of the system call in different ways.

The reverse function, SystemTimeToFileTime(), performs the opposite conversion: translating a time expressed as the year, month, day, hours, minutes, seconds, etc into the 64-bit file time stamp. In this case, however, the span of time is restricted to years less than or equal to 30827.

Requirements

 Before any serious testing is done, some kind of baseline requirements need to be established.

  1. Tests will be performed mainly by humans, not by computers. The number of test points in each case must not be so large as to overwhelm the tester. A maximum limit around 100 test points seems  reasonable.  Tests designed to be scored by computer would allow for more comprehensive tests, but would also need to be specially adapted to each tool being tested.
  2. The currently known time range (0x0 to 0x7FFFFFFFFFFFFFFF) should be supported. If the translation method does not cover the entire range, it should report out-of-range times clearly and unambiguously. That is, there must be no risk of misinterpretation, either by the analyst or by readers of any tool-produced reports. A total absence of translation is not quite acceptable on its own: it requires special information or training to interpret, and the risk of misinterpretation appears fairly high. A single ‘?’ is better, but if there are multiple reasons why a ‘?’ may be used, additional details should be provided.
  3. The translation of a timestamp must be accurate, within the limits of the chosen representation. We don’t want a timestamp translated into a string to become a very different time when translated back again. The largest difference we tolerate is related to the precision of the display format: if the translation doesn’t report time to a greater precision than a second, the tolerable error is half a second (assuming rounding to the nearest second) or up to one second (assuming truncation). If the precision is milliseconds, then the tolerable error is on the corresponding order.

TEST DESIGN 

Test 1: Coverage

 The first test is a simple coverage test: what period of time is covered by the translation? The baseline is taken to be the full period covered by the system call FileTimeToSystemTime(), i.e. from 1601-01-01 up to 30828-09-14.

The first subtest checks the coverage over the entire baseline. In order to do that, and also keep the number of point tests reasonably small, each millennium is represented by a file, named after the first year of the period, the timestamps of which are set to the extreme timestamps within that millennium. For example, the period 2000-2999 is tested (very roughly, admittedly) by a single file, called ‘02000’, with timestamps representing 2000-01-01 00:00:00.0000000 and 2999-12-31 23:59:59.9999999 as the two extreme values (Tmin and Tmax for the period being tested).
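The Tmin/Tmax convention can be sketched as follows. This is only an illustration, not the program used to build the test image: the function names are hypothetical, and clamping the first and last periods to the valid FILETIME range is an assumption of this sketch.

```python
TICKS_PER_SECOND = 10_000_000
TICKS_PER_DAY = 86_400 * TICKS_PER_SECOND
MAX_FILETIME = 0x7FFFFFFFFFFFFFFF


def year_start_ticks(year):
    """FILETIME value of <year>-01-01 00:00:00.0000000 UTC (may be out of range)."""
    elapsed = year - 1601
    # 1600 is divisible by 4, 100 and 400, so the leap days between 1601
    # and the start of <year> can be counted directly from the elapsed years.
    days = elapsed * 365 + elapsed // 4 - elapsed // 100 + elapsed // 400
    return days * TICKS_PER_DAY


def period_extremes(first_year, length):
    """Return (file name, Tmin, Tmax) for a period of <length> years."""
    tmin = max(0, year_start_ticks(first_year))
    # Tmax is the last 100 ns tick before the next period begins.
    tmax = min(MAX_FILETIME, year_start_ticks(first_year + length) - 1)
    return f"{first_year:05d}", tmin, tmax


name, tmin, tmax = period_extremes(2000, 1000)
print(name, hex(tmin), hex(tmax))
```

Calling period_extremes(year, 100) or period_extremes(year, 1) gives the corresponding values for century-sized and year-sized periods.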

The second subtest makes the same type of test, only it checks each separate century in the period 1600 — 8000. (There is no particular reason for choosing 8000 as the ending year.)

The third subtest makes the same type of test, only it checks each separate year in the period 1601 — 2399. In these tests, Tmin and Tmax are the starting and ending times of each single year.

The fourth subtest examines the behaviour of the translation function at some selected cut-off points in greater detail.

These tests could easily be extended to cover the entire baseline time period, but this makes them less suitable for manual inspection: the number of points to be checked will become unmanageable for ‘manual’ testing.

Test 2: Leap Years

The translation must take leap days into account. This is a small test, though not unimportant.

The tests involve checking the 14-day period ‘around’ February 28th/29th for presence of leap day, as well as discontinuities.

Two leap year tests are provided: ‘simple’ leap years (2004 – year evenly divisible by 4), and ‘exceptional’ leap years (2000 – year evenly divisible by 400).

Four non-leap tests are provided: three for ‘normal’ non-leap years (2001, 2002, 2003) and one for an ‘exceptional’ non-leap year (1900 – year evenly divisible by 100 but not by 400).
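For reference, the expected outcome for the six test years follows from the Gregorian leap-year rule, sketched below in Python. The exact boundaries of the 14-day window are an assumption made for this example (seven days either side of the start of March 1st); the helper names are hypothetical.

```python
from datetime import date, timedelta


def is_leap_year(year):
    """Gregorian rule: every 4th year, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def february_window(year, days_each_side=7):
    """Dates a translation should show around the end of February."""
    march_first = date(year, 3, 1)
    return [march_first + timedelta(days=offset)
            for offset in range(-days_each_side, days_each_side)]


for year in (2004, 2000, 2001, 2002, 2003, 1900):
    window = february_window(year)
    has_leap_day = any(d.month == 2 and d.day == 29 for d in window)
    print(year, is_leap_year(year), has_leap_day)
```

A translation passes if February 29th appears exactly in the years where is_leap_year() is true, and consecutive test points are one day apart with no gaps or repeats.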

More extensive tests can easily be created, but again the number of required tests would  surpass the limits of about 100 specified in the requirements.

It is not entirely clear if leap days always are/were inserted after February 28th in the UTC calendar: if they are/were inserted after February 23rd, additional tests may be required for the case where the timestamp translation includes the day of the week. Alternatively, such tests should only be performed in time zones for which this information is known.

Test 3: Rounding

This group of tests examines how the translation software handles limited precision. For example, assume that we have a timestamp corresponding to the time 00:00:00.6, and that it is translated into textual form that does not provide sub-second precision.  How is the .6 second handled?  Is it chopped off (truncated), producing a time of ’00:00:00’?  Or is it rounded upwards to the nearest second: ’00:00:01’?

In the extreme case, the translated string may end up in another year (or even millennium) than the original timestamp. Consider the timestamp 1999-12-31 23:59:59.6: will the translation say ‘1999-12-31 23:59:59’ or will it say ‘2000-01-01 00:00:00’? This is not an error in and by itself, but an analyst who does not expect this behaviour may be confused by it. If the analyst works from an instruction to ‘look for files modified up to the end of the year’, there is a small probability that files modified at the very turn of the year may be omitted because they are presented as belonging to the following year. Whether that is a real problem will depend on the actual investigation, and on if and how such time limit effects are handled by the analyst.
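The year-boundary case can be reproduced with a few lines of Python. This is a minimal sketch with a hypothetical helper name; it only covers dates that Python's datetime type can represent, which is enough for this example.

```python
from datetime import datetime, timedelta

EPOCH_1601 = datetime(1601, 1, 1)
TICKS_PER_SECOND = 10_000_000


def to_seconds_string(ft, mode):
    """Translate a FILETIME to whole seconds, by truncation or rounding."""
    whole, fraction = divmod(ft, TICKS_PER_SECOND)
    if mode == "round" and fraction * 2 >= TICKS_PER_SECOND:
        whole += 1                       # half a second or more rounds upwards
    return (EPOCH_1601 + timedelta(seconds=whole)).strftime("%Y-%m-%d %H:%M:%S")


# 1999-12-31 23:59:59.6 is 0.4 seconds before 2000-01-01 00:00:00,
# whose FILETIME value is 125911584000000000.
ft = 125911584000000000 - 4_000_000
print(to_seconds_string(ft, "truncate"))   # 1999-12-31 23:59:59
print(to_seconds_string(ft, "round"))      # 2000-01-01 00:00:00
```

Both results are within the tolerance stated in the requirements; they simply land on different sides of the year boundary.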

These tests are split into four subgroups, testing rounding to minutes, seconds, milliseconds and microseconds, respectively. For each group, two directories corresponding to the main unit are created, one for an even unit, the other for an odd unit. (The ‘rounding to minutes’ test uses 2001-01-01 00:00 and 00:01.) In each of these directories files are created for the full range of the test (0-60, in the case of minutes), and timestamped according to the Tmin/Tmax convention already mentioned.

If the translation rounds upwards, or rounds to the nearest even or odd unit, this will be possible to identify from this test data. More complex rounding schemes may not be possible to identify.

Test 4: Sorting

These tests are somewhat related to the rounding test, in that the test examines how the limited precision of a timestamp translation affects sorting a number of timestamps into ascending order.

For example, a translation scheme that only includes minutes but not seconds, and that sorts events by the translated string only, will not necessarily produce a sorted order that follows the actual sequence of events.

Take the two file timestamps 00:00:01 (FILE1) and 00:00:31 (FILE2). If the translation truncates timestamps to minutes, both times will be shown as ’00:00’. If they are then sorted into ascending order by that string, the analyst cannot decide whether FILE1 was timestamped before FILE2 or vice versa. And if such a sorted list appears in a report, a reader may draw the wrong conclusions from it.
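The effect is easy to demonstrate. In the Python sketch below (the data layout and helper name are made up for the example), the two files start out in the wrong order; sorting by the truncated string leaves them that way, while sorting by the underlying value puts them right.

```python
# (name, seconds past midnight), deliberately listed out of order
files = [("FILE2", 31), ("FILE1", 1)]


def minute_string(seconds):
    """Translate to 'hh:mm', truncating the seconds."""
    return f"{seconds // 3600:02d}:{seconds % 3600 // 60:02d}"


# Both files translate to '00:00', so the string sort sees a tie and
# (being a stable sort) keeps whatever order the input happened to have.
by_string = sorted(files, key=lambda f: minute_string(f[1]))
by_value = sorted(files, key=lambda f: f[1])

print([name for name, _ in by_string])   # ['FILE2', 'FILE1']
print([name for name, _ in by_value])    # ['FILE1', 'FILE2']
```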

The tests are subdivided into sorting by seconds, milliseconds, microseconds and nanoseconds respectively. Each subtest provides 60, 100 or 10 files with timestamps arranged in four different sorting orders. The names of these files have been arranged in an additional order, to avoid the situation where files already sorted by file name are not rearranged by a sorting operation. Finally, the files are created in random order.

The files are named on the following pattern: <nn>_C<nn>_A<nn>_W<nn>_M<nn>, e.g. ’01_C02_A07_W01_M66′.

Each letter indicates a timestamp field (C = created, A = last accessed, W = last written, M = last modified), with <nn> indicating the particular position in the sorted sequence that timestamp is expected to appear in. The initial <nn> adds a fifth sorting order (by name), which allows for the tester to ‘reset’ to a sorting order that is not related to timestamps.

Each timestamp differs only in the corresponding subunit: the files in the ‘sort by seconds’ have timestamps that have the same time, except for the second part, and the ‘sort by nanoseconds’ files differ only in the nanosecond information. (As the timestamp only accommodates 10 separate sub-microsecond values, only 10 files are provided for this test.)

The test consists in sorting each set of files by each of the timestamp fields: if sorting is done by the particular subunit (second, millisecond, etc.) the corresponding part of the file name will appear in sorted order. Thus, an attempt to sort by creation time in ascending order should produce a sequence in which the C-sequence in the file name also appears in order: C00, C01, C02, … etc, and no other sequence should be in the same ascending order.
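Although the tests are meant to be read by eye, the check itself is mechanical. The following Python sketch (hypothetical helper names; it assumes the tool's listing has been copied out as a list of file names in displayed order) reports which of the five sequences appear in ascending order.

```python
import re

NAME_PATTERN = re.compile(r"^(\d+)_C(\d+)_A(\d+)_W(\d+)_M(\d+)$")
FIELDS = ("name", "C", "A", "W", "M")


def positions(file_name):
    """Split a test file name into its five expected sort positions."""
    match = NAME_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"not a sorting-test file name: {file_name!r}")
    return dict(zip(FIELDS, (int(group) for group in match.groups())))


def ascending_fields(listing):
    """Return the fields whose positions strictly ascend through the listing."""
    parsed = [positions(name) for name in listing]
    return [field for field in FIELDS
            if all(a[field] < b[field] for a, b in zip(parsed, parsed[1:]))]


# A listing as it might appear after sorting by creation time.
listing = ["03_C00_A02_W01_M02", "01_C01_A00_W02_M01", "02_C02_A01_W00_M00"]
print(ascending_fields(listing))   # ['C']
```

A result other than the single field that was sorted on suggests that the sort is not using that timestamp at the resolution being tested.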

An implementation with limited precision in the translated string, but that sorts according to the timestamp values will sort perfectly also when sorting by nanoseconds is tested.  If the sort is by the translated string, sorting will be perfect up to that smallest unit (typically seconds), and further attempts to sort by smaller units (milliseconds or microseconds) will not produce a correct order.

If an implementation that sorts by translated string also rounds timestamps, this will have additional effects on the sorting order.

Test 5: Special tests

In this part, additional timestamps are provided for test.  Some of these cannot be created by the documented system calls, and need to be created by other methods.

0x00FFFFFFFFFFFFFF
0x01FFFFFFFFFFFFFF
0x03FFFFFFFFFFFFFF

0x7FFFFFFFFFFFFFFF

These timestamps can be set by the system calls, and may not have been tested by the other tests.

0x0000000000000000

This timestamp should translate to 1601-01-01 00:00:00.0000000, but it cannot be set by any of the system calls tested.

0x8000000000000000
0xFFFFFFFE00000000
0xFFFFFFFF00000000
0xFFFFFFFFFFFFFFFE
0xFFFFFFFFFFFFFFFF

These timestamps cannot be set by system call, and need to be edited by hand prior to testing.

These values test how the translation mechanism copes with timestamps that produce error messages from the FileTimeToSystemTime() call.
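One way to see why the last group falls outside the documented limit is to read each 64-bit pattern as a signed integer: every value above 0x7FFFFFFFFFFFFFFF then comes out negative. The Python sketch below assumes that interpretation purely for illustration; it says nothing about how any particular tool stores or handles the value.

```python
import struct

SPECIAL_VALUES = [
    0x00FFFFFFFFFFFFFF, 0x01FFFFFFFFFFFFFF, 0x03FFFFFFFFFFFFFF,
    0x7FFFFFFFFFFFFFFF, 0x0000000000000000,
    0x8000000000000000, 0xFFFFFFFE00000000, 0xFFFFFFFF00000000,
    0xFFFFFFFFFFFFFFFE, 0xFFFFFFFFFFFFFFFF,
]


def as_signed64(value):
    """Reinterpret an unsigned 64-bit pattern as a signed 64-bit integer."""
    return struct.unpack("<q", struct.pack("<Q", value))[0]


def classify(value):
    """Label a value relative to the 0x7FFFFFFFFFFFFFFF limit."""
    return "in range" if as_signed64(value) >= 0 else "out of range"


for value in SPECIAL_VALUES:
    print(f"{value:#018x}  {as_signed64(value):>20d}  {classify(value)}")
```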

Other tests

TZ & DST — Time zone and daylight saving time adjustments are closely related to timestamp translation, but are notionally performed as a second step, once the UTC translation is finished. For that reason, no such tests are included here: until it is reasonably clear that UTC translation is done correctly, there seems little point in testing additional adjustments.

Leap seconds — The NTFS timestamp convention is based on UTC, but ignores leap seconds, which are included in UTC. For a very strict test that the translation mechanism does not take leap seconds into account, additional tests are required, probably on the same pattern as the tests for leap years, but at a resolution of seconds.

However, if leap seconds have been included in the translation mechanism, it should be visible in the coverage tests, where the dates from 1972 onwards would gradually drift out of synchronization (at the time of writing, 2013, the difference would be 25 seconds).

Day of week — No tests of day-of-week translation are included.

Additional Notes

 A Windows program that creates an NTFS structure corresponding to the tests described has been written, and used to create an NTFS image. The Special tests directory in this image has been manually altered to contain the timestamps discussed. Both the source code and the image file are (or will very shortly be) available from SourceForge as part of the ‘CompForTest’ project.

It must be stressed that the tests described should not be used to ‘prove’ that some particular timestamp translation works as it should: all the test results can be used for is to show that it doesn’t work as expected.

TEST RESULTS

As the test image was being developed, different tools for examination of NTFS timestamps were tried out. Some of the results (such as incomplete coverage) were used to create additional tests.

Below, some of the more interesting test results are described.

It should be noted that there may be additional problems that affect the testing process.  In one tool test (not included here), it was discovered that the tool occasionally did not report the last few files written to a directory. If this kind of problem is present also in other tools, tests results may be incomplete.

Notes on rounding and sorting have been added only if rounding has been detected, or if sorting is done by a different resolution than the translated timestamp.

Autopsy 3.0.4:

Timestamp range:

1970-01-01 00:00:01 —  2106-02-07 06:28:00
1970-01-01 00:00:00.0000000 is translated as ‘0000-00-00 00:00:00′

Timestamps outside the specified range are translated as if they were inside the range (e.g. timestamps for some periods in 1673, 1809, 1945, 2149, 2285, etc. are translated as times in 2013). This makes it difficult for an analyst to rely only on this version of Autopsy for accurate time translation.

In the screen dump below, note that the 1965-1969 timestamps are translated as if they were from 2032-2036.

Image

EnCase Forensic 6.19.6:

Timestamp range:

1970-01-01 13:00 — 2038-01-19 03:14:06
1970-01-01 00:00 — 12:00 are translated as ” (empty). The period 12:00 — 13:00 has not been investigated further.

Remaining timestamps outside the specified ranges are also translated as ” (empty).

The screen dump below shows the hours view of the cut-off date 1970-01-01 00:00. The file names indicate the offset from the baseline timestamp, HH+12 indicating an offset of +12 hours from 00:00. It is clear that from HH+13 onwards translation appears to work as expected, but for the first 13 hours (00 to 12) no translation is provided, at least not for these test points.

Image

ProDiscover Basic 6.5.0.0:

Timestamp range:

1970-01-02 — 2038, 2107 — 2174, 2242 — 2310, 2378 — 2399 (all ranges examined)

 Timestamps prior to  1970-01-02, and sometime after 3000, are uniformly translated as 1970-01-01 00:00, making it impossible to determine actual time for these ranges.

Timestamps after 2038, and outside stated range are translated as ‘(unknown)’.

Translation truncates to minutes.

The following screen dump shows both the uniform translation of early timestamps as 1970-01-01, as well as the ‘(unknown)’ and the reappearance of translation in the 2300-period. (The directories have also been timestamped with the minimum and maximum times of the files placed in them.)

Image

WinHex 16.6 SR-4:

Timestamp range:

1601-01-01 00:00:01 — 2286-01-09 23:30:11.
1601-01-01 00:00:00.0000000 and .0000001 are translated as ” (blank).

Timestamps after 2286-01-09 23:30:11 are translated partly as ‘?’, partly as times in the specified range, the latter indicated in red. The cut-off time 30828-09-14 02:48:05 is translated as ” (blank).

Image

Additional Tests

Two additional tests on tools not intended primarily for forensic analysis were also performed: Windows Explorer GUI and PowerShell command line. Neither of these provides for additional time zone adjustment: their use will be governed by the current time configuration of the operating system. In the tests below, the computer was reset to the UTC time zone prior to testing.

 PowerShell

Timestamp range:

1601-01-01 00:00:00 —  9999-12-31 23:59:59

 Timestamps outside the range are translated as blank.

Sorting is by timestamp binary value.

The command line used for these examinations was:

 Get-ChildItem path | Select-Object name,creationtime,lastwritetime

for each directory that was examined. Sorting was tested by using

 Get-ChildItem path | Select-Object name,creationtime,lastwritetime,lastaccesstime | Sort timefield

The image below shows sorting by LastWriteTime and nanoseconds (or more exactly tenths of microseconds). Note that the Wnn specifications in the file names appear in the correct ascending order:

Image

Windows Explorer GUI:

Timestamp range:

1980-01-01 00:00:00 — 2107-12-31 23:59:57
2107-12-31 23:59:58 and :59 are shown as ” (blank)

  Remaining timestamps outside the range are translated as ” (blank) .

It must be noted that the timestamp range only refers to the times shown in the GUI list.  When the timestamp of an individual file is examined in the file property dialog (see below),  the coverage appears to be full range of years.

Additionally, the translation on at least one system appears to be off by a few seconds, as the end of the time range shows. Additional testing is required to say if this happens also on other Windows platforms.

Image

However, when the file ‘119 – SS+59′ is examined by the Properties dialog, the translation is as expected. (A little too late for correction I see that the date format here is in Swedish — I hope it’s clear anyway.)

Image

Interpretation of results

 In terms of coverage, none of the tools presented above is perfect: all are affected by some kind of restriction on the time period they translate correctly. The tools that come off best are, in order of the time range they support:

PowerShell 1.0  (1601–9999)
Windows Explorer GUI (1980–2107)
EnCase 6.19 (1970–2038)

 Each of these restricts translations to a subset of the full range, and shows remaining timestamps as blank.  PowerShell additionally sorts by the full binary timestamp value, rather than the time string actually shown.

The Windows Explorer GUI also appears to suffer from a two-second error: the last second of a minute, as well as parts of the immediately preceding second, are translated as being in the following minute. This affects the result, but as this is not a forensic tool it has been discounted.

The tools that come off worst are:

Autopsy 3.0.4
ProDiscover Basic 6.5.0.0
WinHex 16.6 SR-4

Each of these shows unacceptably large errors between all or some file timestamps and their translation. ProDiscover comes off only slightly better in that timestamps up to 1970 are all translated as 1970-01-01, and so can be identified as suspicious, but at the other end of the spectrum, the translation error is still approximately the same as for Autopsy: translations are more than 25000 years out of register. WinHex suffers from similar problems: while it flags several ranges of timestamps as ‘?’, it still translates many timestamps totally wrong.

It should be noted that there are later releases of both Autopsy and ProDiscover Basic that have not been tested.

It should probably also be noted that additional tools have been tested, but that the results are not ‘more interesting’ than those presented here.

How to live with a non-perfect tool?

  1. Identify if and to what extent some particular forensic tool suffers from the limitations described above: does it have any documented or otherwise discoverable restrictions on the time period it can translate, and does it indicate out-of-range timestamps clearly and unambiguously, or does it translate more than one timestamp into the same date/time string?
  2. Evaluate to what extent any shortcomings can affect the result of an investigation, in general as well as in particular, and also to what extent already existing lab practices mitigate such problems.
  3. Devise and implement additional safeguards or mitigating actions in the case where investigations are significantly affected.

These steps could also be important to document in investigation reports.

In daily practice, the range of timestamps is likely to fall within the 1970–2038 range that most tools cover correctly — the remaining problem would be if any outside timestamps appeared in the material, and the extent to which they are recognized as such and handled correctly by the analyst.
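As one possible safeguard, raw timestamp values could be screened for those that fall outside the commonly covered range, so that they are translated and checked independently. The Python sketch below is only an illustration: the helper name is made up, and the default bounds are an assumption, namely the span of a signed 32-bit count of seconds since 1970-01-01, which ends on 2038-01-19. A lab would substitute the limits it has established for its own tools.

```python
UNIX_EPOCH_AS_FILETIME = 116_444_736_000_000_000   # 1970-01-01 00:00:00 UTC
TICKS_PER_SECOND = 10_000_000


def needs_second_opinion(ft, low=0, high=2**31 - 1):
    """True if the FILETIME lies outside the assumed 'safe' range.

    low and high are seconds relative to 1970-01-01 00:00:00 UTC; the
    defaults (an assumption) span a signed 32-bit seconds counter.
    """
    unix_seconds = (ft - UNIX_EPOCH_AS_FILETIME) // TICKS_PER_SECOND
    return not (low <= unix_seconds <= high)


print(needs_second_opinion(0))                         # True: 1601-01-01
print(needs_second_opinion(125_911_584_000_000_000))   # False: 2000-01-01
```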

The traditional advice, “always use two different tools” turns out to be less than useful here, unless we know the strengths and weaknesses of each of the tools.  If they happen to share the same timestamp range, we may not get significantly more trustworthy information from using both than we get from using only one.

A. Thulin
(anders@thulin.name)

Discussion

6 thoughts on “Interpretation of NTFS Timestamps”

  1. This is all very interesting, academically speaking. However, as noted in the end of the article, the range of dates that a forensic analyst would expect to encounter are covered well by all the tools mentioned (except for ProBasic Discover which, if I understood correctly, doesn’t list seconds). And one could expect that future dates past 2038 will be handled better when we get up to that year (and by that time I’ll be retired).

    But it was interesting to read that software interpretation of dates isn’t always perfect. How would one manually interpret a timestamp on a crucial file?

    Posted by Michael | May 1, 2013, 8:31 am
    • In a way. But ‘expect to encounter’ is a statistical statement that makes best sense only for larger datasets. Is it somehow more acceptable that the analyst fails to do a correct analysis if that failure is due to an ‘unexpected’ timestamp? Or is it more OK if it happens for only five in every hundred cases? Or perhaps only one out of a hundred? And what is the projected damage in the cases where it does happen? What would a defence lawyer make of a case based on faulty timestamps? Will it make headlines?

      My personal opinion is that it is never OK to get this wrong. That’s why I created the test data — to see if I could discover tools that failed to do something as simple and straightforward as translating a legal timestamp correctly.

      With tools that do faulty translation, I’d do my best to call the problem to the attention of the toolmaker, and insist very strongly on a good and well-tested correction. Until that patch comes along, some alternative is required for all timestamps — I’d probably go for the PowerShell approach together with an image mounting tool. Alternatively, I’d program some DCode-like application that used the appropriate system calls in Windows, to make sure I was using the native translation routines.

      With tools that identify timestamps that are outside the supported range, it’s easier to identify problematical timestamps, but the solution is essentially the same.

      Posted by wpathulin | May 1, 2013, 10:05 am
      • First of all, let me say that I am not a forensics investigator (yet). I have many years of experience as a programmer and systems administrator, and lately I have been reading up on forensics with an eye towards getting into this interesting field.

        My point was, that as someone who has not actually worked on real forensic cases, it seems to me that the tools you mentioned all work just fine for the date range that one would encounter on real NTFS systems, namely 1993 (release of Windows NT 3.1) through the near future.

        Is it reasonable to expect a file to truly have an NTFS timestamp in the 1950’s? Or the 1600’s? Such a file cannot exist, since obviously there were no computers then (at least not PCs using NTFS).

        That’s why I think that investigating timestamps in those pre-PC or far-future dates is interesting but academic, I don’t see how it has practical applicability.

        To conclude, as you apparently have, that timestamps displayed in various software packages cannot be relied upon at all, seems to me to be a dangerous throw-out-the-baby-with-the-bathwater approach.

        As I wrote, my forensics experience is zero… So am I missing something here?

        Posted by Michael | May 1, 2013, 10:48 am
  2. You say that files dated in the 1950s cannot exist. It seems you assume that all timestamps are produced by the operating system on a system with correct time, in response to user activities, and that nothing else affects them.

    System time affects them. System time can be modified for many reasons — bypassing license restriction is one I have encountered.

    Timestamps may also come from archive files (ZIP, tar, etc.).

    The timestomp utility (or the later setMACE tool) allows just about anyone to reset timestamps, and just about any programmer can do that using his own code also. And if a person does so, one reason may be that the date is important for some reason: like April, 1850 or December, 1969 or 1964-06-02. In the right context, those dates may easily have a meaning that is more important than the bare date itself.

    Still, if the time has been reset for whatever reason, I still need to analyze it, and as the domain of NTFS timestamps range from 1601 to the 32000s, the translation should work over that domain. Windows can do that translation correctly — why shouldn’t computer-forensic tools do so as well? It is, quite literally, a fairly small matter of programming — it takes less than a day to write the relevant code, and approximately another day to test it comprehensively.

    I don’t think I made the conclusion you cite — I hope I have demonstrated that some computer forensic tools have shortcomings in this area, and noted that those shortcomings may be cause for concern for successful analysis. I also recommended that a risk/damage analysis probably is the best foundation for a decision what should be done about it. But if you think the damage is negligible to you, who am I to say you are wrong, for the cases you are or will be working?

    Good luck with your studies in computer forensics. When your text book or instructor gets into the subject of validating your tools and results, you are getting close to the specific area I have touched on in this article. Be prepared with your questions then.

    Posted by wpathulin | May 1, 2013, 11:56 am
    • I see your point.

      You’re right, I assumed that timestamps must always be produced by the OS, and that they should always reflect a date which is logical considering when NTFS has been in use. While I considered that people might change the date in order to bypass license restrictions or as an anti-forensic technique, I never considered that somebody would set a date like 1964-06-02 because that date has special meaning to them.

      Thanks for clarifying this.

      Posted by Michael | May 2, 2013, 10:43 am
