How to Create an Open Source Network Forensics Appliance

By Ondrej Krehel
Chief information security officer at Identity Theft 911

Introduction

Encryption and anti-forensics attacker techniques are commonly encountered in incident response investigations, while the power of network forensics intelligence is often overlooked by busy IT and legal departments. Compromised networks only occasionally capture network incident data sets for further analysis, but when they do it can be a boon to forensic investigators. Network forensics not only provides additional evidence; in many cases that evidence has stood as the primary evidence. Network sessions can be analyzed, reconstructed and replayed. Any transmitted files can be analyzed, even if they don’t exist on the compromised system. Tools that compromise systems but don’t leave traces on hard disks can often be seen in action; this is the kind of evidence we’re seeing more and more often in courtrooms.

My focus in this article will be on currently available open source Network Forensic Analysis Tools (NFAT) that could be used to build a network forensic appliance. You always need to read tool licenses before use, but most are under GNU General Public License. This article does not address legal or privacy concerns, which should be fully considered before implementing an open source network forensic appliance.

Note: Before we explore software and hardware options, it is important to design your network forensics platform in such a way that other examiner actions are logged—and the integrity of the recorded data is not compromised. This could be done by securing logging settings on Windows or Unix systems, restricting access through secure connections, hardening the operating system and enabling firewall filtering. Not many open source applications come with log features, so ensuring that the underlying system is secured and actions are logged is crucial.

Protocol Decoding

The majority of open source software is designed to work on Unix-based operating systems, such as Linux or FreeBSD. Only a few items we’ll discuss are available in Windows environments. Network forensic software is usually composed of different modules that record, possibly filter, decode and analyze the data. Open source tools often can’t deal with large data sets, so filtering the noise and narrowing down the focus is vital.

Capturing traffic seems to be a theoretically simple task, but practically speaking that isn’t necessarily so. Open source tools such as tcpdump, snoop, snort, Wireshark, windump, and kismet for wireless are good starting options for capturing network traffic.

The amount of captured data can be substantial. Recorders allow filtering of the network traffic from different vantage points, which can be very useful, since memory is always faster than hard disk drives. Many recorders require a pcap library for the capturing process and can be very well used in shell or batch scripts. During the capturing process, network cards are usually set in the promiscuous mode, which allows the card to receive packets not destined to its physical address or MAC address. Recorded data can be analyzed by reading acquired data sets and applying filters. Filters could be related to the type of protocol, geolocations based on IP range, protocol properties such as TCP flags, or source and destination hosts involved in network communication.
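
As a rough illustration of filtering recorded data by TCP flags and host, the sketch below reads a capture with nothing but the Python standard library. It is a minimal example under stated assumptions, not how any of the tools above work internally: it only handles the classic little-endian pcap format with Ethernet/IPv4/TCP frames, and the names iter_pcap, parse_tcp and syn_only are hypothetical.

```python
import socket
import struct

SYN, ACK = 0x02, 0x10  # TCP flag bits


def iter_pcap(data: bytes):
    """Yield (timestamp, frame) pairs from a classic little-endian pcap image."""
    if struct.unpack_from("<I", data, 0)[0] != 0xA1B2C3D4:
        raise ValueError("only classic little-endian pcap is handled in this sketch")
    offset = 24  # skip the 24-byte global header
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack_from("<IIII", data, offset)
        offset += 16
        yield ts_sec + ts_usec / 1e6, data[offset:offset + incl_len]
        offset += incl_len


def parse_tcp(frame: bytes):
    """Return (src, dst, sport, dport, flags) for Ethernet/IPv4/TCP, else None."""
    if len(frame) < 34 or frame[12:14] != b"\x08\x00":  # EtherType must be IPv4
        return None
    if frame[23] != 6:  # IP protocol field must be TCP
        return None
    tcp = 14 + (frame[14] & 0x0F) * 4  # Ethernet header + IP header length
    if len(frame) < tcp + 14:
        return None
    sport, dport = struct.unpack_from("!HH", frame, tcp)
    src = socket.inet_ntoa(frame[26:30])
    dst = socket.inet_ntoa(frame[30:34])
    return src, dst, sport, dport, frame[tcp + 13]


def syn_only(data: bytes, host=None):
    """Return TCP packets with SYN set and ACK clear, optionally for one host."""
    hits = []
    for _ts, frame in iter_pcap(data):
        pkt = parse_tcp(frame)
        if pkt is None:
            continue
        src, dst, _sport, _dport, flags = pkt
        if flags & SYN and not flags & ACK and (host is None or host in (src, dst)):
            hits.append(pkt)
    return hits
```

Calling syn_only(open("capture.pcap", "rb").read(), host="10.0.0.4") would list connection attempts involving that address; the file name and address are placeholders.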

The majority of the sessions seen in the recorded data will be based on TCP, and there is a range of tools available for analysis and decoding of those protocols. In order to analyze the headers at each level and the payload of the recorded packets, we need to understand how protocols are designed and decode them into a human-readable form. Each application protocol has its own specification, so each might need its own specific decoder. Each decoded protocol corresponds to an OSI layer, so decoding proceeds layer by layer, with each layer providing its own information, up to the highest layer.

Since the majority of a user’s protocols in network traffic are based on TCP, most open source projects focus on TCP. TCPtrace provides a list of TCP sessions and their properties (elapsed time, bytes, window advertisement, retransmissions), and it can take input files produced by several popular packet-capture programs, including those listed above.
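
To make the idea of a per-session summary concrete, here is a small sketch that groups already-decoded packet records into bidirectional TCP sessions and reports elapsed time, packet count and byte count. It is only an approximation of the kind of listing described above, not TCPtrace's actual algorithm; the record layout (timestamp, source, source port, destination, destination port, payload bytes) and the name summarize_sessions are assumptions made for this example.

```python
def summarize_sessions(packets):
    """Group packet records into bidirectional sessions and total them up.

    Each record is (ts, src, sport, dst, dport, nbytes). Both directions of a
    conversation map to the same key because the two endpoints are sorted.
    """
    sessions = {}
    for ts, src, sport, dst, dport, nbytes in packets:
        key = tuple(sorted([(src, sport), (dst, dport)]))
        entry = sessions.setdefault(
            key, {"first": ts, "last": ts, "packets": 0, "bytes": 0}
        )
        entry["first"] = min(entry["first"], ts)
        entry["last"] = max(entry["last"], ts)
        entry["packets"] += 1
        entry["bytes"] += nbytes
    return {
        key: {
            "elapsed": entry["last"] - entry["first"],
            "packets": entry["packets"],
            "bytes": entry["bytes"],
        }
        for key, entry in sessions.items()
    }
```

Sorting the endpoints is a deliberate simplification: it ignores who initiated the connection, which a real analysis would usually want to keep.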

Meaningful Data Extraction

Gluing or splitting pcap files can be done with another open source program, TCPslice, which extracts packets based on their timestamps. A statistical summary of a tcpdump file can be obtained with TCPdstat. Its output contains, among other properties, the unique source and destination address pairs, the number of packets and a breakdown of protocols. This could, for example, quickly uncover an unusually large volume of ICMP packets, which can be a sign of a DoS or DDoS attack. Chaosreader is a very comprehensive analysis program that supports a variety of protocols, including wireless protocols, which can help you understand the information at hand. It analyzes SSH traffic and creates keystroke delay data files. Honeysnap, hosted on the Honeynet website, provides analysis reports that identify significant events in recorded data files. Unfortunately, it only supports a short list of protocols.
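
The two ideas in this paragraph, cutting a capture down to a time window and then looking at the protocol mix, can be sketched in a few lines. This is an illustration only: the records are assumed to be (timestamp, IP protocol number) pairs already extracted from a capture, the function names are hypothetical, and the 50 percent ICMP threshold is an arbitrary value chosen for the example rather than a recommended setting.

```python
from collections import Counter

# IANA-assigned IP protocol numbers for the three most common protocols
PROTO_NAMES = {1: "icmp", 6: "tcp", 17: "udp"}


def slice_by_time(records, start, end):
    """Keep records whose timestamp falls in the half-open window [start, end)."""
    return [rec for rec in records if start <= rec[0] < end]


def protocol_breakdown(records):
    """Return {protocol name: (packet count, share of all packets)}."""
    counts = Counter(
        PROTO_NAMES.get(proto, f"other({proto})") for _ts, proto in records
    )
    total = sum(counts.values())
    return {name: (n, n / total) for name, n in counts.items()}


def icmp_heavy(records, threshold=0.5):
    """Flag a window in which ICMP exceeds the given share of packets."""
    share = protocol_breakdown(records).get("icmp", (0, 0.0))[1]
    return share > threshold
```

An unusually high ICMP share in one window is only a hint; it still has to be compared against what is normal for the monitored network.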

When it comes to encrypted traffic, SSLdump can analyze SSL/TLS sessions in recorded data and decrypt them if the server’s private key is provided. NetworkMiner is one of the few tools that run on the Windows platform (Wireshark and WinDump, the Windows port of tcpdump, run as well); from a captured data set it can separate graphical content, transferred files, and transmitted usernames and passwords, and provide a view of the different OSI layers. We’re also seeing more of an effort by commercial vendors to provide complimentary tools, such as NetWitness Investigator.

Email, Webmail and VOIP

If forensic investigators choose to focus on email and webmail activity reconstruction, DataEcho is a good open source choice. Both EtherPEG and Driftnet can be used for the simple recording and extracting of graphical content from unencrypted network traffic. Driftnet also has a component in development that picks out MPEG audio streams and tries to play them. With open source package tcpxtract, files can be carved from pcap files as they are carved from unallocated hard disk space based on their header and footer.
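
Header and footer carving is simple enough to show in miniature. The sketch below looks for JPEG start and end markers in a byte stream and returns everything in between. It assumes the input is an already reassembled payload stream rather than a raw pcap file, it is not tcpxtract's implementation, and the carve function and its max_size limit are hypothetical.

```python
JPEG_HEADER = b"\xff\xd8\xff"  # JPEG start-of-image marker plus next marker byte
JPEG_FOOTER = b"\xff\xd9"      # JPEG end-of-image marker


def carve(stream, header=JPEG_HEADER, footer=JPEG_FOOTER, max_size=10 * 1024 * 1024):
    """Return every byte run that starts with header and ends with footer."""
    found = []
    pos = 0
    while True:
        start = stream.find(header, pos)
        if start == -1:
            break
        end = stream.find(footer, start + len(header))
        if end == -1:
            break
        end += len(footer)
        if end - start <= max_size:
            found.append(stream[start:end])
            pos = end
        else:
            pos = start + 1  # candidate too large, resume just past this header
    return found
```

Like carving from unallocated disk space, this produces false positives and truncated files, because the footer bytes can also occur inside the image data.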

One comprehensive package is Xplico, which can extract email (POP, IMAP and SMTP protocols), all HTTP contents, each VOIP call (SIP), FTP, TFTP and so on.

Many investigations focus on patterns in network traffic. Ngrep allows investigators to specify patterns in extended regular or hexadecimal expressions, which can be compared with captured data set payloads. Ngrep also utilizes filtering techniques in the same fashion as common packet capturing tools, such as tcpdump and windump. This can be very useful in the analysis of malicious traffic and collection of unencrypted authentication credentials.
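
The pattern-matching idea can be sketched with Python's re module working directly on payload bytes. This is a simplified stand-in for the behavior described above, not Ngrep itself: the payloads are assumed to be already extracted from the capture, and grep_payloads and its parameters are hypothetical names.

```python
import re


def grep_payloads(payloads, pattern=None, hex_pattern=None):
    """Return the indexes of payloads that match a regex or a hex byte string.

    pattern     -- a bytes regular expression, for example rb"^(USER|PASS) "
    hex_pattern -- a string of hex digits, for example "90909090"
    """
    if hex_pattern is not None:
        needle = bytes.fromhex(hex_pattern)
        return [i for i, data in enumerate(payloads) if needle in data]
    if pattern is None:
        raise ValueError("give either pattern or hex_pattern")
    rx = re.compile(pattern)
    return [i for i, data in enumerate(payloads) if rx.search(data)]
```

For instance, the regex rb"^(USER|PASS) " would pick out cleartext FTP or POP3 login commands, while a hex pattern is handy for binary signatures.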

VOIP is gaining popularity, and it is important to understand that voice data can be recorded among other network data, reconstructed and analyzed. VOIPong is a utility that detects all VOIP traffic in the session, and if the unencrypted stream is G.711 encoded, the data is converted to a WAV file and can be played. VOIPong supports all major protocols, including SIP, H.323, Cisco’s Skinny Client Protocol, RTP and RTCP. Another project, vomit, analyzes and decodes phone conversations from Cisco phone traffic recorded in tcpdump files. Reconstruction and analysis of VOIP traffic can also be done in Wireshark or its text variant Tshark.
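
To show what converting a G.711 stream to a playable file involves, the sketch below decodes mu-law bytes to 16-bit PCM and writes them with Python's wave module. It assumes the mu-law variant of G.711 (the A-law variant uses a different table), a single 8 kHz channel, and payload bytes already extracted from the RTP packets in order. The function names are hypothetical and this is not VOIPong's code.

```python
import struct
import wave


def ulaw_to_pcm16(byte):
    """Decode one G.711 mu-law byte to a signed 16-bit linear sample."""
    u = ~byte & 0xFF                 # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def ulaw_payload_to_wav(payload, out, rate=8000):
    """Write mu-law payload bytes as a mono 16-bit WAV to a path or file object."""
    samples = struct.pack(
        f"<{len(payload)}h", *(ulaw_to_pcm16(b) for b in payload)
    )
    with wave.open(out, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)          # 2 bytes per sample
        wav.setframerate(rate)
        wav.writeframes(samples)
```

A real reconstruction also has to reorder packets by RTP sequence number and deal with gaps, which this sketch leaves out.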

UCSniff combines VOIP and video reconstruction features and also supports many VOIP voice compression codecs.

Hardware Prerequisites

Hardware requirements for recording platforms vary by the type of connection and by what you want to record, how long you want to record it and in what format. If the networking switch, network card, operating system or hard disk drives are unable to keep up with the recorded data, packets are dropped. Yet these operating system kernel, network card and switch packet losses can be documented. In a typical setup, the recorder is attached to a Switch Port Analyzer (SPAN) port, a feature sometimes called port mirroring or port monitoring, which is a port on the switch that mirrors all the traffic. Traffic is often mirrored on a monitoring SPAN VLAN, with the monitoring port a member of that VLAN. Physical network Test Access Ports (TAPs) are quite common as well, and they usually deliver the received and transmitted halves of full-duplex traffic separately. A recorder can put them back together by creating a virtual interface that combines received and transmitted packets. A common technique is called channel bonding, which merges two connections into one aggregate interface that sees both halves of the traffic at the same time.
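
When the two halves of a TAP have been recorded separately instead of through a bonded interface, they can also be recombined afterwards by timestamp. The sketch below shows that offline merge only; it is not a channel-bonding configuration. It assumes each half is a list of (timestamp, frame) records already in time order with synchronized clocks, and merge_tap_halves is a hypothetical name.

```python
import heapq


def merge_tap_halves(rx, tx):
    """Interleave two time-ordered record lists into one time-ordered list.

    Records are (timestamp, frame) tuples. On equal timestamps the record
    from rx comes first, because heapq.merge is stable across its inputs.
    """
    return list(heapq.merge(rx, tx, key=lambda rec: rec[0]))
```

If the two capture interfaces do not share a clock, the merged order of closely spaced packets cannot be trusted, which is one reason to prefer combining the halves at capture time.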

Recorded data sets can be quite large, depending on the monitored connection. In the case of a T1 connection, the maximum line rate is 1.544 Mb/s. Multiplied by 60 seconds per minute, 60 minutes per hour and 24 hours per day, that gives around 133 gigabits of data, or almost 17 gigabytes, in one day’s time. A variety of storage options are available, such as IDE RAID, SATA or SCSI RAID, yet the most preferable solution is an inexpensive SATA II RAID 5, though RAID 10 is being used more and more often. If we mirror ports on the switch, total throughput is basically just an aggregation of the traffic volume on all ports. Depending on average utilization, it is easy to calculate how much space is needed. It’s important not to overflow the mirroring port with packets, which increases the chances of losing data. Monitoring of the mirroring port is vital.
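
The sizing arithmetic is easy to script. The sketch below takes the T1 line rate as 1.544 Mb/s and uses decimal units (1 GB = 10^9 bytes); it ignores pcap per-packet overhead and RAID overhead, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 60 * 60 * 24  # 86,400


def daily_capture_bytes(link_mbps, utilization=1.0):
    """Bytes recorded per day on a link of link_mbps megabits per second."""
    bits_per_day = link_mbps * 1_000_000 * SECONDS_PER_DAY * utilization
    return bits_per_day / 8


def days_of_retention(capacity_bytes, link_mbps, utilization=1.0):
    """How many days of traffic fit into capacity_bytes of storage."""
    return capacity_bytes / daily_capture_bytes(link_mbps, utilization)
```

For a fully utilized T1, daily_capture_bytes(1.544) comes to roughly 16.7 GB per day, so a 2 TB array would hold about 120 days of traffic; at 25 percent average utilization the same array would last about four times as long.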

Based on tests by Sandstorm, FreeBSD is the best platform from a performance point of view, and the Windows NT-based system is the worst. Linux remains a very good choice and provides a lot of flexibility. More information related to Sandstorm testing can be found here or at the Niksun portal. Hardware configurations are also covered.


Full-content network monitoring is a powerful tool for analyzing events and their correlations. With other sources of evidence, the reconstruction of events and the extraction of binary files from recorded traffic can be invaluable in the event of a computer attack. Currently available software and hardware configurations provide affordable options ready for the task. Yet even today, most system administrators haven’t clued in to the benefit of deploying these technologies and continue to underestimate, or simply overlook, how they operate—thereby undermining their deployment benefits.

Ondrej Krehel serves as chief information security officer for Identity Theft 911, the nation’s premier identity theft and data breach management, resolution and education service. With more than a decade of experience in cyber security and computer forensics, he has launched investigations internationally and domestically into a broad range of IT security matters—from hacker attacks to data breaches to intellectual property theft. His work has received attention from CNN, Reuters, The Wall Street Journal and The New York Times.

