Collecting and Processing Bloomberg Data

A few years ago, Bloomberg data was relatively unusual in a collection; today we see Bloomberg chat and email data being collected quite frequently. That is not a surprise, considering some of the recent headlines relating to certain banks and financial institutions.

Below are some examples of the tips, tricks and considerations involved in working with Bloomberg data.

So, what is Bloomberg?

In simple terms, Bloomberg has integrated a “Bloomberg Community” communication system whereby licensed users can instant message and send email (in each instance potentially involving both text and attachments) to fellow Bloomberg users.

Typically the system is used outside of the ordinary email environment, as users can communicate prices and trade information and then extract the pricing information from their messages into a spreadsheet or other analysis format.

 

What is being collected?

 

For the purposes of a collection exercise and disclosure, the most common types of data are Bloomberg Instant Messages and Corporate email.

 

In addition, mobile SMS, Facebook and LinkedIn communications can also be exported.

 

Data Export Formats

 

Bloomberg data can be exported by either the client directly from Bloomberg (assuming they have the ability and permissions to do so and this approach is deemed to be reasonable considering the nature and issues in your dispute) or by Bloomberg on behalf of your client.

 

The way in which data has been exported from Bloomberg will very much determine how we handle the data and how best to present the results for review.

 

The most common output formats for Bloomberg data are TXT and XML. It is common for clients to provide exports of Chat or Email files from Bloomberg in text file format (because XML exports often have to be performed by Bloomberg in the United States).

 

Bloomberg TXT files present certain issues (all of which can be overcome given the right knowledge and experience at the firm processing the data), so XML exports are preferred where possible.

 

Issues to consider when handling Bloomberg TXT exports:

 

  • Dates – the date of a chat or email in TXT export format is fixed and does not have a time-zone offset. As such, you need to know the time zone of the export so as to “normalise” dates to the same time zone as the other data collected. This can be important if, say, your client is in Hong Kong and you are collecting data both from servers in Hong Kong and out of Bloomberg: you will want to ensure your dates are “normalised” so that they can be comparably sorted, searched and filtered within a wider data set.
  • Attachments – TXT exports may not export the attachments to chat or email. Check the client export format to ensure that attachments are being captured and not missed.
  • Attachments, if exported correctly, will normally have a UTC offset metadata field that allows for automatic “normalisation” of dates. This presents a processing challenge, however: the emails need to be formatted and handled separately from their attachments to ensure the metadata date fields delivered are accurate.
  • Bloomberg TXT emails will not de-dupe against Bloomberg XML emails unless a custom MD5 hash (a unique identifier) is built for each file.
  • Where possible, ensure the export format is consistent – mixed export formats complicate the de-duplication process.
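
To illustrate the date point above, here is a minimal Python sketch of "normalising" a timestamp that carries no offset. It assumes the export's time zone is already known and that the timestamp has already been pulled out of the TXT file as a string; the function name, the timestamp format and the sample values are hypothetical, not Bloomberg's documented layout.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+; needs time zone data on the system


def normalise_to_utc(raw: str, export_tz: str,
                     fmt: str = "%m/%d/%Y %H:%M:%S") -> datetime:
    """Attach the export's known time zone to a naive timestamp, then convert to UTC.

    `fmt` is an assumed layout; adjust it to match the export actually received.
    """
    naive = datetime.strptime(raw, fmt)                 # no offset in the source
    local = naive.replace(tzinfo=ZoneInfo(export_tz))   # state the zone explicitly
    return local.astimezone(timezone.utc)


# Hypothetical values: a TXT export known to be in New York time,
# compared with server data held in Hong Kong time.
bloomberg = normalise_to_utc("03/14/2014 09:30:00", "America/New_York")
server = normalise_to_utc("03/14/2014 21:30:00", "Asia/Hong_Kong")

print(bloomberg.isoformat())  # 2014-03-14T13:30:00+00:00
print(bloomberg == server)    # True
```

Once both sources are expressed in UTC (or any single agreed zone), the two records sort and filter as the same moment, even though their local clock times differ by twelve hours.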
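
The de-duplication point can be sketched in a similar way. The idea below is one possible approach, not a description of any particular processing tool: build the MD5 from normalised message fields rather than from the file itself, so that the same email yields the same value whether it arrived as TXT or XML. The field choice, the normalisation rules and the sample addresses are all assumptions for illustration.

```python
import hashlib
from datetime import datetime, timezone


def dedupe_key(sender: str, recipients: list[str], sent: datetime,
               subject: str, body: str) -> str:
    """Return an MD5 hex digest over normalised fields (an assumed scheme).

    `sent` must be timezone-aware so it can be expressed in UTC.
    """
    parts = [
        sender.strip().lower(),
        ";".join(sorted(r.strip().lower() for r in recipients)),  # order-independent
        sent.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        " ".join(subject.split()),  # collapse runs of whitespace
        " ".join(body.split()),     # ignore line-ending differences
    ]
    # Join with a separator that is unlikely to appear inside a field.
    return hashlib.md5("\x1f".join(parts).encode("utf-8")).hexdigest()


# Hypothetical records for the same email, as parsed from each export format.
sent = datetime(2014, 3, 14, 13, 30, tzinfo=timezone.utc)
key_txt = dedupe_key("EKETTLETON@example.com ", ["b@example.com", "a@example.com"],
                     sent, "Pricing  update", "Hi,\r\nsee attached.")
key_xml = dedupe_key("ekettleton@example.com", ["a@example.com", "b@example.com"],
                     sent, "Pricing update", "Hi, see attached.")

print(key_txt == key_xml)  # True
```

Whatever fields are chosen, the same scheme has to be applied to every export in the matter; that is one reason a consistent export format makes de-duplication simpler.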

 

Bloomberg Chat comes with a few additional issues. Be aware that Chat exports provided in TXT format produce more “user friendly” results than XML exports.

 

Chat exports in TXT format will deliver a “conversation” as one document:

 

  • Doc 1 – Emma Kettleton entered the chat session, Emma Kettleton says Hi, Emma Kettleton leaves the chat etc

 

Chat exports provided in XML format will split each line of the conversation into its own document:

 

  • Doc 1 – Emma Kettleton entered the chat session
  • Doc 2 – Emma Kettleton says Hi
  • Doc 3 – Emma Kettleton leaves the chat etc
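
If reviewers would rather read XML chat data by conversation, the per-line documents can in principle be regrouped after parsing. The Python sketch below assumes each chat entry has already been parsed into a record with a conversation identifier, a UTC timestamp, a user and the text; those field names and the sample values are hypothetical and do not reflect Bloomberg's actual XML schema.

```python
from collections import defaultdict


def group_by_conversation(events: list[dict]) -> dict[str, str]:
    """Collect per-line chat entries into one ordered transcript per conversation."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["conversation_id"]].append(event)

    transcripts = {}
    for conv_id, items in grouped.items():
        # ISO 8601 UTC strings of the same layout sort correctly as plain text.
        ordered = sorted(items, key=lambda e: e["timestamp"])
        transcripts[conv_id] = "\n".join(
            f'{e["timestamp"]} {e["user"]}: {e["text"]}' for e in ordered
        )
    return transcripts


# Hypothetical parsed entries, deliberately out of order.
events = [
    {"conversation_id": "C1", "timestamp": "2014-03-14T13:30:05Z",
     "user": "Emma Kettleton", "text": "Hi"},
    {"conversation_id": "C1", "timestamp": "2014-03-14T13:30:00Z",
     "user": "Emma Kettleton", "text": "entered the chat session"},
    {"conversation_id": "C1", "timestamp": "2014-03-14T13:31:00Z",
     "user": "Emma Kettleton", "text": "left the chat"},
]

transcripts = group_by_conversation(events)
print(transcripts["C1"])
```

This yields one review document per conversation, much like the TXT layout, while the original per-line records remain available if entry-level review is needed.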

 

Data Volume and Document Count

 

We have seen exceptionally large document counts resulting from Bloomberg data collections. For example, 0.5GB returned over 110,000 documents after de-duplication, and in another case 10GB returned over 800,000 documents. Bloomberg allows for pre-filtering of emails by date or other means, so, if possible, it is helpful for the data to be filtered by date before export to reduce the volume of material.

 

In summary

 

  • Consider what type of Bloomberg data your client has – Email and/or Chat
  • Consider the format they are exporting in, TXT or XML – there are pros and cons to each
  • Consider how you want to see the chat conversations (by chat entry or by conversation)
  • Ensure attachments are collected
  • This type of data is easily mishandled, resulting in wasted review time if attachments have to be re-collected or date fields are delivered incorrectly
  • Consult early, so you can be guided throughout the process.
