20 September 2007
Once upon a time documents were in a paper file in someone’s filing cabinet, and limited primarily to letters, faxes and meeting notes.
But now they are more likely to be in a number of electronic locations – mobile phones, MP3 players, BlackBerrys, laptops, thumb drives, servers, back-up tapes – and in significantly larger numbers. Indeed, estimates suggest that 100 billion emails are created every day.
A recent high-profile probe by the Competition Commission called for emails to be provided to it by Asda and Tesco relating to their relationships with suppliers during a particular period. Asda pointed out that around 11 million emails were sent to its suppliers in the period in question. Which technology should the supermarkets and their lawyers use to help cope with this volume of material?
Traditionally, disclosure lawyers have been thought to be able to make up to 250 document decisions per day. On that basis, 11 million emails would take 44,000 lawyer days to review. With a team of 10 junior lawyers or paralegals, that would equate to 4,400 days, or around 12 years (working weekends and taking no holiday). Surely, therefore, the Commission’s request is unreasonable?
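The arithmetic behind that estimate can be checked with a short sketch. The function name and parameters below are illustrative only; the figures (11 million emails, 250 decisions per lawyer per day, a team of 10) are the ones quoted above, and the calculation assumes every day of the year is worked.

```python
import math


def review_days(documents, decisions_per_lawyer_day, reviewers):
    """Estimate linear review effort (hypothetical helper, not a standard tool).

    Returns (lawyer_days, calendar_days, years), assuming every reviewer
    works every day of the year with no weekends or holidays.
    """
    lawyer_days = math.ceil(documents / decisions_per_lawyer_day)
    calendar_days = math.ceil(lawyer_days / reviewers)
    years = calendar_days / 365
    return lawyer_days, calendar_days, years


lawyer_days, calendar_days, years = review_days(11_000_000, 250, 10)
print(lawyer_days, calendar_days, round(years, 1))  # 44000 4400 12.1
```

Changing any one input (fewer documents after culling, or a higher decision rate per reviewer) shows how quickly the estimate moves.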
To put this in context, a typical disclosure case might start with the collection of a terabyte of data. That would equate to around 60 million emails (or a stack of paper around 16 times the height of the Empire State Building).
Yet lawyers may not need to look at every document in every case. They may want to look for a particular category of document first.
There are three key stages of the process where the approach taken and the tools used can have a considerable impact on how efficiently the lawyers are able to handle the data.
Collection – digital evidence recovery
Rather than rushing to image every laptop and server, it may be possible to isolate a limited shared server area and extract from that, or to use tools to search through back-up tapes before extracting from them.
The aim here is for the lawyers’ technical teams to minimise the quantity of irrelevant data that is collected, which can in turn vastly reduce the processing and lawyer-review time.
Processing – preparation and culling
Typically a data-collection exercise would involve more than just emails. System and operating files, for example, are likely to make up a significant percentage of that collection. Once these files have been stripped out, there may be a much more manageable number of reviewable email files.
In most cases, some basic culling of the data can be carried out to reduce its size. This might include date-range or keyword searching, a focus on key individuals’ and companies’ emails, and removing duplicate emails (de-duplication). There are various tools that will perform these tasks.
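As a rough illustration of how these culling steps fit together, the sketch below applies a date range, a list of key individuals, a keyword list and exact de-duplication in a single pass. It is a simplification under stated assumptions: the `Email` structure, the `cull` function and the choice to require all filters at once are hypothetical, and commercial processing tools will work differently in detail.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Email:
    # Hypothetical minimal record; real collections carry far more metadata.
    sender: str
    recipients: tuple
    sent: date
    subject: str
    body: str


def fingerprint(email):
    """Hash the fields that define an exact duplicate (an assumption)."""
    h = hashlib.sha256()
    parts = (email.sender, ",".join(sorted(email.recipients)),
             email.sent.isoformat(), email.subject, email.body)
    for part in parts:
        h.update(part.encode("utf-8"))
        h.update(b"\x00")  # separator so adjacent fields cannot run together
    return h.hexdigest()


def cull(emails, start, end, key_people, keywords):
    """Keep emails that pass every filter, dropping exact duplicates."""
    wanted = {p.lower() for p in key_people}
    terms = [k.lower() for k in keywords]
    seen, kept = set(), []
    for e in emails:
        if not (start <= e.sent <= end):
            continue  # outside the date range
        parties = {e.sender.lower(), *(r.lower() for r in e.recipients)}
        if not parties & wanted:
            continue  # no key individual involved
        text = f"{e.subject}\n{e.body}".lower()
        if not any(t in text for t in terms):
            continue  # no keyword hit
        fp = fingerprint(e)
        if fp in seen:
            continue  # exact duplicate of an email already kept
        seen.add(fp)
        kept.append(e)
    return kept


buyer, sales = "buyer@retailer.example", "sales@supplier.example"
emails = [
    Email(buyer, (sales,), date(2006, 3, 1), "Pricing terms", "Revised pricing attached."),
    Email(buyer, (sales,), date(2006, 3, 1), "Pricing terms", "Revised pricing attached."),
    Email(buyer, (sales,), date(2004, 1, 5), "Pricing terms", "Old pricing."),
    Email("hr@retailer.example", ("all@retailer.example",), date(2006, 3, 2), "Canteen", "Pricing of lunches"),
    Email(buyer, (sales,), date(2006, 4, 1), "Lunch?", "Free on Friday?"),
]
kept = cull(emails, date(2006, 1, 1), date(2006, 12, 31), [buyer], ["pricing"])
print(len(kept))  # 1
```

In practice the order and combination of filters would be agreed case by case; the point is only that each step is mechanical and removes material before any lawyer has to look at it.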
Lawyers’ technical teams faced with a large quantity of emails should also consider carrying out a ‘near’ de-duplication exercise. Whereas traditional de-duplication will only remove exact duplicates, near de-duplication will remove redundant emails within a chain, so that the reviewer only sees the final email and the chain of emails behind it.
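One simple way to picture near de-duplication is to suppress any email whose text is wholly contained in another email, typically a later reply that quotes it. The sketch below does this with plain substring matching after stripping ‘>’ quote markers. It is an assumption-laden illustration, not how any particular product works: the function names are hypothetical, the pairwise comparison would be far too slow for millions of emails, and real tools use more robust similarity measures.

```python
import re


def normalise(body):
    """Strip leading '>' quote markers, collapse whitespace, lower-case."""
    lines = [re.sub(r"^\s*(>\s*)+", "", line) for line in body.splitlines()]
    return " ".join(" ".join(lines).split()).lower()


def suppress_contained(bodies):
    """Return indexes of emails NOT wholly contained in another email.

    An email is dropped if its normalised text appears inside a longer
    email, or inside an identical email that occurs earlier in the list.
    """
    norm = [normalise(b) for b in bodies]
    keep = []
    for i, text in enumerate(norm):
        contained = any(
            j != i and text in other and (len(other) > len(text) or j < i)
            for j, other in enumerate(norm)
        )
        if not contained:
            keep.append(i)
    return keep


first = "Can you confirm the delivery date?"
reply = "Yes, Tuesday.\n> Can you confirm the delivery date?"
final = "Thanks, agreed.\n> Yes, Tuesday.\n> > Can you confirm the delivery date?"
unrelated = "Lunch on Friday?"

print(suppress_contained([first, reply, final, unrelated]))  # [2, 3]
```

Here the reviewer is left with the final email in the chain (which still contains the earlier messages as quoted text) and the unrelated email, rather than all four.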
Starting with 11 million emails might seem daunting, but it could relatively quickly be culled to around 2 million for review.
Review
Many lawyers and their technical teams will be familiar with traditional disclosure management tools such as Introspect, Concordance, Ringtail and Discovery Radar. In general these are ‘linear’ review tools offering secure online access, keyword searching, tagging, redaction and production capabilities.
However, other types of tools need to be considered, particularly when faced with large quantities of data for review. The review of 2 million emails, for example, might benefit from the use of a concept-mapping tool (such as Attenex), with clustering and visualisation capabilities that have been shown to provide considerable increases in review productivity.
The decision on which review tool should be used may also be affected by whether foreign languages are involved (Attenex and Aungate offer language capabilities; some others do not).
Alternatively, where the lawyers are very clear on what they are looking for, they might run some keyword searches (there are various tools that do this) and have the results provided as a load file, which they can then simply search through in Outlook, for example.
Using these techniques, lawyers could hope to cut the review time for a large exercise such as this from years to months, or perhaps even less.
In a regulatory situation, lawyers might consider entering into a dialogue with the regulator to agree the appropriate technical methodology. In a litigious situation that dialogue may be with both the court and the other side.
Alex Dunstan-Lee, forensic legal specialist, KPMG Forensic