Nuix and EDRM republish cleansed Enron data set
22 May 2013
LONDON, UK - May 15th, 2013 - Nuix, a worldwide provider of information management technologies, and EDRM, the leading standards organisation for the eDiscovery and information governance market, have today republished the EDRM Enron PST Data Set after cleansing it of private, health and personal financial information. Nuix and EDRM have also published the methodology Nuix’s staff used to identify and remove more than 10,000 high-risk items at nuix.com/enron.
The EDRM Enron data set is an industry-standard collection of email data that the legal profession has used for many years for electronic discovery training and testing. It was sourced from the Federal Energy Regulatory Commission’s investigation into collapsed energy firm Enron. In early 2012, the EDRM Enron PST Data Set and the EDRM Enron Data Set v2 became an Amazon Web Services Public Data Set, making them a valuable public resource for researchers across a variety of disciplines.
“Recently, we have been working closely with Nuix to cleanse the data set of private information about the company’s former employees and make the cleansed data set readily available to the community,” said George Socha and Tom Gelbmann, co-founders of EDRM. “These efforts help to protect the privacy of hundreds of individuals and we encourage anyone who finds private data that we did not remove to notify us.”
Using a series of investigative workflows on the EDRM Enron PST Data Set, Nuix consultants Matthew Westwood-Hill and Ady Cassidy identified more than 10,000 items including:
· 60 items containing credit card numbers, including departmental contact lists that each contained hundreds of individual credit cards
· 572 containing Social Security or other national identity numbers, covering thousands of individuals’ identity numbers in total
· 292 containing individuals’ dates of birth
· 532 containing information of a highly personal nature such as medical or legal matters.
Many items contained multiple instances and types of information. This included departmental contact list spreadsheets with dates of birth, credit card numbers, Social Security numbers, home addresses and other private details of dozens of staff members.
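As a rough illustration of the kind of pattern-based sweep that can surface such items, the Python sketch below flags candidate credit card numbers (checked with the Luhn checksum) and US-style Social Security numbers in a block of text. It is not the methodology Nuix published: the function names, regular expressions and thresholds are assumptions made for this example, and a real sweep would need broader patterns and manual review.

```python
import re

# Hypothetical patterns for illustration only; not the published Nuix methodology.
# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")
# US Social Security number written as NNN-NN-NNNN.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if not 13 <= len(digits) <= 16:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_high_risk(text: str) -> dict:
    """Collect candidate card numbers and SSNs found in `text`."""
    cards = [m.group() for m in CARD_RE.finditer(text) if luhn_valid(m.group())]
    ssns = SSN_RE.findall(text)
    return {"credit_card": cards, "ssn": ssns}
```

Calling `find_high_risk` on the extracted text of each email or attachment would give a per-item list of hits. The Luhn check filters out many digit runs that merely look like card numbers, such as order references, but anything flagged this way is only a candidate and still needs human review.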
The investigative team also clearly demonstrated that these items did not stay within the Enron firewall. For example, some staff emailed “convenience copies” of documents containing private data to their personal addresses.
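One simple way to check whether a message left the organisation is to compare its recipients' domains against a list of internal domains. The Python sketch below does this with the standard library's email parser. The `INTERNAL_DOMAINS` set and the `external_recipients` function are hypothetical names chosen for this example, and the approach is an assumption rather than the workflow Nuix used.

```python
from email import message_from_string
from email.utils import getaddresses

# Assumed list of internal domains; adjust for the organisation being reviewed.
INTERNAL_DOMAINS = {"enron.com"}


def external_recipients(raw_message: str, internal_domains=INTERNAL_DOMAINS) -> list:
    """Return recipient addresses whose domain is not in `internal_domains`."""
    msg = message_from_string(raw_message)
    # Gather every To, Cc and Bcc header value present on the message.
    header_values = []
    for name in ("To", "Cc", "Bcc"):
        header_values.extend(msg.get_all(name, []))
    external = []
    for _display_name, address in getaddresses(header_values):
        domain = address.rpartition("@")[2].lower()
        if address and domain not in internal_domains:
            external.append(address)
    return external
```

This comparison is an exact domain match, so subdomains would need to be added to the set explicitly. Combining the result with a content scan would show which items containing private data were sent to outside addresses.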
“Nuix and our partners have conducted sweeps for private and credit card data for dozens of corporate customers and we are yet to encounter a data set that did not include some inappropriately stored personal, financial or health information,” said Eddie Sheehy, CEO of Nuix. “The increasing burden of privacy and data breach regulations, combined with the strict requirements of credit card companies, make this an unacceptable business risk.”
“Using the methodology we are publishing alongside the cleansed EDRM Enron data, organisations can identify private and financial data, find out if it has been emailed outside the firewall and take immediate steps to remediate the risks involved.”
Nuix is currently applying the same methodology to the EDRM Enron Data Set v2, which it will also republish at nuix.com/enron.
Nuix will host a Twitter chat to discuss the release of the cleansed EDRM Enron PST Data Set on Thursday, May 23rd at 7:00pm BST. Nuix experts will describe the process of identifying unsecured financial, health and personally identifiable information in corporate data. Follow the hashtag #NuixChat and send in your questions beforehand to @nuix.
Nuix (www.nuix.com) is a worldwide provider of information management technologies, including eDiscovery, electronic investigation and information governance software. Nuix customers include the world’s leading advisory firms, litigation support providers, enterprises, government departments, law enforcement agencies, and all of the world’s major corporate regulatory bodies.
EDRM (www.edrm.net) creates practical resources to improve eDiscovery and Information Governance. Launched in May 2005, EDRM was created to address the lack of standards and guidelines in the eDiscovery market. EDRM published the Electronic Discovery Reference Model in January 2006, followed by additional resources such as IGRM, CARRM and the Talent Task Matrix. Since its launch, EDRM has comprised more than 260 organisations, including 170 service and software providers, 63 law firms, three industry groups and 23 corporations involved with eDiscovery.
Mulberry Marketing Communications
+44 (0)20 7928 7676