Experts are fond of citing the four Vs when describing Big Data: volume, velocity, veracity and variety. These terms have come to describe the explosive growth of electronically stored information (ESI) in recent years, and the statistics that back them up are simply staggering. According to IBM, more than 90% of all the digital data in the world was created in the last two years. While the four Vs help to quantify Big Data, their implications may best be captured by two additional Vs: value and vulnerability.
Big Data was the focus of a recent Exterro webcast, “Big Data Converging with Legal, Information Governance and Regulatory Requirements.” E-Discovery and information governance experts Adam Cohen, Mark Melodia and Adam Wells examined some of the drivers of Big Data and explained how organizations can mitigate the many risks that surround it.
Understanding the “Value” of Big Data
We have largely come to accept that we live in a world of data and digital information. The average person creates multiple data trails every day through emails, text messages and the ever-ubiquitous online forms that must be filled out for routine activities like online banking and e-commerce. As webcast speaker Adam Cohen, a principal for the Fraud Investigations and Dispute Services division of Ernst & Young, explained, advances in analytics technology are transforming how information can be extracted from that constant stream of data. “The point is to uncover useful information, hidden patterns, unknown correlations, and this enables some meaningful analysis that may be unavailable by conventional business intelligence means,” he explained. These technologies and new methods for extracting information, he added, “are based on the realization that time to information is critical if you are going to extract value from all of these data sources.”
As the E-Discovery Beat addressed in our reporting from LegalTech New York, social media sites like Facebook offer a glimpse into the power of Big Data analytics. Facebook has hundreds of millions of users, many of whom pump data into the system on a daily (in some cases hourly) basis. But when users log in to Facebook, they aren't exactly met with data chaos. Rather, information is presented to them in a very structured way, based on preferences, location, timelines, connections and a host of other factors. These elements give meaning to Big Data and help explain why social media sites are considered optimal vehicles for online advertising: content can be individualized to each specific user.
Adam Wells, vice president of business development and eDiscovery services at TERIS, relayed the example of how the retailer Target developed a sophisticated, and controversial, system for tracking purchasing patterns and customer behavior to create highly tailored marketing campaigns. According to Wells, the strategy was so effective that Target correctly pegged one 16-year-old customer as pregnant before even her parents found out.
As the Target and Facebook examples show, organizations that find ways to better harness data will only want to collect more of it. That's where the other two Vs come in: more data brings into play a number of information governance and e-discovery considerations.
Recognizing and Addressing Big Data “Vulnerabilities”
As described above, through advanced analytics, Big Data and the information it communicates are more of an asset to organizations than ever before. But those assets can quickly become liabilities. “There is more volume, more variety, things changing so fast, and that means that there is more risk because the opportunity for missteps has expanded dramatically and the technology is new,” said Cohen. Beyond security and privacy concerns, which were certainly raised by the Target example, traditional e-discovery and information governance processes and policies are being dramatically reshaped by Big Data. “We're going to have to rely more on technology because it's just too difficult to make decisions on individual pieces of information in isolation,” said Cohen, who cited the emergence of predictive coding and other predictive technologies as examples of the kind of tools that are becoming increasingly necessary to sift through large volumes of data.
Compounding those challenges, organizations today face a daunting regulatory environment. Mark Melodia, a partner at Reed Smith, joked that the regulatory environment is so dynamic that it's difficult to create a slide for a webcast that's not out of date by the time it's typed up. He cited recent pronouncements from the FTC and new requirements surrounding HIPAA and patient privacy in the healthcare industry (the topic of an upcoming Exterro webcast). Beyond U.S. regulations, large organizations operate on a global scale and must also consider the specific laws of individual countries or regions.
“When you go across borders it gets all the more complex… the kind of digitized information we're talking about is inherently not national in scope,” Melodia said. “Our privacy laws, and to some extent our security laws, look at things as if national borders matter to data.” He added that the increasing popularity of the cloud further complicates cross-border issues, since it's not always clear what specific laws apply to cloud-based data.
E-Discovery Best Practices for Dealing with the Big Data Challenge
At the conclusion of the webcast, the speakers provided some best practices for handling information governance and e-discovery in a Big Data environment. Their suggestions included:
- Automating the Legal Hold Process: One of the challenges of Big Data is making sure that all potentially relevant ESI is accounted for and preserved when litigation arises. E-Discovery requests today implicate too much data and too many custodians to be managed through emails and spreadsheets. Automated systems help track responses and facilitate follow-up reminders and escalations, eliminating the manual work that leads to errors and oversights.
- Utilizing Predictive Technologies: One of the central e-discovery challenges of Big Data is extracting relevant information from increasingly voluminous data sets. Any lawyer knows that relevant ESI must be found quickly, and manually sifting through millions of documents is both time-consuming and expensive. Predictive technologies offer tremendous promise, not only as a means of quickly identifying relevant information but also in an information governance context, where they can help proactively manage and categorize ESI at its source.
- Integrating Systems: Data handoffs between systems are the minefields of e-discovery. Extracting large volumes of data from one system and dumping it into another isn't only tedious; it's also very risky and can easily lead to lost or altered ESI. An integrated e-discovery platform that connects with common data sources and other e-discovery and information governance tools reduces the inherent risks of data transfer and provides centralized visibility into the process as a whole.
To watch the full webcast, “Big Data Converging with Legal, Information Governance and Regulatory Requirements,” click here.