The Future of Search in E-Discovery
Searching an enterprise for documents in response to an e-discovery request has devolved into something of an arms race, with a rapidly growing sprawl of electronically stored information (ESI) on one side and constantly evolving search and retrieval technologies on the other. Many would argue that the ESI explosion, whether “regular” data or “Big Data,” has outpaced developments on the technology front. Part of the problem lies in the rapid rate of innovation in enterprise infrastructure. The arrival of email and Microsoft Office documents in the legal system (seemingly way back in the dinosaur age) was a primary driver for the rise of e-discovery technology and service providers. More dynamic collaboration platforms, such as SharePoint and Lotus QuickPlace, followed, and they still present challenges today. And those may be mere appetizers compared to the ESI entrée that is mobile devices, social media and cloud-based sources, whose corporate adoption has been nothing short of viral. Given these waves of change, the question becomes: Has the race already been lost?
From a vendor's perspective, the answer is, of course, “NO.” Advancements in e-discovery search technologies and methodologies in the last year alone have been staggering, especially with respect to emerging predictive technologies. But addressing the growth of enterprise ESI takes more than technological innovation; it requires a fresh look at the entire search process.
Here are four trends in the e-discovery search arms race that hold promise for e-discovery practitioners:
1. Widespread Adoption of In-Place Search Capabilities
Most organizations still collect large volumes of ESI (often by imaging hard drives) before searching it for relevancy. This approach can be extremely costly given the sheer amount of ESI and the number of data sources that a typical e-discovery request now implicates. Moreover, the “collect everything” mentality delays and, in some cases, prevents the early identification of the truly relevant documents that will ultimately guide case strategy, greatly diminishing the value of what is known as early case assessment (ECA). Technology has advanced to support “in-place” ECA, which allows legal teams to search and analyze ESI where it already resides in the information ecosystem. Once the ESI is indexed enterprise-wide, e-discovery teams can quickly identify key documents, analyze them for relevancy and greatly reduce the number of documents that will ultimately need to be collected or reviewed.
2. Continued Maturity of Federated Search Technologies
Federated search is a hot topic right now in e-discovery. In the most basic sense, federated search is the process by which multiple heterogeneous data repositories can be searched simultaneously from a single query. Because vendors define federated search differently, the methodologies behind the technologies can vary greatly. A major challenge, as described in a recent blog post by eDJ analyst Greg Buckles, “occurs when this 'one search fits all' approach is accepted without understanding how a search criteria is interpreted by each different (repository) engine and applied to potentially different (index) fields.” However, Buckles notes that federated search strategies and the technologies that support them are improving and that current limitations shouldn't stop people from embracing their long-term promise.
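To make the idea concrete, here is a minimal sketch of a federated query in Python. Everything in it is hypothetical: the two in-memory repositories, their field names, and the search_email, search_files and federated_search helpers are assumptions made for illustration, not any vendor's API. Each adapter translates the same term into its own repository's fields, which is exactly where the interpretation differences Buckles describes can creep in.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Hit:
    source: str   # which repository produced the hit
    doc_id: str   # repository-specific identifier


# Hypothetical repositories; note that each one uses different field names.
EMAIL_STORE = [
    {"id": "e1", "subject": "Q3 pricing", "body": "Draft pricing agreement attached"},
    {"id": "e2", "subject": "Lunch", "body": "Tacos on Friday?"},
]
FILE_SHARE = [
    {"path": "/legal/agreement_v2.docx", "content": "Pricing agreement, second draft"},
    {"path": "/hr/handbook.pdf", "content": "Employee handbook"},
]


def search_email(term):
    """Email adapter: matches the term against subject and body."""
    term = term.lower()
    return [
        Hit("email", m["id"])
        for m in EMAIL_STORE
        if term in m["subject"].lower() or term in m["body"].lower()
    ]


def search_files(term):
    """File-share adapter: matches the term against content only, not the path."""
    term = term.lower()
    return [Hit("fileshare", f["path"]) for f in FILE_SHARE if term in f["content"].lower()]


ADAPTERS = [search_email, search_files]


def federated_search(term):
    """Send one query to every adapter concurrently and merge the hits."""
    with ThreadPoolExecutor() as pool:
        batches = list(pool.map(lambda adapter: adapter(term), ADAPTERS))
    return [hit for batch in batches for hit in batch]


if __name__ == "__main__":
    print(federated_search("pricing"))  # hits from both repositories
    print(federated_search("legal"))    # empty: "legal" appears only in a file path
```

The second query shows the pitfall in miniature: the file-share adapter in this sketch never looks at the path field, so a term that a reviewer would expect to match is silently missed. Documenting which fields each adapter searches is one way to keep a single query from meaning different things in different repositories.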
3. Application of Predictive Algorithms on Native Data Volumes
Predictive analytics, in which a software model predicts which documents are likely responsive to a document request, offers one of the most exciting advancements in search methodologies. The technology was introduced as predictive coding a few years ago, and the conversation has since shifted from whether the approach is reliable to how it can best be used not only during review but also much earlier in the process (a topic that dominated the discussion at Legal Tech earlier this year). Advancements in predictive technologies now enable legal teams to evaluate native data volumes before collection, effectively narrowing the ESI “funnel” and producing smaller, more precise document sets for attorney review. Fully utilized, predictive analytics can help legal teams make critical case decisions much more quickly and greatly reduce e-discovery costs.
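As a rough illustration of the idea, and not a description of any particular product, the sketch below trains a tiny naive Bayes model on a handful of attorney-coded seed documents and then ranks unreviewed documents by how likely they are to be responsive. The seed set, the document texts and the function names are all invented for the example; commercial tools use far richer features and much larger training sets.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(seed_set):
    """Count words per label from (text, is_responsive) pairs coded by reviewers."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_responsive in seed_set:
        doc_counts[is_responsive] += 1
        word_counts[is_responsive].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def responsiveness_score(model, text):
    """Log-odds that a document is responsive (naive Bayes, add-one smoothing)."""
    word_counts, doc_counts, vocab = model
    score = math.log(doc_counts[True]) - math.log(doc_counts[False])
    totals = {label: sum(word_counts[label].values()) for label in (True, False)}
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in the seed set carry no evidence
        p_resp = (word_counts[True][word] + 1) / (totals[True] + len(vocab))
        p_non = (word_counts[False][word] + 1) / (totals[False] + len(vocab))
        score += math.log(p_resp) - math.log(p_non)
    return score


def rank(model, documents):
    """Return document ids ordered from most to least likely responsive."""
    return sorted(documents, key=lambda d: responsiveness_score(model, documents[d]), reverse=True)


# Invented seed set and unreviewed documents, for illustration only.
SEED_SET = [
    ("pricing agreement with distributor", True),
    ("revised pricing terms for contract", True),
    ("lunch order for friday", False),
    ("holiday party schedule", False),
]
UNREVIEWED = {
    "d1": "draft pricing contract",
    "d2": "friday party lunch",
    "d3": "contract schedule",
}

if __name__ == "__main__":
    model = train(SEED_SET)
    print(rank(model, UNREVIEWED))
```

Ranking rather than hard classification is a deliberate choice here: it lets a legal team decide how far down the list to collect or review, which is how the “funnel” gets narrowed in practice.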
4. Greater Utilization of Statistical Sampling to Validate Results
Lawyers often joke that they went into law to avoid math, but advances in e-discovery mean that math has finally caught up with them. As search technologies advance and machines grow more indispensable, effective use of statistical sampling can help validate proffered search techniques. Testing a sample of machine-coded documents against human review can confirm whether the system has produced results at least as consistent as those gained through manual attorney review. Attorney Ralph Losey is among a growing number of e-discovery experts promoting the value of statistical sampling in e-discovery contexts. In a recent blog post, he predicted that the use of sampling by lawyers will increase rapidly over the next ten years. “Random sampling is too powerful a tool for the profession to ignore. It has been well proven as an indispensable tool of science and industry. It is probably time for law to also embrace this tool,” he wrote.
To learn more about emerging search methodologies and uses of predictive technologies, watch Exterro's recent webcast, Practical Predictive Intelligence for Proactive E-Discovery.
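For readers who want to see the arithmetic, here is a minimal sketch of one way such a validation could work: draw a simple random sample of machine-coded documents, have an attorney code the same documents, and report the agreement rate with a 95 percent confidence interval. The Wilson score interval used below is a standard choice for proportions, but the function names, the sample size and the coding data are assumptions made for this example, not a prescribed protocol.

```python
import math
import random


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% confidence)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


def sample_agreement(machine_calls, human_review, sample_size, seed=0):
    """Randomly sample documents and count how often human and machine coding agree.

    machine_calls: dict of doc_id -> bool (machine's responsive call)
    human_review:  function doc_id -> bool (attorney's call on a sampled document)
    """
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    sample_ids = rng.sample(sorted(machine_calls), sample_size)
    agreed = sum(machine_calls[d] == human_review(d) for d in sample_ids)
    return agreed, wilson_interval(agreed, sample_size)


if __name__ == "__main__":
    # Synthetic population: 10,000 documents, every fifth one coded responsive.
    machine = {f"doc{i}": i % 5 == 0 for i in range(10_000)}

    # Stand-in for attorney review: disagrees with the machine on every 20th document.
    def attorney(doc_id):
        flip = int(doc_id[3:]) % 20 == 0
        return machine[doc_id] != flip

    agreed, (low, high) = sample_agreement(machine, attorney, sample_size=400)
    print(f"agreement {agreed}/400, 95% interval {low:.3f} to {high:.3f}")
```

The interval matters as much as the point estimate: a sample of 100 documents with 90 agreements supports a claim of roughly 83 to 94 percent agreement, not “90 percent” flat, and a larger sample narrows that range.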