As data volumes grow and data becomes more complex, an integrated e-discovery environment, in which systems and data sources automatically sync and exchange information with e-discovery applications, has become even more critical for enterprises. This is true of unstructured and semi-structured data sources, such as email servers and content management systems, as well as structured data sources, like databases and data archives. Structured data exists in discrete pieces (typically called "fields") that are preserved in a large archive or database. All entries have a consistent format (or structure) and are organized into rows and columns. Contact lists and customer records are two common examples of structured data; fields like name, phone number, address, city, state, and zip code combine to form a single record in the larger database.
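To make the idea of fields and records concrete, here is a minimal Python sketch of the customer-record example. The class name `CustomerRecord`, the helper `find_by_state`, and the sample rows are all hypothetical and invented for illustration; a real structured source would be a database table or archive, not an in-memory list.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CustomerRecord:
    """One row of a hypothetical customer table; each attribute is a field."""
    name: str
    phone: str
    address: str
    city: str
    state: str
    zip_code: str


# Made-up sample rows. Every record has the same fields in the same order,
# which is what makes the data "structured".
records = [
    CustomerRecord("Ada Example", "555-0100", "1 Main St", "Springfield", "IL", "62701"),
    CustomerRecord("Bo Sample", "555-0101", "2 Oak Ave", "Portland", "OR", "97201"),
]

# The column names come straight from the record definition.
column_names = [f.name for f in fields(CustomerRecord)]


def find_by_state(rows, state):
    """Return the records whose state field matches exactly."""
    return [row for row in rows if row.state == state]
```

Because every record shares the same fields, a search can target a specific column rather than scanning free text, which is the main practical difference from collecting email or documents.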
Most e-discovery practitioners are used to working with unstructured data sources like email, file shares, or documents in SharePoint, but they are often unfamiliar with the technology and terminology of databases, extracts, report generation, and archives, which leaves them unsure about the best ways to preserve or collect from these sources. If the application is an old one, this uncertainty often becomes a mandate to keep everything just as it is, which translates to mothballed applications just sitting there in case data might be needed down the road. Beyond the costs of maintaining those systems, IT staff turnover makes it increasingly hard to generate the reports Legal and Compliance need from these old systems.
If you are keeping mothballed applications and databases around purely for reporting purposes, they are prime candidates for migration to a structured data archive. Cost savings from licenses, CPU, and storage can be as much as 65% per year, with the added benefits that it is much easier to enforce a retention policy on this data and roll it off when it expires, and that compliance reporting is easier with modern tools.
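As a rough illustration of what enforcing a retention policy and rolling data off can look like, the following Python sketch splits archived records into those still within their retention period and those that have expired. The `ArchivedRecord` type, its `retention_years` field, and the `roll_off` function are assumptions made for this example and do not describe any particular archive product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ArchivedRecord:
    """Hypothetical archived row with a per-record retention period."""
    record_id: str
    created: date
    retention_years: int


def expiry_date(record):
    """Date on which the retention period ends."""
    target_year = record.created.year + record.retention_years
    try:
        return record.created.replace(year=target_year)
    except ValueError:
        # Created on Feb 29 and the target year is not a leap year.
        return date(target_year, 3, 1)


def roll_off(records, today):
    """Split records into (kept, expired) as of the given date."""
    kept = [r for r in records if today < expiry_date(r)]
    expired = [r for r in records if today >= expiry_date(r)]
    return kept, expired


# Made-up sample data for illustration only.
sample = [
    ArchivedRecord("inv-001", date(2010, 1, 15), 7),
    ArchivedRecord("inv-002", date(2018, 6, 1), 7),
]
kept, expired = roll_off(sample, today=date(2020, 1, 1))
```

In this sketch expiry is the only criterion. In practice, records under legal hold would need to be excluded from roll-off even after their retention period ends.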
One huge challenge with these legacy applications is that there are typically a lot of them. When a discovery request arises, someone, or more likely multiple people, has to go to each of those applications one by one to search for and retrieve relevant data. Not only is that time-consuming and cumbersome, but it also assumes that there are people with the skill sets and application knowledge necessary to work with all of those different applications.
In any given company, that might not be a problem today, shortly after the applications have been decommissioned, because the people who used the applications when they were live are still around. But will that still be the case 5, 7, 10, or 20 years from now? Probably not. Yet the data still has to be preserved, because in many cases it is relevant and responsive to potential litigation.
Retiring all of these legacy applications into a "platform-neutral" format is a much more sustainable, not to mention cost-effective, approach. By integrating e-discovery (legal holds and collections) with your structured data archive, you can make it a lot easier to coordinate preservation and collection activities across the two systems. This reduces the chances of stranded holds, meaning data kept under preservation that could have been released, and it reduces the ambiguity about what needs to happen to the data to support the needs of legal and compliance teams.
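The stranded-hold problem can be sketched in a few lines of Python. This assumes, purely for illustration, that the legal hold system can supply the set of currently open matters and that each hold lists the archive record IDs it preserves. The names `LegalHold`, `stranded_holds`, and `releasable_records` are hypothetical and do not correspond to any specific e-discovery platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LegalHold:
    """Hypothetical hold linking one matter to the records it preserves."""
    hold_id: str
    matter_id: str
    record_ids: frozenset


def stranded_holds(holds, open_matters):
    """Holds whose matter is no longer open but which were never released."""
    return [h for h in holds if h.matter_id not in open_matters]


def releasable_records(holds, open_matters):
    """Records preserved only by stranded holds, so safe to release."""
    still_needed = set()
    stranded = set()
    for hold in holds:
        if hold.matter_id in open_matters:
            still_needed |= hold.record_ids
        else:
            stranded |= hold.record_ids
    # A record also covered by an active hold must stay preserved.
    return stranded - still_needed


# Made-up sample data: matter M1 has closed, matter M2 is still open.
holds = [
    LegalHold("H1", "M1", frozenset({"r1", "r2"})),
    LegalHold("H2", "M2", frozenset({"r2", "r3"})),
]
open_matters = {"M2"}
```

Here hold `H1` is stranded, but only record `r1` can actually be released, because `r2` is still covered by the active hold `H2`. Keeping holds and archive records in sync is what makes this kind of check possible.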
The first and perhaps most important step is recognizing that solutions designed for unstructured and semi-structured data are simply incapable of handling structured data. Without something purpose-built for structured data, your discovery preservation and collection process is going to ignore this entire category of data. The good news is that some of the solutions that are purpose-built for structured data have built-in integrations with the leading e-discovery platforms.