
In e-discovery, data collection is the process of extracting information from its original source and copying it into a separate repository for use in later stages such as processing, review, and analysis.
Collection differs from preservation. Preservation ensures that data remains in its original location and is not altered or deleted. Collection, by contrast, actively copies electronically stored information (ESI) into a new environment where it can be worked on.
Legal teams must collect a wide range of data types, including:
Email is typically the most commonly collected data type, particularly messages created within the past few years, a window that usually aligns with the timeframe of the litigation.
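To illustrate how a collection might be scoped to the litigation timeframe, here is a minimal Python sketch that sorts raw email messages by their `Date` header. The function name `classify_by_date`, the three labels, and the choice to route undated messages to manual review are all assumptions made for illustration, not a standard e-discovery workflow.

```python
from datetime import datetime, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime


def classify_by_date(raw_message: str, start: datetime, end: datetime) -> str:
    """Label a raw RFC 5322 message relative to a date window.

    Hypothetical helper: returns "in_range", "out_of_range", or
    "needs_review" (when the Date header is missing or unparseable).
    """
    msg = message_from_string(raw_message)
    date_header = msg["Date"]
    if date_header is None:
        return "needs_review"
    try:
        sent = parsedate_to_datetime(date_header)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return "needs_review"
    if sent.tzinfo is None:
        # Assumption: treat timestamps without an offset as UTC.
        sent = sent.replace(tzinfo=timezone.utc)
    return "in_range" if start <= sent <= end else "out_of_range"


# Example window; the dates are placeholders, not taken from any real matter.
window_start = datetime(2021, 1, 1, tzinfo=timezone.utc)
window_end = datetime(2023, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
sample = "Date: Tue, 15 Mar 2022 10:30:00 +0000\nSubject: Q1 plan\n\nSee attached."
print(classify_by_date(sample, window_start, window_end))
```

Messages without a usable date are flagged rather than silently dropped, since excluding them automatically could leave relevant material uncollected.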
Metadata is “data about data.” It includes background details such as:
Metadata provides critical context that helps explain the meaning and relevance of a document.
During collection, preserving metadata is essential. If metadata is altered, corrupted, or lost, it can obscure important facts and make it more difficult to understand the full story behind the data.
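As a rough sketch of what metadata-aware collection can look like for a single file, the Python example below copies a file into a repository folder, tries to keep its file system timestamps, and checks that the copy's content matches the source. The names `collect_file` and `sha256_of` and the fields of the returned record are hypothetical; a real collection would normally rely on purpose-built forensic tooling and capture far more metadata than this.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collect_file(source: Path, repository: Path) -> dict:
    """Copy one file into the repository and record what was preserved.

    Hypothetical helper for illustration only.
    """
    repository.mkdir(parents=True, exist_ok=True)
    target = repository / source.name

    # Capture metadata before touching the file; reading it for hashing
    # may update its last-access time on some systems.
    before = source.stat()
    source_hash = sha256_of(source)

    # copy2 copies the content and attempts to preserve timestamps and
    # permission bits; it does not guarantee every attribute on every platform.
    shutil.copy2(source, target)
    after = target.stat()

    return {
        "source": str(source),
        "target": str(target),
        "size_bytes": before.st_size,
        "modified_ns": before.st_mtime_ns,
        "sha256": source_hash,
        "content_verified": sha256_of(target) == source_hash,
        "mtime_preserved": after.st_mtime_ns == before.st_mtime_ns,
    }
```

Recording the hash and timestamps at collection time gives later stages a reference point for confirming that a document has not changed since it was copied.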
There are several methods for collecting data:
Each method has its own advantages and tradeoffs, depending on the situation and available resources.
A strategic, phased approach helps ensure efficiency while minimizing unnecessary data handling.