The Basics of E-Discovery: Collection

Exterro Senior Solutions Consultant Nancy Patton explains data collection during the e-discovery process.

What Is Data Collection? How Does It Differ from Preservation?

In e-discovery, data collection is the process of extracting information from its original source and copying it into a separate repository for use in later stages such as processing, review, and analysis.

This differs from preservation. Preservation ensures that data remains in its original location and is not altered or deleted. Collection, on the other hand, involves actively copying electronically stored information (ESI) into a new environment where it can be worked on.

What Types of Data Do Legal Teams Need to Collect?

Legal teams must collect a wide range of data types, including:

  • Emails
  • Word documents
  • Files stored on backup tapes
  • Archived data

Email is usually the most commonly collected data type, particularly messages created within the past few years, a period that typically aligns with the timeframe of the litigation.

What Is Metadata? How Does It Relate to Collection?

Metadata is “data about data.” It includes background details such as:

  • Author
  • Creation date
  • Sender and recipient
  • File path or email domains

Metadata provides critical context that helps explain the meaning and relevance of a document.
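
To make the idea concrete, here is a minimal sketch of pulling the metadata fields listed above out of a single email using Python's standard library. The sample message, the addresses, and the function name `email_metadata` are made up for illustration; real collections would normally rely on dedicated e-discovery tooling rather than a script like this.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Hypothetical sample message, used only to illustrate the fields.
RAW = """From: Alice Example <alice@example.com>
To: Bob Example <bob@example.org>
Date: Mon, 06 Mar 2023 09:15:00 +0000
Subject: Quarterly figures

Body text.
"""


def email_metadata(raw: str) -> dict:
    """Extract sender, recipients, sent date, subject, and domains from raw email text."""
    msg = message_from_string(raw)
    # getaddresses returns (display name, address) pairs.
    sender = getaddresses(msg.get_all("From", []))
    recipients = getaddresses(msg.get_all("To", []))
    addresses = [addr for _, addr in sender + recipients if "@" in addr]
    return {
        "sender": sender[0][1] if sender else None,
        "recipients": [addr for _, addr in recipients],
        "sent": parsedate_to_datetime(msg["Date"]) if msg["Date"] else None,
        "subject": msg["Subject"],
        # Unique email domains seen in the sender and recipient fields.
        "domains": sorted({addr.split("@")[1] for addr in addresses}),
    }


metadata = email_metadata(RAW)
```

Here `metadata["sender"]` holds the sender's address and `metadata["domains"]` lists each domain once, which is the kind of context a reviewer uses to see who was communicating with whom.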

During collection, preserving metadata is essential. If metadata is altered, corrupted, or lost, it can obscure important facts and make it more difficult to understand the full story behind the data.
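
As a rough illustration, the sketch below copies one file into a separate repository, verifies the copy against a SHA-256 hash of the source, and records basic file-system metadata. The function names `sha256_of` and `collect_file` and the layout of the returned record are assumptions made for this example, not a description of how any particular e-discovery product works.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collect_file(source: Path, repository: Path) -> dict:
    """Copy one file into the repository and return a record of what was collected."""
    repository.mkdir(parents=True, exist_ok=True)
    target = repository / source.name
    source_hash = sha256_of(source)
    # shutil.copy2 copies the content and attempts to preserve timestamps.
    # It cannot carry over every kind of metadata on every platform
    # (for example owners or access control lists).
    shutil.copy2(source, target)
    if sha256_of(target) != source_hash:
        raise ValueError(f"hash mismatch for {source}")
    stat = source.stat()
    return {
        "source_path": str(source),
        "collected_path": str(target),
        "sha256": source_hash,
        "size_bytes": stat.st_size,
        "modified_time": stat.st_mtime,
    }
```

Recording the hash and the original path alongside the copy gives a simple way to show later that the collected file matches the source. Note that merely reading a file can update its last-accessed time on some systems, which is one reason purpose-built collection tools are generally preferred.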

How Do You Collect Data for E-Discovery?

There are several methods for collecting data:

  • Using internal or external teams to physically gather data from systems
  • Allowing custodians (employees) to self-collect data
  • Using technology to remotely collect data

Each method has its own advantages and tradeoffs, depending on the situation and available resources.

What Are Data Collection Best Practices?

  • Avoid over-collection: Only collect what is necessary to reduce costs and complexity
  • Use a tiered approach: Start with the most relevant data sources and expand collection only if needed

A strategic, phased approach helps ensure efficiency while minimizing unnecessary data handling.
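
The tiered approach can be sketched as a simple selection step. In this hypothetical example, each data source is assigned a tier (1 being most relevant), and collection starts at tier 1 and widens only when needed. The class `DataSource`, the function `sources_to_collect`, and the source names and tier assignments are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    name: str
    tier: int  # 1 = most relevant; higher tiers are collected later, if at all


def sources_to_collect(sources: list[DataSource], max_tier: int) -> list[DataSource]:
    """Return the sources at or below max_tier, most relevant first."""
    selected = [s for s in sources if s.tier <= max_tier]
    return sorted(selected, key=lambda s: (s.tier, s.name))


# Hypothetical sources and tiers for a single matter.
SOURCES = [
    DataSource("backup tapes", 3),
    DataSource("custodian mailboxes", 1),
    DataSource("archived data", 2),
]

first_pass = sources_to_collect(SOURCES, max_tier=1)
expanded = sources_to_collect(SOURCES, max_tier=2)
```

The first pass covers only the custodian mailboxes. Raising `max_tier` to 2 adds the archived data, and the backup tapes stay untouched unless the matter requires them.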
