3 Key Considerations and 5 Best Practices for Data Collection from the Experts

Experts from the UK and EU shared vital insights into how the data collection landscape has shifted. Whether for criminal forensics, civil e-discovery, or privacy requests (DSARs), the approach must be strategic and technically sound.

Just before the holidays, Exterro hosted a webinar titled Critical Data Collection Methodologies and Use Cases, featuring data collection experts from the United Kingdom and EU: Sarah Hargreaves, VP of Global Training at Exterro; Carlo Vreugde, Data Protection Manager for the government of the Netherlands; and Dr. Cedric Krummes, Information Governance Officer at HS2 Limited.

Without further ado, let’s dig into three key considerations that inform the best practices mentioned later in this article.

Key Consideration #1: Understand the Purpose of Your Data Collection

The "Why" of your collection dictates the "How." Criminal forensic data collection can be very different from the more targeted collections done for e-discovery or privacy requests. This distinction informs how much data you collect, the technical challenges you face, and the technology you’ll use.

  • Criminal/Forensic: Requires deep-dive, bit-by-bit disk imaging so that even encrypted or deleted data is captured as evidence.
  • Government/Citizen Data: In a government setting, you might be collecting data on citizens. This involves complex privacy concerns because of interlinked data types—from administrative and medical to financial records.
  • Privacy/DSAR: A Data Subject Access Request (DSAR) has very defined boundaries. The goal isn't a deep dive, but operational efficiency to retrieve specific data within regulatory deadlines.

Key Consideration #2: Be Aware of the Limitations of Search

You can only search for what you know you need to find, so standard keyword searches miss anything you did not think to look for. To truly understand data, you need software capable of identifying:

  • Data Relationships: Linking emails with their specific attachments.
  • Content Clustering: Finding contextual data that is similar or linked to other pieces of information.
  • Custodian Mapping: Understanding the relationships between individuals who are custodians, owners, or subjects of data.
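To make the first of these concrete, here is a minimal sketch of linking an email to its attachments using only Python's standard library. It is an illustration, not how any particular collection tool works: the `link_attachments` helper and the sample message are hypothetical, and real collections would read messages from a mailbox export rather than build them in code.

```python
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser


def link_attachments(raw_message):
    """Return a message's identifying headers together with its attachment names."""
    msg = BytesParser(policy=policy.default).parsebytes(raw_message)
    return {
        "message_id": msg["Message-ID"],
        "subject": msg["Subject"],
        # iter_attachments() skips the body and yields only attached parts.
        "attachments": [part.get_filename() for part in msg.iter_attachments()],
    }


# Hypothetical sample message, built here so the sketch runs on its own.
sample = EmailMessage()
sample["Message-ID"] = "<example-1@example.test>"
sample["Subject"] = "Q3 report"
sample.set_content("Report attached.")
sample.add_attachment(
    b"col1,col2\n1,2\n",
    maintype="application",
    subtype="octet-stream",
    filename="q3.csv",
)

record = link_attachments(sample.as_bytes())
print(record["subject"], record["attachments"])
```

Keeping the parent message's ID next to each attachment name is what lets a reviewer trace a file back to the conversation it came from.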

Key Consideration #3: Understand the Landscape of Data Sources

The shift to remote work has dramatically increased the risk of data leakage and "rogue devices." Personal phones, tablets, or laptops may contain sensitive business information that data collectors aren't fully aware of.

  • Security: Mandate multi-factor authentication (MFA) to reduce the risk of sensitive data loss.
  • Policy: Update policies to explicitly define appropriate use of personal devices for work and when it is acceptable to use work devices for personal tasks.

Five Best Practices for Data Collection

With those key considerations in mind, here are five best practices you should implement in your data collections:

Best Practice #1: Know Your Data

Understand your data environment. Maintain an active data map so you know what types of data you possess and where they are located. Ensure you have the right tools to collect from cloud, on-premise, and mobile sources.
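As a rough illustration of what a data map can look like in practice, the sketch below models it as a list of entries you can query by data type. The `DataMapEntry` fields, the system names, and the `systems_holding` helper are all assumptions made up for this example; a real data map would usually live in a dedicated inventory tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataMapEntry:
    system: str            # hypothetical system name
    location: str          # "cloud", "on-premise", or "mobile"
    data_types: frozenset  # kinds of data held in that system
    owner: str             # team responsible for the system


# Hypothetical entries, for illustration only.
data_map = [
    DataMapEntry("hr-portal", "cloud", frozenset({"personal", "financial"}), "HR"),
    DataMapEntry("file-server", "on-premise", frozenset({"contracts"}), "Legal"),
    DataMapEntry("field-phones", "mobile", frozenset({"personal", "messages"}), "IT"),
]


def systems_holding(entries, data_type):
    """List the systems that hold a given type of data, in alphabetical order."""
    return sorted(e.system for e in entries if data_type in e.data_types)


print(systems_holding(data_map, "personal"))
```

A query like this is the starting point for scoping a collection: it tells you which systems to approach before any data is touched.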

Best Practice #2: Document Your Processes and Procedures

Ensure everyone in your organization understands their responsibilities regarding data security. Documented procedures create a "playbook" that provides a defensible audit trail if a breach or incident occurs.

Best Practice #3: Have a SIEM System or SOC

Utilize a Security Information and Event Management (SIEM) system or a Security Operations Center (SOC) to analyze security event logs routinely. This allows you to identify anomalies and mistakes before they become full-scale breaches.
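The kind of routine log check described here can be sketched in a few lines. This is a deliberately simplified example, not a representation of any real SIEM product: the event format, the `flag_repeated_failures` helper, and the threshold of three failures are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical, simplified events; a real SIEM ingests far richer records.
events = [
    {"user": "alice", "action": "login", "ok": True},
    {"user": "bob", "action": "login", "ok": False},
    {"user": "bob", "action": "login", "ok": False},
    {"user": "carol", "action": "login", "ok": False},
    {"user": "bob", "action": "login", "ok": False},
    {"user": "alice", "action": "download", "ok": True},
]


def flag_repeated_failures(log, threshold=3):
    """Return users whose failed login count reaches the threshold."""
    failures = Counter(
        e["user"] for e in log if e["action"] == "login" and not e["ok"]
    )
    return sorted(user for user, count in failures.items() if count >= threshold)


print(flag_repeated_failures(events))
```

A single failed login is usually a mistake; a cluster of them on one account is the sort of anomaly worth a closer look.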

Best Practice #4: Use a Phased Collection Approach

Don't collect everything at once. Start with the most relevant sources and custodians to reduce "data noise" and the associated costs of processing and review.
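One simple way to plan phases is to rank sources by relevance and collect them in batches, most relevant first. The sketch below assumes a hypothetical 1-to-5 relevance score assigned by the case team; the `Source` fields, the custodian names, and the `plan_phases` helper are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    custodian: str
    location: str
    relevance: int  # hypothetical score: 5 = most relevant, 1 = least


# Hypothetical sources, for illustration only.
sources = [
    Source("A. Jones", "mailbox", 3),
    Source("B. Smith", "laptop", 5),
    Source("C. Patel", "shared drive", 1),
    Source("B. Smith", "mailbox", 4),
    Source("D. Lee", "phone", 2),
]


def plan_phases(candidates, phase_size=2):
    """Split sources into collection phases, most relevant first."""
    ranked = sorted(candidates, key=lambda s: s.relevance, reverse=True)
    return [ranked[i:i + phase_size] for i in range(0, len(ranked), phase_size)]


for number, phase in enumerate(plan_phases(sources), start=1):
    print(number, [(s.custodian, s.location) for s in phase])
```

If the first phase answers the question, the later phases never need to be collected, which is where the savings in processing and review come from.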

Best Practice #5: Validate Your Results

Always verify the integrity of the collected data against the original source. This ensures that no data or metadata was lost or corrupted during the transfer process, maintaining its defensibility in court.
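A common way to check integrity is to compare cryptographic hashes of the source and the collected copy. The sketch below uses SHA-256 from Python's standard library; the `sha256_of` and `verify_copy` helpers and the temporary files are assumptions for illustration. Note that a content hash only covers the file's bytes, so filesystem metadata such as timestamps would need a separate check.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files never have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source, collected):
    """Return True when both files have identical content hashes."""
    return sha256_of(source) == sha256_of(collected)


# Demonstration with throwaway files: a faithful copy, then an altered one.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source.bin"
    collected = Path(tmp) / "collected.bin"
    source.write_bytes(b"example evidence bytes")
    shutil.copy2(source, collected)
    intact = verify_copy(source, collected)
    collected.write_bytes(b"altered in transit")
    altered = verify_copy(source, collected)

print(intact, altered)
```

Recording both hash values alongside the collection log gives you something concrete to point to if the integrity of the data is later questioned.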

Conclusion

Data collection is no longer just a technical IT task; it is a pillar of Legal GRC. By understanding the specific purpose of an investigation and documenting your steps, you can ensure your processes are both efficient and legally defensible.

If you want to learn more about this topic, attend the on-demand webinar: Critical Data Collection Methodologies and Use Cases.

As we move further into 2026, are you finding that the rise of "ephemeral" messaging (like Signal or disappearing Teams chats) is making your Best Practice #1—knowing your data—significantly harder to achieve?