The Basics of E-Discovery, Chapter 2: Information Governance


Few would argue the premise that successful companies depend on the creation and consumption of electronic data. But all that content and information introduces significant cost and risk into an organization. Because of the explosive growth in the amount of data that companies today create, collect, and store, they are spending millions of dollars in unnecessary storage and management costs, while also exposing themselves to the increasing risk of security, privacy, compliance and legal violations. It's against this backdrop that the concept of information governance (IG) has risen to prominence in recent years.


What is information governance?

You'll come across various definitions of information governance (IG), but a useful one comes from the analyst firm Gartner:

“The specification of decision rights and an accountability framework to ensure appropriate behavior in the valuation, creation, storage, use, archiving and deletion of information. It includes the processes, roles and policies, standards and metrics that ensure the effective and efficient use of information in enabling an organization to achieve its goals.”

We know there is a lot to digest in that definition. We'll try to break it into more edible portions. Here are some important points to know about IG:


Comprehensive Strategy

IG builds on the principles of traditional records management and enterprise content management. Those disciplines focus on the information lifecycle: how data is created, stored, organized, and ultimately deleted. IG encompasses a comprehensive strategy that helps organizations get the most value out of their digital information assets while minimizing potential risks. In that sense, IG is designed to guide records management and enterprise content management, not replace them, while also providing a framework for making smarter decisions with regard to what information is valuable and where it is best stored within the IT infrastructure.


Essential to Business Process

IG impacts many areas of the organization. Virtually every business process creates and relies on information to function effectively, so a company's IG strategy affects the entire organization, not just the IT and records departments.



Creating an IG strategy is a cross-disciplinary activity that relies on a multitude of stakeholders across the organization including IT, compliance, records management, legal, security/privacy, and business unit leaders. The Information Governance Reference Model (IGRM) shown below provides a helpful graphical representation of the cross-functional nature of IG.



When properly implemented, an IG strategy will allow you to "separate the wheat from the chaff." That is, you will be able to focus on gaining insight and value from the content that actually has business value (estimated to be only about 30% of content in your network) while ignoring or deleting the redundant, obsolete, or trivial (ROT) data that is clogging your servers.

Information Governance Reference Model (IGRM)

Why Information Governance?

If you haven't noticed, we create a lot of data. The International Data Corporation (IDC) estimates that the digital universe doubles every couple of years. Make no mistake, this explosion of data creates opportunities for companies. All this digital information, when properly harnessed, drives faster, smarter decision making, delivers better insights into consumer behavior, and promotes greater efficiency. But from a legal and risk standpoint, Big Data can also be difficult to manage. Studies vary by industry, but it's safe to say most companies never analyze the majority of the data they save. A recent Deloitte article estimates that 90% of data is unstructured, untagged, and untapped "dark data."

In the days before Big Data, decisions on how and where data should be kept were mainly made by the individual employees who created and used the data, and this is still a prevalent model. However, many companies have come to realize they need much more structure and centralized oversight into how data is stored, managed, and retained to protect against the inadvertent disclosure or hacking of sensitive information, such as personally identifiable information (PII), personal health information (PHI), payment card information (PCI), or other confidential data. Records management and content management programs are designed to perform this important function, but they tend to be applied reactively and have struggled to keep pace with the data deluge. And of course, there is always the legal risk that information that could have been defensibly deleted will hurt your case in future litigation if it is found to be responsive.

IG has emerged to fill this void by looking at information management much more comprehensively and proactively. If you revisit Gartner's definition above, you will see that IG should connect the processes, roles, policies, standards, and metrics that traditionally have been managed as separate programs.

Though the price of data storage has fallen precipitously in recent years, that decline is far outpaced by the rate at which content is being created. Thus, data storage remains a significant cost drain for companies. That being said, a lot of people are hesitant to delete data, so IG can't be based solely on cost reduction. The more compelling reasons for adopting an IG program are risk reduction and the ability to gain greater insights from data that has business value. Significant data breaches can almost always be traced back to poor governance, and they create terrible publicity, damage brands, and diminish customer confidence. Examples from recent years abound, including Target, Equifax, Yahoo, Uber, and Sony.

Information governance unifies processes.

How does IG relate to E-Discovery?

You might be wondering why IG appears in a guide about e-discovery. In fact, e-discovery and IG are closely interrelated; after all, IG is listed as the first stage of the Electronic Discovery Reference Model (EDRM).

Understanding the legal and regulatory obligations with which the enterprise must comply is essential to effective IG. To support e-discovery requests, IT managers and legal counsel must work together to deliver reliable and timely access to potentially relevant electronically stored information (ESI). It is often difficult to meet court-mandated deadlines, because data is stored on numerous sources (e.g., file servers, content management systems, desktops, laptops, mobile devices, archives, and other storage assets), each with specific access requirements. Identifying which IT assets are most relevant to a case is the first step; systematically suspending disposition policies, preserving the ESI, and proactively de-duplicating, indexing, and searching the data is where the true importance of IG comes into focus. See our data mapping description below.
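To make the de-duplication step a little more concrete, here is a minimal sketch of one common approach: treating two files as duplicates when their contents produce the same cryptographic hash. The function names and sample file paths are hypothetical, and real e-discovery tools typically layer on more (for example, near-duplicate detection and email threading), so read this as an illustration rather than a description of any particular product.

```python
import hashlib
from collections import defaultdict


def content_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a document's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def group_duplicates(documents: dict[str, bytes]) -> dict[str, list[str]]:
    """Map each digest to the sorted names of documents sharing that content."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, data in documents.items():
        groups[content_digest(data)].append(name)
    return {digest: sorted(names) for digest, names in groups.items()}


def unique_documents(documents: dict[str, bytes]) -> list[str]:
    """Keep one representative per distinct content (first name alphabetically)."""
    return sorted(names[0] for names in group_duplicates(documents).values())


# Hypothetical sample: the same contract saved in two places, plus a memo.
sample = {
    "fileserver/contract_v1.docx": b"Master services agreement",
    "laptop/contract_copy.docx": b"Master services agreement",
    "archive/memo.txt": b"Quarterly memo",
}
print(unique_documents(sample))
```

Exact hashing only catches byte-for-byte copies. A document that was re-saved or lightly edited hashes differently, which is one reason organizing data before litigation begins pays off.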

Other elements of IG that directly impact e-discovery activities include:


Retention Schedules

How long is information retained?


Security Controls

How is information protected from both internal and external threats?


Privacy and Regulatory Considerations

How is sensitive or legally protected information handled?


Access Rights

Who has access to what information and when?

An effective IG strategy supports greater e-discovery efficiency by reducing the amount of discoverable ESI that is stored across the organization and improving how information is organized, so legal teams can get to what they need much more quickly.
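As a rough illustration of how a retention schedule and a legal hold interact, the sketch below checks whether a record may be disposed of. The categories, retention periods, and field names are hypothetical placeholders, not legal guidance; actual retention periods come from your legal and regulatory requirements.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods (in days) per record category.
RETENTION_DAYS = {"invoice": 7 * 365, "email": 3 * 365, "draft": 365}


@dataclass
class Record:
    record_id: str
    category: str
    created: date
    on_legal_hold: bool = False


def eligible_for_disposition(record: Record, today: date) -> bool:
    """Return True only if the retention period has expired and no hold applies."""
    if record.on_legal_hold:
        # A legal hold suspends the normal disposition policy.
        return False
    period = RETENTION_DAYS.get(record.category)
    if period is None:
        # Unknown category: keep the record until someone classifies it.
        return False
    return today >= record.created + timedelta(days=period)


draft = Record("r1", "draft", date(2020, 1, 1))
print(eligible_for_disposition(draft, date(2022, 1, 1)))
```

The design choice worth noting is the order of the checks: the hold is evaluated first, so an expired retention period can never override a preservation obligation.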

IG Challenges

Unlike the other sections of this guide that look at individual e-discovery processes, IG is a framework that isn't defined by specific stages or workflows. It serves to support and guide other processes and programs across the organization by establishing a common set of rules, policies, and procedures. This all raises the question: what does IG ultimately look like? There isn't an easy answer to that question. Many companies have started the IG conversation but have yet to move from theory to practice. And that jump isn't an easy one. Creating an IG program comes with significant challenges. Here are three common hurdles.

Three Common IG Hurdles


Project Ownership

IG doesn't fit neatly into an existing corporate silo. As such, it's not always clear who is supposed to champion the cause. Like cars simultaneously stopped at a four-way intersection, the end result is a lot of yielding among key stakeholders with no one really stepping up to get the wheels in motion. Some companies have even created the position of Chief Information Governance Officer (CIGO) as a way of dealing with the ambiguity surrounding IG ownership, a trend that figures to gain even more steam in the years ahead. While it may not be the case that every organization needs a CIGO, it is becoming clear that IG absolutely needs an owner.


Conflicting Priorities

When you get the key stakeholders in a room to discuss IG, you'll notice that there are a lot of competing interests. Business leaders' top priority is access and efficiency (there is a reason why products like SharePoint and Dropbox are so popular). A company's IT team has to think about data security and protecting certain information, and this usually translates to a more restrictive approach to how data is created and stored throughout the enterprise. Meanwhile, corporate attorneys are consumed by risk mitigation. In practice, that means knowing where data resides and exerting some level of control over how certain media are used (e.g., no contract negotiations over instant messenger). If you transposed these priorities onto a Venn diagram, you would see very little in the way of overlap, which underscores the challenge organizations face in developing IG programs while highlighting why they are so badly needed.


Executive Backing

Like any large, complex endeavor, IG requires investment, and investment requires executive sponsorship. IG champions have to construct a business case for why IG is necessary. That means quantifying how unmanaged data truly impacts the company bottom line in order to present a compelling return on investment (ROI) case for IG.

It's easy to think about IG in terms of technology, but before tools are applied companies must develop an overall IG strategy. Once an organization defines its IG strategy, it needs to create a program that defines key stakeholders, policies, procedures, and measures. Of course, technology will be essential in supporting all of these elements (see the tools section below).

Here is a checklist of some helpful IG best practices to get your program off the ground:

IG Best Practices

Create a cross-functional IG team

No IG plan will be successful if it doesn't reflect the needs and goals of all key stakeholders, including legal, compliance, risk management, HR, IT, data privacy, information security and the business units. Each of these entities should be represented in the initial IG planning phase and have a say in defining success criteria, key measurables, and potential risks.

Conduct a comprehensive data audit

You cannot develop a strong information governance framework without first knowing what you have. Each business unit will likely be familiar with the main data sources that it interacts with, but for IG everything must be accounted for, including backup tapes, legacy/retired systems, and data archives that are likely not being actively managed at all.

Carefully assess and define legal and regulatory requirements

When getting an IG program off the ground, it's critically important to understand any and all external retention requirements. Figure out what data has to be kept and for how long, and revisit those requirements often to ensure that the IG plan stays up-to-date.

Prioritize IG activities

IG plans aren't created overnight, so it's critical to address the most pressing issues first. The data assessment described above should drive the prioritization of initial IG activities. For example, if the audit reveals a preponderance of decentralized data stored randomly across a bevy of shared drives, a good first step might be to create a data map that connects employees to the data sources with which they most frequently interact. Likewise, if the initial assessment reveals a large accumulation of unnecessary backup tapes, the IG plan should place greater emphasis on developing and executing a better defensible deletion policy, ridding the company of data it no longer needs. Every organization has its own unique set of challenges. Prioritize IG activities based on the specific data environment.

Train employees

While select members of the organization will define the IG program, its ultimate success rests on the larger workforce actually executing on the plan. Training is essential. Employees have to know how the individual policies and procedures impact their day-to-day activities. It's also important to communicate why the program matters and what it's designed to accomplish so that employees better appreciate the need to make changes.

Develop enforcement criteria and follow through

Training is crucial, but achieving 100 percent employee compliance without a strong enforcement structure is unrealistic. The best way to enforce the IG program is by conducting random, periodic audits of employee compliance and working with business unit leaders to establish firm corrective measures for non-compliance.

Measure results

How will you demonstrate the effectiveness of the IG program? What are the key metrics to track? These considerations should be part of the IG planning phase. Project champions should develop targets and desired outcomes, and define how those will be measured early in the process.

Information Governance Tools

There are a variety of overlapping technologies that fall under the IG umbrella, far too many to cover in this guide. When it comes to IG technology, the discussion shouldn't just focus on what new tools are needed but also consider better utilization of existing technology investments and the interplay between systems. For example, many organizations have deployed journaling or archiving systems for managing corporate email, but they have not taken the next step to index this content for rapid retrieval. These systems can be huge productivity aids when automating e-discovery activities, such as preservation and data collection. Taking the time to think through how archiving and indexing of ESI can aid in the identification of content owners, editors, and readers; creation and use dates; and search criteria relevance is a major component of a comprehensive IG program.

Here are five other IG tools that specifically support e-discovery activities:


Data Mapping

Data mapping software is designed to help you create, update, and organize a complete directory of your data environment. Data maps are notoriously difficult to build. Specialized tools, like Exterro's E-Discovery Data Mapping application, are designed to support collaborations between key stakeholders, track project details, and facilitate data map updating. Data maps are key cogs in any IG program, because they help companies see the big picture of where data exists across the enterprise, who has access to it, and how it's being managed (e.g., retention policies).
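Conceptually, a data map is a directory that links data sources to the people who use them and the policies that govern them. The sketch below shows one minimal way to model that idea; the class names, fields, and sample entries are hypothetical and are not taken from any particular data mapping product.

```python
from dataclasses import dataclass, field


@dataclass
class DataSource:
    name: str
    system_type: str       # e.g. "file share", "email archive"
    owner: str             # responsible department or person
    retention_policy: str  # human-readable policy label
    custodians: set[str] = field(default_factory=set)


class DataMap:
    """A tiny in-memory directory of data sources (illustrative only)."""

    def __init__(self) -> None:
        self._sources: dict[str, DataSource] = {}

    def add(self, source: DataSource) -> None:
        self._sources[source.name] = source

    def sources_for(self, custodian: str) -> list[str]:
        """List the data sources a given custodian interacts with."""
        return sorted(
            s.name for s in self._sources.values() if custodian in s.custodians
        )

    def custodians_for(self, source_name: str) -> list[str]:
        """List the custodians attached to a given data source."""
        source = self._sources.get(source_name)
        return sorted(source.custodians) if source else []


# Hypothetical entries.
data_map = DataMap()
data_map.add(DataSource("hr-share", "file share", "HR", "7 years", {"alice", "bob"}))
data_map.add(DataSource("mail-archive", "email archive", "IT", "3 years", {"alice"}))
print(data_map.sources_for("alice"))
```

When a legal matter names a custodian, a lookup like `sources_for` is the kind of question a data map is meant to answer quickly.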


File Analysis

File analysis (FA) is a tool that you point at a data repository (e.g., network file share or SharePoint server) and get back an inventory of useful information, such as file metadata, how old the files are, who has access privileges, which department owns them, etc. It's one of those enabling technologies that helps other functions happen more efficiently, such as e-discovery identification, server consolidation, data migration, data map creation, and defensible disposition, just to name a few.
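For a sense of what such an inventory might contain, here is a minimal sketch that walks a directory and records basic metadata for each file. It covers only path, extension, size, and age; access privileges and departmental ownership, which commercial FA tools also report, are outside its scope. The function names and the 365-day threshold in the usage note are assumptions made for illustration.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def inventory(root):
    """Return one metadata row per regular file under root."""
    rows = []
    now = datetime.now(timezone.utc)
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        info = path.stat()
        modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
        rows.append({
            "path": str(path.relative_to(root)),
            "extension": path.suffix.lower(),
            "size_bytes": info.st_size,
            # Whole days since last modification, never negative.
            "age_days": max(0, (now - modified).days),
        })
    return rows


def stale_files(rows, min_age_days):
    """Return paths of files not modified for at least min_age_days."""
    return [r["path"] for r in rows if r["age_days"] >= min_age_days]


# Small demonstration against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "notes.TXT").write_text("hello")
    for row in inventory(tmp):
        print(row)
```

A call such as `stale_files(rows, 365)` would then list candidates for review as possibly obsolete data, which still need a human or policy decision before anything is deleted.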


Pre-Collection (In-Place) Analytics

The first popular analytics tool was basic keyword search, which remains a very popular method for identifying relevant documents. Recent years have seen the market embrace more advanced analytical methods, such as semantic search, Boolean search, concept search, and predictive coding. While in-place analytics rely on many of these search and analysis methodologies, they are designed to analyze data before collection happens and e-discovery begins. Besides supporting e-discovery efforts, pre-collection analytics allow organizations to conduct data source content audits and find business-critical information more quickly.


Automatic Classification

These systems use algorithms and/or pre-defined rules to assign electronic documents to a classification category (or a file, folder, or tag) on the basis of their content, metadata, or context. Auto classification technologies can take the burden off end users by eliminating the need for them to manually identify electronic documents. These classifications can then feed into automatic retrieval, archival, and disposal capabilities based on an organization's IG policies.
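The rule-based variant of this idea can be sketched in a few lines. The example below assigns a category using the first matching pattern; the categories and regular expressions are hypothetical and deliberately simplistic, and production systems generally combine rules like these with metadata and machine-learning models.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    category: str
    pattern: re.Pattern


# Hypothetical rules, checked in order; the first match wins.
RULES = [
    # Nine digits in a 3-2-4 grouping, resembling a US Social Security number.
    Rule("pii", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    Rule("contract", re.compile(r"\b(agreement|contract)\b", re.IGNORECASE)),
    Rule("invoice", re.compile(r"\binvoice\b", re.IGNORECASE)),
]


def classify(text, rules=RULES, default="unclassified"):
    """Return the category of the first rule whose pattern appears in text."""
    for rule in rules:
        if rule.pattern.search(text):
            return rule.category
    return default


print(classify("Invoice #42 attached"))
```

Ordering the rules by sensitivity, so that a possible PII match takes precedence over a generic "contract" label, is one simple way to err on the side of caution.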


In-Place Preservation

It's hard to have meaningful disposition policies when data is constantly being collected, copied, and kept for long periods of time. Many companies opt for a preserve-in-place strategy, which counts on custodians and IT ensuring that potentially responsive ESI is retained during the course of a legal matter once they receive a legal hold. However, people are fallible, and some custodians may not be knowledgeable about the company's legal requirements or how data is stored on certain systems. In-place preservation technologies integrate with data sources to secure – or "lock down" – data from intentional or accidental deletion, without actually removing files from their native environments.

What's Next

We can't emphasize enough that it's a mistake to think of IG as part of the e-discovery process. Rather, it is the foundation. As we continue through this guide, you'll see elements of IG woven throughout each discussion of the various e-discovery stages. The e-discovery process drives important IG decisions and helps to shape the IG program. The important takeaway here is that IG is a necessity for large companies. Organizations can make all the e-discovery process changes and investments they want, but without a solid IG program in place, it's virtually impossible to lower both costs and risks.