
The headlines are hard to ignore.
High-profile cases—from the scrutiny of celebrity text messages and photos in the Depp-Heard litigation to the release of Epstein-related documents—have turned private communications into public evidence. What was once buried in legal proceedings is now playing out in real time, shaping reputations, careers, and corporate risk.
But these aren’t edge cases. They’re signals.
In this episode of Data Xposure, we explore how eDiscovery has moved into the mainstream—and why it now impacts far more than just legal teams. Because the same types of messages, files, and digital conversations making headlines are being created inside your organization every day.
Doug Austin, a leading voice in eDiscovery with over 30 years of experience and the Editor of eDiscovery Today, joins us to unpack what’s changed—and why so many organizations are still unprepared. From the explosion of collaboration tools to the growing expectations of regulators and courts, he explains how everyday data has become a business-wide liability if it’s not properly understood and managed.
For legal, compliance, and security leaders, this is the shift: eDiscovery is no longer a moment you prepare for. It’s a continuous reflection of how your organization operates.
Because as recent headlines make clear—the risk isn’t just in what’s exposed. It’s in what’s been there all along.
Mike Hamilton (00:01.232)
Doug, thanks for joining Data Xposure. Before we dive into everything happening today, take us back a second. How did you get into eDiscovery in the first place?
Doug Austin (00:14.05)
Well, before there was an eDiscovery, I got into what was known as litigation support, where we dealt with paper documents. And I got into it so long ago that the company I joined was Price Waterhouse, which was a Big Eight consulting firm then. Now they're Big Four and they're PwC. I worked on a large litigation as one of my first couple of projects, where we built a couple of different databases, including an attorney work product database.
We also built this cool app, because a lot of the evidentiary documents were on microfilm, microfilm cartridges. So we actually created this program where you could do a search in the database, identify documents you wanted to print out, and send it to the app. It would tell you what cartridges to put in,
and you'd get it started, you know, get it spooled up, and then hit go. And it would spool to the pages and print out the pages you needed. And to me, the ability to use technology to do things like that and really, you know, drastically shorten the time to get to the evidentiary documents you needed was fascinating. And so from that point forward, you know, being in the legal industry and litigation support, and eventually
eDiscovery in the early 2000s, when electronic evidence became so predominant, has just become something that has always fascinated me. And I've really enjoyed every minute.
Mike Hamilton (01:51.337)
That's great to hear. And you are one of the, I don't want to say founding fathers, but you are one of those very well-known people in the space. And you've carried the torch forward for the community, to really give it the visibility it probably hadn't been getting 10 years ago, right? So what kept you in it for those 30 years? I know you touched on that a little bit, but I mean,
was this something as a kid you were thinking, I really am interested in legal proceedings, courtroom dramas? Like, what made you interested in this in the first place?
Doug Austin (02:29.966)
Like a lot of eDiscovery professionals, you kind of fall into it. And that was the case with me with that project at Price Waterhouse. When I went to college, I wasn't sure what my major was going to be until I took my first business computer class. And that's when I realized
Mike Hamilton (02:34.076)
Yeah.
Doug Austin (02:51.584)
I want to use computers and technology to help solve business problems. So I got a business degree with a concentration in management information systems. And so that was what I knew I wanted to do. And then once I had my first legal project, I was like, I mean, these are as serious and significant business problems as there are,
you know, when you're dealing with litigation, investigations, what have you. So this is the type of stuff that I went to school for. And so that really became what I wanted to do from that point forward.
Mike Hamilton (03:28.83)
That makes sense. That makes sense. Well, let's dive into the podcast topic. You know, the headline cases that we're trying to draw people in on are the Depp-Heard case and the Epstein files. So let me give a quick overview of these two cases. The Depp-Heard defamation trial was a highly publicized legal battle between two famous actors, centered around allegations of defamation tied to claims of abuse.
The case drew massive public attention, with proceedings broadcast widely and debated very aggressively on social media and other platforms. And the Jeffrey Epstein case involves a sprawling investigation into a financier accused of operating a large-scale sex trafficking network, with ongoing legal actions and document releases continuing to reveal the scope of individuals and institutions connected to the case.
Both became global news stories, not just for their legal outcomes, but for the broader questions they raised about accountability, transparency, and how information surfaces in high-stakes situations. So Doug, I don't know if you followed these cases as they were coming out, I think it was hard not to, but eDiscovery probably doesn't pop up in people's minds when you mention those two cases to them, or to just about anyone. But I'm guessing it did for you.
Why was that the case?
Doug Austin (04:59.694)
Well, really, at this point, with just about any case like this, I think about electronic evidence first. I can't watch true crime shows like 48 Hours or Dateline or any of those anymore without being interested to find out what sort of electronic evidence they're going to have that's going to ultimately lead to identifying the murderer or what have you, because we have such an extensive digital trail that these cases
almost always today involve some sort of electronic evidence that leads to identifying who the killer was. These cases are completely different from that, but they still involve a lot of electronic evidence. I mean, Depp-Heard involved a lot of videos and pictures and text messages, and the Epstein investigation, a government investigation,
involved a huge amount: I think nearly three and a half million documents, 180,000 images, 2,000 videos. You know, massive scale, a lot of electronic evidence. And whether it's true crime, whether it's
basically a defamation case, which was kind of sort of like a divorce case, just high profile, with defamation issues coming into play, or a huge government investigation, it still involves electronic evidence. And for me, that's one of the things I think about first. And I think about the issues that you deal with. You know, in the Epstein case, privacy is a huge consideration. And one of the things we don't talk about enough is the importance of redacting
evidence, because this being such a high-profile case, a lot of this evidence has been made available to the public, but they had to make sure to redact the identities of the victims in images and videos and documents, what have you. That's got to be a huge undertaking. And in the Depp-Heard case, it was not just about the evidence as it appears, but also the underlying
Doug Austin (07:12.588)
metadata associated with that evidence, which came into discussion in a couple of instances in that case. So all those types of cases have those kinds of considerations. And that's what makes eDiscovery interesting. It's just not the same thing time after time. The evidence is different. The issues and the requirements of the case are different. And how you handle the evidence is different.
Mike Hamilton (07:39.231)
To me, both those cases brought out a couple of different themes that I'm curious to get your feedback on. With the Epstein files and the production, were you surprised that things weren't redacted? I mean, we're so deep in eDiscovery land, right? Like, how common is it to miss redactions, you know,
and all the other kinds of eDiscovery complexities that were potentially missed there?
Doug Austin (08:12.95)
Unfortunately, they're more common than we like to think. In fact, you know, the, what do I call it, redaction fail types of cases we've seen, in addition to this case where, mind you, they did have a monumental task of trying to make sure everything was redacted. And it's probably understandable that some instances were missed, which is obviously horrible, because that then
puts people's names out there in the public, which is a horrible thing. But I've seen, I think I just recently covered a case where, once again, somebody put a document into Adobe where they did an overlay-type redaction, and people know that you can just highlight the text and find the text underneath that overlay. That's a 101 type of redaction
error, and people still make it all the time. So it doesn't surprise me, and unfortunately it just continues to scream the need for a really good understanding of eDiscovery fundamentals, things like understanding when data is properly redacted, how to do it, and how to make sure you can confirm that it is redacted.
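To make the overlay pitfall Doug describes concrete: a black box drawn over text leaves the text intact underneath, while a true redaction removes it. Here is a minimal sketch of the latter, assuming the PyMuPDF library; the file name and search term are placeholders, not from any actual matter.

```python
# Sketch: applying and verifying a true redaction with PyMuPDF
# (pip install pymupdf). File name and search term are placeholders.
import fitz  # PyMuPDF

doc = fitz.open("production_document.pdf")
for page in doc:
    # Find every occurrence of the text to be redacted
    for rect in page.search_for("Jane Doe"):
        # A redaction annotation marks the area; nothing is removed yet
        page.add_redact_annot(rect, fill=(0, 0, 0))
    # apply_redactions() actually deletes the underlying text,
    # unlike drawing an opaque rectangle over it
    page.apply_redactions()

doc.save("production_document_redacted.pdf")

# Verify: the redacted text should no longer be extractable
check = fitz.open("production_document_redacted.pdf")
assert all("Jane Doe" not in page.get_text() for page in check)
```

The verification step at the end is the point of the example: a proper redaction workflow confirms the text can no longer be extracted, which is exactly the check the overlay approach fails.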
Mike Hamilton (09:33.703)
Yeah, and it seems like there's a much bigger microscope put on these eDiscovery-type activities now. But, and maybe I'm just not clued in to the legal market specifically, I don't think the legal market is making that connection, because to me those seem like very direct cause-and-effect eDiscovery processes and best practices. When you're out there
talking to other people, are you seeing that they really understand these connections, and that taking eDiscovery a little more seriously is warranted?
Doug Austin (10:16.078)
Well, I mean, I think there are two groups of people. There's the eDiscovery bubble, you know, the people who really have taken eDiscovery seriously and tried to learn about it and learn the best practices and what have you. But then, in the legal community, there's an extensive group of folks who don't know that much about eDiscovery.
They could be sole proprietor types of attorneys, could be anybody, who knows? I mean, there are 1.3 million attorneys in the US. And so when you think about it, not all of them are going to, unfortunately, understand the best practices. And it's just as important for the small cases as it is for the really big cases. There's still electronic evidence involved. There are still best practices that come into play. So, yeah, I think
we're making headway. There's always more work to do, which is why all the things that Exterro does with education, and what I try to do, and what others try to do, every little bit helps. And I think it's going to be one of those continual battles. We're going to continue to try to educate people and hopefully minimize that group that doesn't know those best practices and fundamentals they need to know.
Mike Hamilton (15:25.992)
Doug, if you were an organization, what data would suddenly be in scope that most teams aren't thinking about? And I think this really showed up a lot in the Depp-Heard case, right? Even though this was between two actors, there was some data that surfaced that they probably didn't think was in scope, or that could be found.
Doug Austin (15:49.911)
Yeah, well, I mean, so, you know, there were obviously the communications between them and things they used to document what happened in their marriage and what have you. So, you know, obviously text messages, you know, couples text each other all the time. Obviously, some of those text messages were at issue, and that basically puts mobile device data at issue. Also, because every mobile device has a camera,
a lot of pictures, a lot of videos. So that was a very mobile-device-intensive case, because that's where a lot of our communications, a lot of our activity, is spent these days. I think many of us get a weekly report of how much time we're on our device each day on average, and it's usually multiple hours for just about any of us. So we live on these devices, and in that case they were key.
What was also key is the underlying data associated with some of this evidence, the metadata. I covered, I wrote about the case a couple of years or so ago when it was going on, because one of the notable things about that famous Amber Heard picture where she showed bruising was that the metadata showed it had been saved in a photo editing program called Photos 3.0.
Which of course, you know, doesn't guarantee it was edited, but it doesn't look good when it's saved in a photo editing program. So that's the type of stuff, that underlying metadata, you know, that goes with the evidence. I think that case really highlighted that consideration more than we see in cases now. And then of course, these days, that case was probably a little early for it,
but we're seeing more and more cases involving AI-generated evidence, you know, chat logs with ChatGPT or Claude or what have you. It seems you almost can't get on a Zoom meeting these days without somebody throwing their AI recorder or note taker into it. I even do Zoom webinars and they pop into those. And then all the artifacts you get from Copilot. So there's just...
Doug Austin (18:12.866)
there are always new types of data, new types of evidence we have to account for, which I guess gives us job security, but certainly keeps us up at night too.
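The Photos 3.0 detail Doug mentions is the kind of thing that lives in a file's EXIF metadata. As a rough illustration, assuming a standard JPEG and the Pillow library, a reviewer could surface the editing-software tag like this; the file name is a placeholder.

```python
# Sketch: reading the EXIF tags that record what software last saved
# an image and when (pip install Pillow).
from PIL import Image
from PIL.ExifTags import TAGS

img = Image.open("exhibit_photo.jpg")  # placeholder file name
exif = img.getexif()

for tag_id, value in exif.items():
    name = TAGS.get(tag_id, tag_id)
    # "Software" is where a program like "Photos 3.0" would show up;
    # "DateTime" records when the file was last modified
    if name in ("Software", "DateTime"):
        print(f"{name}: {value}")
```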
Mike Hamilton (18:23.166)
100%, and I guess it depends on what forum you're in, right? But if you're in the civil context, some of that may be unduly burdensome, but I guess if you're in the criminal context, it's fair game, right?
Doug Austin (18:35.81)
Sure. Yeah, absolutely. And I mean, I think each case has a different kind of consideration as to what evidence might be important and might be proportional. You know, in a case like Depp-Heard, like I said, mobile device data is probably a majority of the relevant data involved in the case, so that's obviously where you have to get into that. Whereas maybe in a corporate case involving,
you know, communications, maybe those mobile devices are less important. Plus then, of course, you might have BYOD considerations, where the organization might not even be responsible for those devices. And we've covered, on eDiscovery Today, a few cases where parties weren't found to have possession, custody, or control of BYOD devices because they had a policy in place that established those limitations. And in a couple of the cases,
the party also third-party subpoenaed the employees for the data on those devices. In one case they didn't, and they paid the price when the court said the data wasn't within the possession, custody, or control of the defendant. So yeah, you get those types of considerations in the different types of cases. One type of case that tends to be big from a forensics standpoint is IP cases, where
an employee leaves an organization and either goes to a competitor or starts their own company. Did they take data? Did they not take data? You get into forensic examination of the existing devices they had at the organization, and the new devices they have, which could be mobile devices, could be computers. So the different types of cases involve different types of, not only data you go after,
but how you go after it. And I think that's one of the things that continues to make eDiscovery such a diverse discipline, that's the word I'm looking for, because case to case, the data and how you go about getting it is always different.
Mike Hamilton (20:52.958)
One question I have as a follow-up to that, Doug. We've talked a lot about mobile and some of these new data sources. I think we saw this shift when we went from paper to email, right? Practitioners kind of getting up to speed on, how do I actually collect email? I'm used to paper. What do you see from a practitioner perspective on their ability to
effectively collect these new types of data, and how are the courts really treating that?
Doug Austin (21:31.065)
Well, you know, it's interesting, because we have so many different collaboration and communication apps out there now. We used to primarily communicate through email. Now we communicate through text with our mobile devices. We communicate within our organizations, within Slack, within Teams. You know, a lot of organizations have both Slack and Teams. And then there are just
so many other different collaboration apps with data that you're sharing and working on and what have you. And all these cloud-based systems, where that data has basically become data that we're pointing to, instead of just putting into a communication, which of course means that underlying data can change from when the person saw it, when that communication was sent.
So these are the types of issues that I think we're struggling with now. I think just like when email first came along, and we struggled with exactly how to work with it in the best way possible, and platforms evolved over time and we standardized workflows, we're starting to do that with short-message-format types of systems, with mobile device data, with
Slack, with Teams. We're standardizing workflows. Platforms have APIs to go get that data and do it in a standardized way, to upload the data and have it be in the best format possible to then work with and make determinations in discovery. The thing is, you always feel like you're chasing something, because just as soon as you get a workflow established for these platforms,
along comes a new platform and you've got to do the same for that. So it's kind of a never-ending battle, and you definitely have to be working with a platform that can be flexible, that consistently stays up on what the common data sources are out there, and that provides standardized workflows and automation capabilities to get that data in in a useful format when you're dealing with it in discovery.
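As a concrete illustration of the API-based collection Doug describes, here is a minimal sketch of pulling a channel's history through Slack's public Web API. The token and channel ID are placeholders, and a real collection tool would also capture threads, edits, and attachments.

```python
# Sketch: paginated export of Slack channel messages via the
# conversations.history Web API method (pip install requests).
import requests

TOKEN = "xoxb-..."       # placeholder token with channels:history scope
CHANNEL = "C0123456789"  # placeholder channel ID

def collect_channel_history(token: str, channel: str) -> list[dict]:
    """Pull all messages from a channel, following pagination cursors."""
    messages, cursor = [], None
    while True:
        params = {"channel": channel, "limit": 200}
        if cursor:
            params["cursor"] = cursor
        resp = requests.get(
            "https://slack.com/api/conversations.history",
            headers={"Authorization": f"Bearer {token}"},
            params=params,
        ).json()
        messages.extend(resp.get("messages", []))
        # Slack returns a next_cursor until the channel is exhausted
        cursor = resp.get("response_metadata", {}).get("next_cursor")
        if not cursor:
            return messages

history = collect_channel_history(TOKEN, CHANNEL)
print(f"Collected {len(history)} messages")
```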
Mike Hamilton (23:47.452)
A big point that I think the listeners should really
take seriously is that this is all within the discovery parameters, right? There isn't a, well, it's too hard for me to get to that, or, I don't know how to do that. Would you agree with that, Doug? That even though there are all these data sources, and it's becoming increasingly difficult to keep up, there is still that duty to keep up?
Doug Austin (24:22.062)
Absolutely. And you know, I'm not sure if I addressed the court part of your previous question, but yeah, there's plenty of case law now where parties have been required to go after that data, to go after the Slack data, the Teams data, what have you. So yes, this is something courts are expecting. Again, Rule 26(b)(1) of the
Federal Rules of Civil Procedure deals with proportionality, and obviously you have to establish the proportionality of it. But in many cases, more and more cases, that data is becoming the critical data in the case. So then it's really pretty easy to make a proportionality argument that, yes, this is proportional to the needs of the case, and you've got to have a way of going through it and potentially producing responsive
data from these systems in cases.
Mike Hamilton (25:23.696)
And in the Depp-Heard trial, context really drove impact, with threads, tone, timing. How does eDiscovery reconstruct that kind of narrative? Can you give me some insight into what you're seeing with legal teams and how they're constructing this narrative? Is it still very manual? Is it automated? As we saw, context means everything. So
these little bits and pieces, like what you just touched on about the photo editing software, case in point, it all matters.
Doug Austin (26:55.434)
So yeah, absolutely. You know, I think there's technology that helps with that. With AI, we're beginning to see the ability for it to do things such as sentiment analysis, to determine when people are angry or, you know, frustrated or what have you, and that can help with context. And again, sometimes it's the metadata that can impact that.
So I think it's probably a little bit of technology helping us, and then understanding how to use the technology, and then, you know, supplementing that with best practices. There was a case a few years ago, speaking of context, that involved fabrication of a text message, the Rossbach case. Basically, this lady was suing
her former employer for sexual harassment, and the supervisor had sent what looked like an incriminating text message. It turned out she produced an image of it, and the image she produced had a heart eyes emoji associated with it that reportedly, according to her, came from her iPhone 5. And it was determined that
that particular version of the heart eyes emoji wasn't possible on the iPhone 5, because it didn't run iOS version 13. So that's the case of the smoking emoji. You know, emojis are a great example of context. You can say a statement one way and it can be taken as, like, a threat, and then you could put a smiley face on it and it's taken completely differently. So that's another example of how
data, and, you know, things like emojis and things like sentiment analysis, can be used to really get a sense of the context of the evidence, not just evaluating it at face value, but using these different kinds of cues to determine just exactly what that evidence really means.
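To make the sentiment analysis point concrete, here is a minimal sketch, assuming the Hugging Face transformers library and its default sentiment model, of scoring a batch of messages so a reviewer can prioritize hostile or emotional threads. The messages are invented examples.

```python
# Sketch: flagging message tone with an off-the-shelf sentiment model
# (pip install transformers torch).
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # downloads a default model

# Invented example messages standing in for collected communications
messages = [
    "Thanks for sending the report, looks great.",
    "If you mention this to anyone, you'll regret it.",
    "I can't believe you did this to me again.",
]

for msg, result in zip(messages, classifier(messages)):
    # Each result has a label (POSITIVE/NEGATIVE) and a confidence score;
    # strongly negative messages can be routed to reviewers first
    print(f"{result['label']:>8} ({result['score']:.2f})  {msg}")
```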
Mike Hamilton (29:06.782)
Let's move over to the Epstein-related documents real quick. A lot of older data resurfaced years later. Now, I don't want to get into the legality of, like, should that data have been kept or not, but let's apply this to an enterprise specifically. For enterprises that are trying to reduce liability risk, what does it say about data retention and long-term exposure? And how are companies underestimating
how long data really sticks around?
Doug Austin (29:40.419)
Yeah, I mean, I think a case like this, because you're not talking about a huge organization, you're talking about a financier who, I'm sure, had a number of contacts and what have you, but three and a half million documents, or three and a half million pages, that kind of indicates there was likely not much of a data governance program being operated under. And absolutely, that's one of the things you have to
keep in mind, because for organizations that don't treat data governance seriously and establish retention periods for their data and what have you, not only does that data potentially become subject to discovery in litigation, but it's also data that could be exposed in a cyber attack and things of that nature. There are so many reasons to have your data house in order.
And I think organizations have a lot better understanding of that now than they used to. And I think they're using technology to help with that. For a lot of the systems they use, they're implementing things like automated deletion when a retention period expires. As well as then, of course, what's important is, once you have something like litigation, suspending that,
and automating the legal hold process, or hold in place, or what have you. That's all part of the equation that has to be considered. So I definitely think organizations understand a lot more, but there are challenges. One of the big challenges we have in organizations today is shadow IT and shadow AI: people using systems that aren't approved and creating data that could be discoverable,
and the organization isn't on top of it. So it's always a challenge of trying to make sure you understand what people are doing. And I think that's one of the things you have to do up front. You have to not only establish a data governance program and establish retention periods for your data, but you have to really take steps to try to understand as much as you can about
Doug Austin (32:00.483)
your data landscape out there and what people are doing, and make sure that you're aware of the risks that are associated with it.
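A toy sketch of the retention-plus-hold logic Doug outlines: automated deletion runs on schedule, but anything under a legal hold is excluded before the sweep. Everything here (the record structure, the retention periods) is hypothetical, purely to show the shape of the control.

```python
# Sketch: a retention sweep that honors legal holds before deleting.
# Data model and retention periods are hypothetical illustrations.
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Record:
    record_id: str
    category: str       # e.g. "email", "hr_file"
    created: date
    on_legal_hold: bool

# Hypothetical retention schedule, in days, per record category
RETENTION = {"email": 3 * 365, "hr_file": 7 * 365}

def sweep(records: list[Record], today: date) -> list[Record]:
    """Return records eligible for deletion: past retention, not on hold."""
    eligible = []
    for r in records:
        expiry = r.created + timedelta(days=RETENTION[r.category])
        # A legal hold suspends disposition regardless of the schedule
        if today > expiry and not r.on_legal_hold:
            eligible.append(r)
    return eligible

records = [
    Record("A1", "email", date(2019, 5, 1), on_legal_hold=False),
    Record("A2", "email", date(2019, 5, 1), on_legal_hold=True),
]
print([r.record_id for r in sweep(records, date.today())])  # ['A1']
```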
Mike Hamilton (32:11.112)
Doug, if we remove the celebrity factor, how similar are these scenarios that we talked about within Depp-Heard and Epstein to what enterprises deal with within litigation, within internal investigations, within subpoena requests, right? How similar is this?
Doug Austin (32:32.846)
You know, I think, let's take the Epstein one first. If you took the public aspect away from that one, obviously, if this were a case that involved the same kinds of topics, you'd still have redaction requirements, but you'd probably have
some protective orders in place. You'd have certain designations to limit access, attorneys' eyes only and what have you. Yes, you'd still want to redact and do the things you can do to protect privacy. But when it's between parties, it's not as big an issue as it is in something that's as public-facing as this case. So that's definitely one consideration I would say would be different: the public aspect of what's out there and how important it is to maintain privacy
and manage that data. In Depp-Heard, without them being famous actors, it's pretty much almost like a divorce case. And we see those types of cases, there are probably hundreds of thousands of them every year. There are certain types of data
where you've got to make determinations. You know, parties might be trying to make the other one look bad, or it may just be data that involves, you know, dividing assets in a divorce or what have you. Those issues, I think, remain the same. And obviously, the considerations of evaluating that and making determinations, I don't think that differs as much, because I think a lot of those considerations would be the same.
If there were a defamation case involving two people who weren't widely known, I'm sure you'd have some of the same things, it just wouldn't be on Court TV being shown every day, because cases like that happen all the time.
Mike Hamilton (34:31.634)
Yeah, it's just like an employment-related matter, right? It's kind of a he said, she said, almost the same fact pattern. You have a company saying something, you have an employee saying something, and we're going to try to get the context of what really happened. So I think there are ripple effects. You know, I catch myself doing this all the time. I'll be like you, I'll be watching the news, like, what are the eDiscovery things happening in that case? And
a lot of those things do bubble up now: how did this get out there, or why did it take them so long to review this many documents? We saw a lot of that with some of the back and forth with the government in reviewing certain files and how long that discovery would take. I remember them saying it would take months, and you and I both know, if you have the right technology, you can really boil that down
and streamline that.
Doug Austin (35:33.487)
Absolutely, yeah. You know, in the Epstein case, a lot of videos, a lot of images. Until fairly recently, we really didn't have the technology to quickly get a better understanding of what's in those. Obviously, we have transcription capabilities, AI-based transcription, so we can quickly figure out what's contained in videos that might be useful to us. From an image standpoint, I think one of the
most under-discussed capabilities of AI, one that really is a significant benefit to eDiscovery professionals, is the ability for AI to look at an image and tell you what's in it, giving you the ability to very quickly triage which images, you know, can I just cast aside immediately and say, okay, these aren't going to be important to look at, I won't even have to open them up, and these might be. And
I think this was probably a couple of years or so ago, when ChatGPT added their image capability, I just took a picture of my cluttered kitchen counter, and it identified all the items it saw, right down to the Nutella jar and stuff like that. It's amazing just how good the technology is these days and how you can apply it.
Mike Hamilton (36:44.488)
Yeah.
Doug Austin (36:53.9)
They may well have applied it here. Obviously, the stakes were very high, so they still had to make sure they got the job right. But there's really no excuse for not exploring technologies like this: applying AI to transcribing videos and understanding what's in them, identifying the contents of images, and using that to help streamline the process. It just seems like a no-brainer to me.
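For a sense of what that image triage might look like in code, here is a rough sketch using an open zero-shot image classifier (the CLIP model via Hugging Face transformers). The labels and file names are invented, and a production workflow would obviously need validation.

```python
# Sketch: zero-shot triage of collected images with CLIP
# (pip install transformers torch pillow).
from transformers import pipeline
from PIL import Image

classifier = pipeline(
    "zero-shot-image-classification",
    model="openai/clip-vit-base-patch32",
)

# Invented labels describing what reviewers care about in this matter
labels = ["a document or receipt", "a screenshot of a text conversation",
          "a photo of people", "a landscape or scenery photo"]

for path in ["img_0001.jpg", "img_0002.jpg"]:  # placeholder file names
    scores = classifier(Image.open(path), candidate_labels=labels)
    top = scores[0]  # results come back sorted by score, highest first
    # Low-priority categories (e.g. scenery) can be set aside unopened
    print(f"{path}: {top['label']} ({top['score']:.2f})")
```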
Mike Hamilton (37:18.556)
Let's dive into AI a little more. You've brought it up a couple of times already, but what's actually standing out for you in the legal space when it comes to artificial intelligence? What do you feel is delivering real value for legal teams today, and what do you feel is a little bit overhyped?
Doug Austin (37:40.655)
Yeah, I mean, we can't do this without talking about AI, right? So, you know, I think what's not being talked about enough is all the different use cases you can conceivably apply AI to. The last couple of years, I've done the State of the Industry report, and I've asked a question: what use cases are you applying generative AI to? And, you know, it's things like
Mike Hamilton (37:45.842)
Yeah.
Doug Austin (38:07.95)
document classification for relevancy, document summarization, ECA, PII and privilege identification, stuff like that. And people are using it for those things. And I think that's one of the things that is so unique about generative AI: the possibilities are really endless. I continue to hear all the time about
how it's being applied in ways where I'm like, hey, I hadn't thought of that, that's a really cool application of the technology. So, you know, I've heard of the ability to do things like run a prompt that says, go find financial documents, and not only give me the data back, but put it in a little table that I could throw into Excel. Stuff like that, really amazing stuff.
And, you know, I think there are certain no-brainer use cases, like using it for ECA, where there's not a defensibility burden; it's just basically trying to get a better understanding of your collection and things like that. From the standpoint of document classification, I think people, just like was the case with TAR, are all wrapped up in, you know, how can we make sure it's defensible and what have you. And I think the same best practices that apply to TAR apply to GenAI, in terms of the ability to
validate the results and what have you, to ensure that you've got a defensible result. So I think one of the things that has to be said about the technology is that it's not an easy button, and a lot of people want to take it that way. The technology capabilities are great, but you still have to apply best practices along with that technology to use it the right way.
And that's what I think is probably not being talked about enough. People are hyping the capabilities, but they're not hyping the importance of understanding how to use those capabilities in the right way.
Mike Hamilton (40:16.328)
Definitely. I want to piggyback on that point and talk a little bit more about AI governance. You know, we've talked about hallucinations, and there's also the question of where that data is actually going, right? When you prompt something within an AI tool, or you upload something. I think we've seen cases, you can probably cite the case for me, Doug, but I think it was one of the big tech firms, where they uploaded some
code into one of the chatbots. And now, is this in the public domain? Can anyone use this? So talk to me a little bit more about AI governance, and what you feel are the things that attorneys and legal professionals, because they're a little skeptical, right, need to be watching out for when they're looking at what AI they should be using,
and how they should be restricting their use, and ultimately the enterprise's use, of AI.
Doug Austin (41:19.724)
Yeah, and I think you mentioned some really great examples. Well, we'll take hallucinations as an example. Obviously, I think we're up to well over 1,300 hallucination cases in the Damien Charlotin database tracking cases with hallucinated filings, or filings suspected of AI use. And I mean, some of these have been from major firms.
So, I think when it comes to AI governance, there's education. People need to understand what these tools can and can't do, and understand the importance of verifying the results, because that's always an important thing to do. But I think they also need to standardize it within their workflows, because
even if you know it, you know what happens: sometimes people get in a hurry and they cut corners, they skip steps, they're like, oh, I've got a deadline, I've got to get this out, and they just skip that step. And then you've wound up submitting a filing that has hallucinated cases in it, or hallucinated facts, or what have you. So I think the workflows are a part of that, making sure that
you're standardizing and ensuring that you follow those standardized workflows. I also think you have to have policies about how to use these tools and what's acceptable and what's not acceptable. Because, yeah, I'm sure there are plenty of organizations out there, especially with shadow AI on the rise, where some employee has probably loaded some sensitive documents up there
to get ChatGPT or Claude or what have you to analyze those documents and provide some feedback. And that's definitely not something you want to have happen. We just had a case recently where the court basically said the parties can't load any documents involved in the litigation into public LLMs like ChatGPT or Claude. They can only use closed
Doug Austin (43:40.271)
types of platforms, like eDiscovery platforms that have AI capabilities, for that. And I think that makes a lot of sense, because I think you have to be careful about where you're putting that data and what can be done with it. I think organizations have to be mindful of that. I think a very standard aspect of litigation today is protective orders or other agreements between parties that say, you won't take the data I produced to you and upload it to a public LLM.
Mike Hamilton (43:40.414)
Yeah.
Doug Austin (44:09.356)
I think parties have to protect themselves against that, because that's something that wasn't a factor as much just a couple of years ago, but now it's something that probably happens pretty commonly. And I think a lot of organizations and a lot of legal teams have said, we've got to make sure that's an understanding we establish up front, in a protective order, in the ESI protocol, or even just in communications with the parties, and it may be multiple of those.
You establish and reestablish that this data has been produced to you and is not to be put into a public LLM, where it could then be used to train the model and conceivably be out there.
Mike Hamilton (44:49.916)
Yeah. And looking at AI that is built with those parameters in place, right? Not leveraging something like the free ChatGPT, but upgrading to the business license, right, where anything you upload is going to stay behind your firewall. I do want to give a shout-out to the eDiscovery community, because I feel we've
been at the forefront of AI for a while. You go back to Da Silva Moore, that was in what, 2012, right? And Judge Peck's been talking about this for almost 15 years. And even though it's not GenAI, let's still consider what we had back in the day artificial intelligence. So for anyone that's speaking on it from an eDiscovery perspective, or has been in the industry for a long time, kudos to all of you for being at the forefront of all of this, right?
Doug Austin (45:30.008)
Yeah.
Mike Hamilton (45:48.936)
So.
Doug Austin (45:49.391)
Yeah, I mean, TAR is another form of AI. We genericize AI, but there are so many different ways and applications of AI, what we had back then, what we have today. But yeah, absolutely. I mean, we've been using AI as long as anybody has in the legal community. So eDiscovery has been at the forefront of that.
Mike Hamilton (45:55.326)
All right.
Mike Hamilton (46:16.318)
So I just have a couple more questions for you, Doug. Thank you for sticking with me; it's been a really great conversation. Let's look ahead. We've seen in the coding and software development world that AI has been writing code. I listened to a podcast episode where Anthropic is using AI agents to code, but then they're having AI agents supervise the coders.
And they have supervisors of the supervisors that are agents as well. So it seems like things are moving in that direction. Let's talk about it in the context of legal: eDiscovery, risk mitigation, all these different types of legal workflows. What does autonomous AI look like in these contexts, and what are you most excited about, and what are you worried about?
Doug Austin (47:08.302)
I think from an eDiscovery and legal standpoint, autonomous AI looks like a managed process, where you have certain processes, or things that are repeatable, that you can apply autonomous AI to, but with humans in the loop, a human-managed process. I mean, case in point, you could say that
the automated classification of documents, to identify whether they're responsive or non-responsive, or at least make a recommendation, that's essentially an agentic process, but it's human-managed. You know, a lot of people's best practice is prompt iteration: you create a prompt, you run it through a sample of documents, you evaluate it, you make some adjustments, and sometimes you have multiple iterations to get the prompt where you want it.
You then apply it to the document collection, and then you still validate on the back end. And a lot of these are human-based processes around an automated process. I think that's always got to be the case when we're talking about legal, because the stakes are so high, and no technology is ever going to be perfect. You still have to have some processes in place so that you can defend your approach, in terms of how you've
arrived at the data that's being produced, or the data that you've identified for redaction, or whatever the case might be. And so to me, I think it's always a mix. There are certain processes you can identify and pinpoint and say, this is a good one, I can apply an agentic process to this. And really, it's not so much different from before AI. We still had
agents, or automated things we applied to various processes, that took certain steps automatically. Now we're just doing it with AI. So the key is just understanding what it will do, making sure you're prepared for the results, and, always, trust but verify. Don't take any output at
Doug Austin (49:27.074)
face value; do some validation checking on it, and make sure it's a defensible result that you can stand by.
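The back-end validation Doug describes usually comes down to a statistical sample: pull a random set of documents the model coded, have humans review them, and estimate recall and precision. Here is a bare-bones sketch of that arithmetic, with invented numbers and a simulated classifier, assuming human labels are the ground truth.

```python
# Sketch: estimating recall/precision of an AI classifier from a
# human-reviewed validation sample. All data here is simulated.
import random

random.seed(42)

# Simulated corpus: (doc_id, ai_says_responsive, human_says_responsive)
corpus = []
for i in range(10_000):
    human = random.random() < 0.3
    # Simulate an imperfect classifier that agrees ~90% of the time
    ai = human if random.random() < 0.9 else not human
    corpus.append((i, ai, human))

# Draw a random validation sample for human review
sample = random.sample(corpus, 500)

tp = sum(1 for _, ai, human in sample if ai and human)
fp = sum(1 for _, ai, human in sample if ai and not human)
fn = sum(1 for _, ai, human in sample if not ai and human)

recall = tp / (tp + fn)      # share of truly responsive docs the AI found
precision = tp / (tp + fp)   # share of AI "responsive" calls that were right
print(f"recall={recall:.2%}, precision={precision:.2%}")
```

The same sampling logic underlies TAR validation protocols, which is Doug's point: the defensibility comes from the human-run measurement around the automation, not from the model itself.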
Mike Hamilton (49:34.908)
Last question for you, Doug. If someone listening today was looking to reduce their legal risk, what should they go check on tomorrow? What's the first thing they should be checking on tomorrow?
Doug Austin (49:48.879)
That's an interesting one, because I want to say a couple of things. But certainly, when it comes to reducing legal risk, it's understanding your data. I mean, we've talked so much about the technology, but technology can't work without the data. So the more you understand about your data, the better. That's having solid data governance policies in place,
having technology to support that, and continuing to try to stay up on what the people in your organization may be using. As you know, with shadow IT and shadow AI challenges, there may be data sources out there in your organization you're not aware of. So I would say that's probably step number one. And then obviously step number two is to
have processes and plans in place for how you handle litigation, investigations, what have you, before the case is filed, before the investigation is launched. Because once you're trying to figure it out after the case has been filed, you're already behind. So you have to be prepared for, well, how would I handle a case like this, and what processes and procedures do I need to have in place so that when that case comes along, I know:
step number one is look at these data sources, look at these potential custodians, do this, do that, put the legal hold in place, all that sort of stuff. You need to have those defined up front, not once the case starts. So I'd give you two, not just one.
Mike Hamilton (51:32.072)
Well, Doug, it's always great talking with you, one of the great minds in eDiscovery. Thank you for taking the time to chat with me and be on the Data Xposure podcast.
Doug Austin (51:42.382)
Thanks, Mike. Great talking with you as always, and thanks for having me.