How to estimate legal costs from a data breach.

Ryan McGeehan
Nov 15, 2021

We need budget and headcount to mitigate risks. Larger risks should encourage more resources towards mitigation efforts.

Legal costs are a wild area of costs, alongside costs to the business and regulatory risks. A better understanding of legal uncertainties will help encourage mitigations that avoid them.

The legal costs following a data breach are often larger than expected, but they are most commonly understood in terms of settlements. Settlements get the headlines, but don’t paint the whole picture of damages worth avoiding.

Each section aims to develop intuition around one of the broad cost areas.

CTRL+F to jump to these topics

Disclosure Complexity: How much work is the initial disclosure?

A breach disclosure effort will include the following cost areas:

  • Legal Fees: Lawyers will advise on what disclosures are required. Industry regulation, contract commitments, international, federal, and state.
  • Employee Time: First, there will be effort in deciding how to deliver the news. Your organization might not be good at delivering news outside of your normal channels, or at targeting specific victim cohorts. (“50k accounts were accessed, 1k had data viewed, and 35 were compromised, so we need to email three cohorts in N jurisdictions”)
  • Both: A large group of comms, engineering, and lawyers will fight over the language of the actual disclosure. There may be meetings simply to debate word choices trading between human language and liability.

Are you under multiple disclosure requirements? This may add to (or multiply) the legal analysis, often with more headcount at a firm. You can approximate this by looking at the typical costs of other legal projects your company engages in and assuming about one typical legal advisory project per disclosure requirement.

Minimum Costs: Assume you’ll write a simple tweet, blog post, or blasted email. Maybe the work of a few people and a lawyer’s time.

Maximum Costs: Assume the worst case for all of these with fully staffed legal support. Worse, you may pay for your counsel to handle the technical burden of emailing victim cohorts. Your disclosure will be written for submission to multiple jurisdictions, under multiple laws and disclosure formats, translated for notification to all customers, with white-glove follow up where contracts are in place with customers, and direct disclosure to attorneys general and regulators. You can try referencing the largest legal project your company has engaged in and approximating costs from there.
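The minimum, per-requirement, and maximum estimates above can be sketched as simple arithmetic. This is only an illustration: the function name, its parameters, and the example dollar figures are hypothetical, and taking the larger of the two upper estimates as the maximum is my assumption rather than a rule from the text.

```python
def disclosure_cost_bounds(simple_notice_cost, typical_project_cost,
                           largest_project_cost, n_requirements):
    """Return (low, per_requirement, high) disclosure cost estimates.

    All inputs are hypothetical figures taken from your own legal spend.
    """
    # Minimum: a simple tweet, blog post, or email blast.
    low = simple_notice_cost
    # Roughly one typical legal advisory project per disclosure requirement.
    per_requirement = typical_project_cost * n_requirements
    # Maximum: anchor on the largest legal project the company has run
    # (assumption: use whichever upper estimate is larger).
    high = max(per_requirement, largest_project_cost)
    return low, per_requirement, high


# Example with made-up numbers: four disclosure requirements.
bounds = disclosure_cost_bounds(5_000, 40_000, 500_000, 4)
```

The three values give a low anchor, a middle estimate that scales with the number of requirements, and a worst-case anchor to carry into a wider cost model.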

Next: A matter of whether you’ll be sued or not.

Litigation Probability: Will it even happen?

A data breach can’t have major litigation costs if no one sues you.

A great paper (Empirical Analysis of Data Breach Litigation) gives us insight into how often data breaches result in litigation. It saw 3.7% of the data breaches in its dataset escalate into litigation. The paper suggests scenarios that would increase or decrease this risk.

This paper is losing its recency (published 2013 with 2000–2010 data). But, it is still relevant for approximations!

Of course, we have to consider the factors from our own company. For example, if we’re employed by a Fortune 10: It would be harder to believe that a class action won’t form when a Fortune 10 company is breached, or a major tech company, or one with millions to billions of users. Nowadays it is an exception if they’re not litigated following a breach.

There’s a fair argument that a legal battle is more likely in high profile circumstances with large consumer exposure. B2B breaches involving corporate customers don’t seem to have the same class action risk unless consumers are impacted. This argument isn’t supported by accessible research (to my knowledge) and would require elicitation.

Next, early settlement: Many cases are dismissed before any settlement or judgement. For instance, the plaintiffs in the Panera data breach class action walked away voluntarily, and I wasn’t able to easily find any other litigation that replaced it.

We’ve only considered the binary event of lawsuit / no_lawsuit so far. Our minimum costs here don’t involve litigation ($0) and our maximum costs see litigation that we still have to discuss.
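The lawsuit / no_lawsuit event can be sketched as a single weighted coin flip. The 3.7% base rate comes from the paper cited above; everything else here (function names, the seed, the number of trials) is a hypothetical choice for illustration, and you would adjust the probability upward for a high profile, consumer-facing company.

```python
import random

# Share of breaches that escalated to litigation in the cited paper.
BASE_RATE = 0.037


def lawsuit_happens(p_lawsuit, rng):
    """One simulated breach: True if it escalates to litigation."""
    return rng.random() < p_lawsuit


def litigation_frequency(p_lawsuit, trials=100_000, seed=0):
    """Fraction of simulated breaches that end up in litigation."""
    rng = random.Random(seed)
    hits = sum(lawsuit_happens(p_lawsuit, rng) for _ in range(trials))
    return hits / trials


def expected_litigation_cost(p_lawsuit, cost_if_sued):
    """Probability-weighted cost: $0 if no one sues, cost_if_sued otherwise."""
    return p_lawsuit * cost_if_sued


observed = litigation_frequency(BASE_RATE)
```

The simulated frequency should land close to whatever probability you feed in; the point is that every later litigation cost only applies in the draws where a lawsuit happens.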

Next: If we see litigation, how expensive is it?

Multiple Litigators: If litigation happens, how many litigators?

Data breach follow up may come from a single plaintiff or consolidated class action. A medium profile class action can develop million dollar costs just from the consolidation process into a single class action case. These are the administrative costs simply to handle the process of being sued. You might need representation to handle multiple plaintiffs. Counsel might need to appear for those cases before they consolidate which would be a cost to consider before litigation even begins.

A good large example is from Facebook. I can find about 36 lawsuits that consolidated into 100+ plaintiffs for the Facebook / Cambridge Analytica class action in glancing through court data.

The Equifax breach is a data point for the rarely occurring extreme amount of litigation sources. Equifax saw a flurry of small claims cases, hundreds of class actions filed, multiple Attorney General lawsuits, enforcement actions from the CFPB and FTC, and made an appearance before the Senate.

A quick sample of breaches that have seen litigation suggests they eventually have 1–3 (post consolidation) plaintiffs to deal with. Large organizations trusted with sensitive data have about 1–5 plaintiffs. Exceptionally vulnerable companies like Facebook see tens of lawsuits with hundreds of plaintiffs, and the rare haymaker like Equifax is far(!!!!) beyond this.

Class Actions: Becoming more likely with consumer tech.

The ability for class actions to form has increased as law firms are able to better target victims and encourage them to join in class actions. See the following example of advertisements on Facebook generated immediately after the Zynga data breach:

Discovery Costs: Ranging from zero to absurd.

Discovery is a pre-trial phase where parties are expected to produce evidence for each other. This is often discussed as eDiscovery in terms of automated discovery methods.

The costs associated with discovery efforts in litigation can be surprisingly overwhelming. This category alone can become the greatest area of expense. We can use reference data from this analysis from RAND, and the US Courts to help approximate these costs.

The model to consider involves the number of custodians and the amount of review effort spent on discovered data before it’s produced. What is eventually produced might end up being minimal and has no relation to the effort spent on discovery. Stated better, just because you produce a little, doesn’t mean you didn’t have to discover a lot.

At a minimum cost, you might not have any discovery effort at all, or only a trivial effort, such as discovery against an employee or two who copy and paste a document.

At a maximum cost, you can estimate that a large % of your employees, each with hundreds of gigabytes of email and disk, will be subject to discovery that requires analysis. The RAND paper shows multiple ways to approximate the cost. The costs are usually large but can skyrocket.

Some of these approximations include: per custodian, per gigabyte discovered, or per analysis hour by a law firm. If there are hindrances to deduplication or other tools that make discovery more efficient, costs spike as the work moves closer to manual review and away from products that provide automated eDiscovery.

Remember — this manual analysis is done by outside counsel. If they can’t find ways to reduce the discovery work, you’re paying for manual analysis done by lawyers and paralegals across significant amounts of gathered data.

If discovery even occurs, we have the option of randomizing # of potential custodians, gigabytes discovered, or legal hours of analysis. See the RAND paper for more.
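Randomizing those inputs could look like the sketch below. Every number in it is a hypothetical placeholder, not a figure from the RAND paper, and the uniform ranges are an assumption made for simplicity; substitute reference data and distributions you trust.

```python
import random


def sample_discovery_cost(rng, p_discovery, custodians, gb_per_custodian, cost_per_gb):
    """Draw one discovery cost. Each range argument is a (low, high) tuple."""
    if rng.random() >= p_discovery:
        return 0.0  # discovery never happens in this draw
    n_custodians = rng.randint(*custodians)
    gb_each = rng.uniform(*gb_per_custodian)
    rate = rng.uniform(*cost_per_gb)
    return n_custodians * gb_each * rate


def simulate_discovery(trials=10_000, seed=1, **ranges):
    """Return sorted simulated discovery costs."""
    rng = random.Random(seed)
    return sorted(sample_discovery_cost(rng, **ranges) for _ in range(trials))


# Hypothetical placeholder inputs for illustration only.
HYPOTHETICAL = dict(
    p_discovery=0.5,
    custodians=(2, 200),
    gb_per_custodian=(1.0, 300.0),
    cost_per_gb=(1_000.0, 20_000.0),
)

costs = simulate_discovery(**HYPOTHETICAL)
median_cost = costs[len(costs) // 2]
p95_cost = costs[int(len(costs) * 0.95)]
```

Comparing the median against the 95th percentile shows how the cost can sit near zero in many scenarios and still skyrocket in the tail. The same structure works if you would rather randomize legal hours and an hourly rate.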

Settlement Costs: Highly variable depending on the business.

Settlement costs vary dramatically and I wanted to get a better sense of how these fluctuate.

I gathered almost 150 data breach settlements while writing this essay to help approximate the range of costs. They ranged from $50K to $148M, with a median of $1.6M and a mean of $13M. The larger numbers tended to belong to highly visible brands with massive customer bases (Uber, Anthem, Home Depot).
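A mean that sits far above the median points to a heavy right tail. One way to turn the two summary numbers into something you can sample from is to assume settlements are lognormally distributed. That is my assumption for this sketch, not something the settlement data was tested for. Under it, the median equals exp(mu) and the mean equals exp(mu + sigma²/2), which is enough to solve for both parameters.

```python
import math
import random

MEDIAN_SETTLEMENT = 1.6e6  # from the ~150 settlements described above
MEAN_SETTLEMENT = 13e6

# Lognormal assumption: median = exp(mu), mean = exp(mu + sigma**2 / 2).
mu = math.log(MEDIAN_SETTLEMENT)
sigma = math.sqrt(2 * math.log(MEAN_SETTLEMENT / MEDIAN_SETTLEMENT))  # roughly 2.05


def sample_settlements(n, seed=0):
    """Draw n hypothetical settlement amounts from the fitted lognormal."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


draws = sorted(sample_settlements(20_001))
sample_median = draws[len(draws) // 2]
```

A fit like this will occasionally produce draws outside the observed $50K to $148M range, so consider clipping or rejecting extreme samples if you want to stay inside what was actually seen.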

Note: The Facebook $5B settlement with the FTC happened while writing this essay.

Indemnification Costs: Your contract language may multiply costs.

You are on the hook for incident response costs for your customers if you have contracts with cyber breach indemnification language with them.

It’s likely due to sample bias that I have not personally seen this enforced in any incident I’ve ever worked. I almost always see this language removed. Here is example indemnification language:

Sample contract language from educause

If this is somehow relevant to your line of business, you’re likely to pay for multiple approximated incident costs if your breach impacts many customers. For instance, when Sendgrid had an incident, Coinbase was directly impacted. Assuming this clause existed, Sendgrid could have been on the hook for Coinbase’s costs. I’ve been unable to find a real world example of this clause being triggered, though it supposedly exists more in the .edu space.

Indemnity risk can be approximated after a straightforward analysis of contracts for that keyword.

Minimal cost: No such indemnity clauses.

Maximum cost: If they exist, it’s multiplicative for each customer that triggers incident response related to your own.
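The multiplicative effect can be sketched by counting the customers who both hold an indemnity clause and had to run their own incident response. The record fields and the flat per-customer cost below are hypothetical simplifications; in practice each customer's incident cost would be its own estimate.

```python
def indemnity_exposure(customers, per_customer_incident_cost):
    """Total exposure: one approximated incident cost per triggering customer.

    `customers` is a list of dicts with hypothetical boolean fields
    'has_indemnity_clause' and 'impacted'.
    """
    triggered = [
        c for c in customers
        if c["has_indemnity_clause"] and c["impacted"]
    ]
    return len(triggered) * per_customer_incident_cost


# Made-up contract review results for three customers.
example_customers = [
    {"name": "A", "has_indemnity_clause": True, "impacted": True},
    {"name": "B", "has_indemnity_clause": True, "impacted": False},
    {"name": "C", "has_indemnity_clause": False, "impacted": True},
]
exposure = indemnity_exposure(example_customers, 250_000)
```

Only customer A triggers the clause in this example, so the exposure is a single incident cost. With no clauses in any contract, the result is zero, which matches the minimal case.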

Trial Costs: Did you go to trial or not? Was it lengthy?

It’s widely cited that 90–95% of civil cases will settle before trial. You can find some reference class analysis using published statistics from state courts.

This means in those 5–10% of worst cases, you’ll have to further add to your legal fees.

Approximate trial length can be estimated from published court statistics as well. Median times from filing to disposition nationally are 8–10 months. Filing to trial median is 27 months. Some court policies aim for 100% of trial time to be under two years, and 90% of them to be within a year, and there’s analysis that shows these are roughly followed. There are exceptions based on the type of litigation you’re trying to approximate.

Additionally, expert witnesses are common in our field. Plenty of data about expert witness rates is out there and should be available from your lawyers.


Minimum cost: Did not go to trial.

Maximum cost: Around one to two years of trial and expert witnesses.
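A rough way to combine these pieces is to weight the cost of a trial by the chance that a case reaches one. The 5–10% probability and the one to two year duration come from the text above; the monthly legal fee and expert witness figures are hypothetical placeholders, and treating trial cost as fees per month times duration is my simplification.

```python
def trial_cost(months, monthly_legal_fees, expert_witness_cost):
    """Cost of a trial that actually happens (hypothetical fee structure)."""
    return months * monthly_legal_fees + expert_witness_cost


def expected_trial_cost(p_trial, months, monthly_legal_fees, expert_witness_cost):
    """Probability-weighted trial cost, given that litigation has occurred."""
    return p_trial * trial_cost(months, monthly_legal_fees, expert_witness_cost)


# Low end: 5% chance of trial, one year. High end: 10% chance, two years.
low = expected_trial_cost(0.05, 12, 100_000, 50_000)
high = expected_trial_cost(0.10, 24, 100_000, 50_000)
```

The minimum case remains zero (no trial). The two values here bracket the expected cost in the cases where litigation has already begun.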

Regulation: Temporary or permanent modifications to business.

I’ve written another essay to approximate the costs associated with a specific regulatory event here, and some redundancy will occur!

I reviewed the types of non-monetary regulatory penalties associated with the ~150 cases I studied. Below, I categorized the regulatory language you might see from similar regulatory litigation. You’ll likely not see all of them imposed in a regulatory event. Only a handful of the following punishments will be applied by a regulator or settlement.

1: Additional disclosure requirements: Regulators may demand that future incidents be disclosed to them in addition to your existing requirements. A minimal view will look at this as only an additional cost on future breaches. A large and complex organization with continuous incident response might need additional review to classify breaches as disclose-able incidents or not.

See Reportable Events in this HHS regulation.

2: Policy and Procedures: This ranges from updating and reviewing security policies to implementing a new security program from scratch. Some of the language around this is incredibly vague and non-specific. In worse cases, the organization is unfamiliar with the governance approach being imposed, or doesn’t already have a compliance org to implement it.

This should not be confused with the following “Hiring” section where specific roles are imposed on the organization. It’s likely you would need to retain or hire individuals to operate with these policies.

3: Misrepresentation: These are additional rules and penalties that apply when future claims turn out to be false. For instance: if another breach occurs after you’ve made another product claim about trust or security, consequences will already be agreed upon. See CVS Caremark.

4: One Audit: This may be an assessment of your security or privacy practices. See this Adobe settlement. Often this is a two-person, two-week engagement from a reputable security firm, unless the penalty assigns the following structure to audits…

5: Periodized Audits: Many of the regulations call for annual audits of your security or privacy practices. The 10 and 20 year penalties from the FTC usually call for this. See this Twitter settlement.

6: Prescriptive Solutions: Many regulations call out a small handful of specific technologies that need to be implemented. Multi factor, anti-spam, antivirus, and similar solutions are mentioned. These might be included as a result of operating with new policies and procedures (mentioned above) but might be tracked as an additional cost. These also seem to be imposed on data breach victims that don’t have much of a security program otherwise.

See the Sonic settlement (page 16) for a laundry list of prescriptions.

7: Awareness Training. I call this out specifically as it seems to be imposed independently of other penalties. It’s an oddly frequent prescriptive aspect of settlements. Awareness training is often times per-seat licensing and you can model costs as such. A $3-$25 per-seat range depending on volume licensing seems to capture the budget possibilities here very well. Or, you might hire and build an internal program. This would then be approximated as headcount to run it operationally. In either case, this cost scales considering the employee base of your company.

See the University of Mississippi settlement.
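The per-seat model for awareness training is simple arithmetic. The $3 and $25 bounds come from the range mentioned above; the function name and the example headcount are hypothetical.

```python
PER_SEAT_LOW = 3    # dollars per seat, low end of the range above
PER_SEAT_HIGH = 25  # dollars per seat, high end of the range above


def training_cost_range(seats):
    """Return (low, high) licensing cost for a given number of employees."""
    return seats * PER_SEAT_LOW, seats * PER_SEAT_HIGH


# Example: a hypothetical 1,000 person company.
training_low, training_high = training_cost_range(1_000)
```

If you build an internal program instead, replace this with a headcount estimate for the people who run it.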

8: Hiring. Many of these penalties require the designation of a CSO, privacy certified leadership, or vaguely defined designated employees to implement a security program. Employee cost is well explored territory. A recruiting sprint might not be, if you are time boxed by the regulator to create a team or role by a certain date. You’ll have to estimate headhunter fees (~20% of first year salaries), a recruiting project fee, or perhaps the diversion of internal recruiting resources away from the business’s regular hiring.

See the Ashley Madison settlement.
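The headhunter fee estimate can be sketched as a percentage of first year salaries for the roles a regulator requires. The ~20% rate is the figure mentioned above; the salaries in the example are made up.

```python
import math

HEADHUNTER_RATE = 0.20  # ~20% of first year salary, per the estimate above


def headhunter_fees(first_year_salaries, rate=HEADHUNTER_RATE):
    """Total recruiter fees for a list of imposed hires."""
    return rate * sum(first_year_salaries)


# Hypothetical example: a CSO and one privacy lead.
fees = headhunter_fees([250_000, 150_000])
```

This covers only the recruiting sprint. The ongoing salary cost of the imposed roles is a separate line in the estimate.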

The takeaway

Yes, data breaches are expensive 🙄. But why? I’ve published a few essays on breach costs.

  1. This review of legal costs (You’re reading it)
  2. Valuation of non-monetary penalties
  3. Estimating the $ of a security incident
  4. Imposed risk (The value of risk organizations)

Having a clear decomposition of incident risks can help communicate what resources you need to mitigate them. The intuition that follows will hopefully help your leadership secure what your security team needs to grow and tackle risk.