Not legal advice; educational only. Cases are described according to their public posture; settlements resolve allegations without an admission of liability.
The pandemic produced the cleanest data-miner cases ever filed, because it produced the cleanest dataset: the Small Business Administration published the Paycheck Protection Program loan data itself. Who borrowed, how much, under what entity. Pair that public file with one eligibility rule, and you have a repeatable fraud screen, which is exactly what a handful of analytics-driven relators built.
The rule that did the work
PPP eligibility turned on size. A borrower generally had to have no more than 500 employees, and the SBA’s affiliation rules required commonly owned businesses to count all of their affiliates’ employees toward that cap. A second, separate eligibility problem was undisclosed foreign ownership and control. Both are violations you can see in public ownership and loan records without ever setting foot inside the company. In ACFE (Association of Certified Fraud Examiners) Fraud Tree terms, these are false-statement eligibility frauds: each rests on a certification that was not true when it was made.
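The affiliation screen described above can be sketched in a few lines of Python. This is a minimal illustration, not any relator's actual method: the field names (`borrower`, `employees`), the `owner_of` mapping, and the sample entities are all hypothetical, and the sketch assumes common ownership has already been resolved from corporate registries, which is the hard part in practice.

```python
from collections import defaultdict

EMPLOYEE_CAP = 500  # general PPP size threshold discussed in the text


def flag_affiliated_groups(loans, owner_of, cap=EMPLOYEE_CAP):
    """Return {owner: combined headcount} for owner groups where each
    borrower is individually within the cap but the group total exceeds it."""
    groups = defaultdict(list)
    for loan in loans:
        owner = owner_of.get(loan["borrower"])  # hypothetical ownership lookup
        if owner is not None:
            groups[owner].append(loan)

    flagged = {}
    for owner, members in groups.items():
        total = sum(m["employees"] for m in members)
        each_within_cap = all(m["employees"] <= cap for m in members)
        if len(members) > 1 and each_within_cap and total > cap:
            flagged[owner] = total
    return flagged


# Hypothetical sample records; not real borrowers.
loans = [
    {"borrower": "Alpha Roofing LLC", "employees": 180},
    {"borrower": "Alpha Roofing East LLC", "employees": 210},
    {"borrower": "Alpha Roofing West LLC", "employees": 160},
    {"borrower": "Solo Bakery Inc", "employees": 40},
]
owner_of = {
    "Alpha Roofing LLC": "Holdco A",
    "Alpha Roofing East LLC": "Holdco A",
    "Alpha Roofing West LLC": "Holdco A",
    "Solo Bakery Inc": "Owner B",
}

result = flag_affiliated_groups(loans, owner_of)
print(result)  # {'Holdco A': 550}
```

A group that this screen flags is only a lead. Whether the entities are affiliates under the SBA rules, and what headcount each actually had, still has to be established from the records.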
The cases
Sidesolve, Inc., an analytics firm and not an insider, worked the public PPP data and surfaced a roofing company with eight affiliated entities that had each certified under the 500-employee cap separately, when their combined headcount put them well over it. The matter settled for \$9 million in late 2023 in the Northern District of Texas, and the data firm received \$1 million as the relator. One defendant, one public dataset, one rule, a seven-figure award.
GNGH2, Inc., the vehicle of New York lawyer David Abrams, turned the same approach into a prolific practice, filing PPP cases nationwide on public SBA loan data and public ownership records, targeting both affiliation violations and undisclosed foreign (often Chinese) ownership. Public settlement announcements include a string of resolutions:
| GNGH2 / data-miner PPP settlement | Amount | Reported relator share |
|---|---|---|
| YAPP USA Automotive Systems | \$14,208,496 | \$1,420,849 |
| Horn USA, Inc. | \$4,153,111 | |
| BWI-related entities (three Chinese-owned cos.) | >\$21.6M (2025) | >\$2.1M |
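As a quick arithmetic check, the relator shares reported in this article can be expressed as a percentage of each settlement. The sketch below uses only the two matters for which exact figures are given here (the \$9 million Sidesolve settlement and the YAPP row); the BWI figures are lower bounds, so no exact ratio is computed for them. The dictionary labels are illustrative.

```python
# (settlement amount, reported relator share), as stated in this article
settlements = {
    "Sidesolve / roofing company": (9_000_000, 1_000_000),
    "YAPP USA Automotive Systems": (14_208_496, 1_420_849),
}

# Relator share as a percentage of the settlement, rounded to one decimal.
shares = {
    name: round(100 * share / amount, 1)
    for name, (amount, share) in settlements.items()
}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
# Sidesolve / roofing company: 11.1%
# YAPP USA Automotive Systems: 10.0%
```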
The point is not any single number. It is the shape: a repeatable public-data method, run as a disciplined pipeline against a clean dataset and a clear rule, compounds into recovery after recovery.
The honest caveat
Two things keep this from being a get-rich scheme. First, the first-to-file rule is brutal here: public PPP data is reproducible by every other analyst, so only the first relator on a given borrower recovers, which is why these cases moved fast and quietly. Second, even with a clean dataset, the SBA-relief settlements skew heavily toward government-initiated matters, which the Justice Department reads as a sign that data-miner filings succeed less often than the headlines suggest. The dataset is clean; the litigation still has to be won.
The takeaway
The PPP cases are the modern template precisely because the dataset was public and the rule was bright-line. That combination, official public data plus a specific, identifiable eligibility requirement, is the same one that powered the cardiac-device case and the same one a public-data screen is built to exploit. Find the official dataset, find the rule it can be measured against, and the leads find themselves. The work, as always, is everything that comes after the lead.
Why the SBA cases were so reproducible
The reason a half-dozen analysts could run the same PPP play is worth dwelling on, because it is the data miner’s dream and the data miner’s curse at once. The eligibility rules were bright-line and public-data-testable: the 500-employee cap, the affiliation rule that aggregates commonly owned entities, the foreign-ownership disclosure requirement. None required a judgment call about clinical necessity or a defensible business interpretation; they were facts you could read off the SBA’s own file and the corporate registries. That is what let these cases survive a motion to dismiss where a pure-statistics theory would not.
The curse is the same property in reverse. Because the analysis is reproducible, the next analyst can run it too, and the first-to-file rule means only the first relator on a given borrower recovers. The same transparency that made the cases easy to build turned them into a race.
| The PPP data-miner record | Figure |
|---|---|
| Largest single settlement in the set (BWI-related entities) | >\$21,600,000 |
| YAPP USA settlement (reported relator share \$1,420,849) | \$14,208,496 |
| Sidesolve / Empire Roofing settlement | \$9,000,000 |
| Share of pandemic-relief FCA settlements that were government-initiated | the large majority |
Sources: DOJ N.D. Tex., Sidesolve \$9M; Crowell & Moring, PPP FCA settlements.
That last row is the honest corrective to the highlight reel: the Justice Department reads the government-initiated majority as a sign that data-miner filings succeed less often than the splashy individual awards suggest. The dataset was clean; the litigation still had to be won.
By Noah Green CPA CFE, for Sheepdog Prosperity Partners. Educational only; not legal advice.
Primary sources: DOJ N.D. Tex., National roofing company settles PPP allegations for \$9M (Sidesolve) · Crowell & Moring, Recent PPP False Claims Act settlements (YAPP, Horn, Rosler) · Grubman Warner Berry, BWI \$21.6M PPP settlement (2025) · 31 U.S.C. § 3730 · ACFE, Fraud Tree
