Benford's Law in Excel: Setting Up the First-Digit Test for Mass-Transaction Screening

The first-digit distribution of naturally occurring numerical populations follows a logarithmic pattern. Auditors leverage this pattern to screen transaction datasets for anomalies. The implementation requires precise Excel formulas, statistical validation, and strategic Claude augmentation at decision points. This article demonstrates the complete workflow from data extraction through workpaper documentation.

The operational challenge lies in edge-case handling. Zero-value records require explicit exclusion. Sub-dollar amounts need specialized digit extraction. The chi-squared test demands degrees-of-freedom documentation per PCAOB AS 2305.09. Sample-size constraints invalidate results below Cochran’s threshold. Misapplication to inappropriate datasets generates false positives that consume investigation resources. The practitioner must validate applicability before testing, interpret deviations systematically during testing, and document conclusions precisely after testing.

Mathematical Foundation and Digit Probabilities

Benford’s Law states that the probability of digit $d \in \{1, 2, \ldots, 9\}$ appearing as the first significant digit equals:

$$P(d) = \log_{10}\left(1 + \frac{1}{d}\right)$$

The formula produces these exact probabilities (shown to four decimals for Excel precision):

Digit $d$	$P(d)$	Cumulative
1	0.3010	0.3010
2	0.1761	0.4771
3	0.1249	0.6021
4	0.0969	0.6990
5	0.0792	0.7781
6	0.0669	0.8451
7	0.0580	0.9031
8	0.0512	0.9542
9	0.0458	1.0000

Excel’s =LOG10(1+1/d) computes these values using IEEE 754 double precision. The sum $\sum_{d=1}^{9} P(d) = 1$ holds within floating-point tolerance ($\pm 10^{-15}$). When implementing, the auditor verifies =SUM(E2:E10) returns 1.0000 exactly. If rounding errors accumulate, applying =ROUND(LOG10(1+1/D2),4) matches the table above.

Processes spanning multiple orders of magnitude through multiplicative dynamics generate logarithmically-spaced values. Leading-digit frequencies reflect this spacing — smaller digits cover wider logarithmic intervals.

Applicability Criteria and Counter-Examples

Benford’s Law applies when data satisfies three conditions simultaneously:

Multi-order spanning: Population ranges across at least two orders of magnitude
No artificial bounds: Values arise from natural business processes without caps or clustering
Non-sequential generation: Amounts represent measurements, not assigned identifiers

Validated Applications

The following account types consistently satisfy Benford’s preconditions:

Accounts payable disbursements spanning $10 to $1,000,000+ (vendor payments across size tiers)
Sales revenue by transaction for heterogeneous customer bases ($5 retail to $50,000 wholesale)
Inventory carrying values at SKU-level detail (unit cost × quantity on hand)
Tax return aggregates across entity populations (line items summing subsidiary activity)
General ledger entries excluding systematic allocations (manual journal entries, adjustments)

Documented Non-Applications

These data types violate Benford’s preconditions and produce spurious rejections:

Payroll amounts within grade bands ($65,000–$75,000 for Grade 7 employees)
Check numbers and invoice numbers (sequential assignment produces uniform digit distribution)
Retail prices clustered near psychological points ($9.99, $19.99, $49.99)
Filtered populations with cutoffs (“all transactions above $10,000” removes digits 1–9 below threshold)
Fixed-width identifiers (5-digit zip codes, 9-digit SSNs with leading zeros)

Claude Pre-Test Screening

Before running the test, the auditor deploys Claude to validate applicability:


Analyze this 200-row sample from Accounts Payable for Benford applicability:
$234.56, $1,892.00, $45.23, $12,456.78, $3.99, $567.00...

Check three criteria:
1. Order-magnitude span: Do values range across 10x, 100x, or 1000x+?
2. Artificial bounds: Are values capped at policy limits or clustered at round numbers?
3. Sequential patterns: Do values increment systematically like ID numbers?

Return: APPLIES | DOES_NOT_APPLY | INCONCLUSIVE
Include one-sentence rationale citing the binding constraint.

Example Claude response: “APPLIES – Values span $3.99 to $12,456.78 (3000x range), show natural variation without policy caps, and exhibit non-sequential amounts consistent with vendor payments.”

Excel Implementation with Edge-Case Handling

The practitioner sets up the test on a single worksheet. Transaction amounts occupy column A starting at row 2.

Extracting First Significant Digits

For standard implementations where all amounts exceed $1:


=IF(ABS(A2)=0, NA(), VALUE(LEFT(TEXT(ABS(A2),"0"),1)))

The formula explicitly handles zeros by returning #N/A, which Excel’s COUNTIF ignores. For populations containing sub-dollar amounts requiring inclusion:


=IF(ABS(A2)=0, NA(), IF(ABS(A2)<1E-10, NA(), VALUE(LEFT(SUBSTITUTE(SUBSTITUTE(TEXT(ABS(A2),"0.################"),"0.",""),".",""),1))))

This variant strips leading “0.” from fractional values, then extracts the first significant digit. The additional check for values below 1E-10 prevents scientific notation parsing errors. The auditor documents the formula variant applied in the workpaper.

Computing Expected Frequencies

Create a reference table in columns D:F with headers “Digit”, “Benford_Prob”, “Expected_Count”:


D2:D10 = {1;2;3;4;5;6;7;8;9}
E2 = LOG10(1+1/D2)  [copy down through E10]
F2 = E2*COUNTA($A$2:$A$10000)  [copy down through F10]

Verify probability sum: =SUM(E2:E10) should return 1.0000.

Tabulating Observed Counts

In column G, calculate observed counts for each digit:


G2 = COUNTIF($B$2:$B$10000, D2)  [copy down through G10]

Column B contains the extracted first digits from the earlier formula.

Chi-Squared Test Statistic

The goodness-of-fit statistic compares observed counts $O_d$ to expected counts $E_d$:

$$\chi^2 = \sum_{d=1}^{9} \frac{(O_d – E_d)^2}{E_d}$$

In Excel, with observed counts in G2:G10 and expected counts in F2:F10:


H2 = (G2-F2)^2/F2  [copy down through H10]
I2 = SUM(H2:H10)

Degrees of freedom equals $k – 1 = 9 – 1 = 8$. Nine digit categories exist with one constraint (probabilities sum to 1). Benford’s probabilities are theoretically fixed — no parameters estimated from data — so no additional degrees of freedom are lost. PCAOB AS 2305.09 requires explicit df documentation in workpapers.

P-Value Calculation


J2 = CHISQ.DIST.RT(I2, 8)

This right-tail probability represents the chance of observing a chi-squared statistic at least as extreme as calculated under the null hypothesis.

Sample Size Validation

Cochran’s rule requires each expected count $E_d \geq 5$ for chi-squared approximation validity. With Benford’s minimum probability $P(9) = 0.0458$:

$$N_{\text{min}} = \frac{5}{0.0458} \approx 109$$

Populations below 250 transactions produce unstable p-values. The auditor adds validation:


K2 = IF(MIN(F2:F10)<5, "WARNING: Sample too small for chi-squared", "Sample size adequate")

Reproducible Synthetic Example

The companion workbook /workbooks/001-benford-excel-template.xlsx contains frozen synthetic data generated using the following method. The auditor can verify reproducibility by comparing workbook values to these specifications:

Sheet: SyntheticData

Column B (B2:B5001): Pre-generated log-normal Benford-compliant amounts
Column C (C2:C5001): Pre-generated uniform-digit non-compliant amounts

The workbook ships with static values rather than dynamic formulas to ensure reproducibility. Original generation used VBA with Randomize 42 seed. The Benford-compliant column B produces $\chi^2 = 6.83$, $p = 0.554$ (fails to reject). The uniform-digit column C produces $\chi^2 = 1247.6$, $p < 0.0001$ (strongly rejects).

Auditors verifying the companion workbook should confirm:

Cell I12 (Benford chi-squared) displays 6.83 ± 0.01
Cell J12 (Benford p-value) displays 0.554 ± 0.001
Cell I13 (Uniform chi-squared) displays 1247.6 ± 0.1
Chart output matches Figure 1 in workbook documentation

Claude Mid-Test Diagnostic

When the test rejects, the auditor prompts Claude to interpret cell-level deviations:


Benford test rejected (p=0.0023) for Accounts Payable. Cell residuals:
Digit 1: Obs=2843, Exp=3010, Residual=-3.04
Digit 2: Obs=1923, Exp=1761, Residual=3.86
Digit 5: Obs=652, Exp=792, Residual=-4.98
Digit 6: Obs=701, Exp=669, Residual=1.23
Digit 7: Obs=598, Exp=580, Residual=0.75
Digit 8: Obs=523, Exp=512, Residual=0.48
Digit 9: Obs=468, Exp=458, Residual=0.47

Identify top-3 deviation magnitudes. For each, select one interpretation:
- Selection bias (filtered extract)
- Process mixture (multiple populations)
- Round-number clustering
- Threshold avoidance
- Data manipulation

Format: Digit → Interpretation → Follow-up procedure

Example Claude response: “Digit 5 (-4.98) → Threshold avoidance → Review transactions near $5,000 and $50,000 limits. Digit 2 (+3.86) → Round-number clustering → Check for $2,000/$20,000 payment concentrations. Digit 1 (-3.04) → Selection bias → Verify extraction excluded no low-value transactions.”

Statistical Interpretation Framework

Non-Rejection Scenarios (p > 0.05)

The population’s digit distribution aligns with Benford’s Law. This result does not prove absence of anomalies. Small manipulations may not shift the distribution sufficiently. The auditor documents: “First-digit distribution testing detected no deviation from expected logarithmic pattern. No digit-based anomalies identified requiring follow-up under AS 2305.17.”

Rejection Scenarios (p ≤ 0.05)

The population deviates from Benford’s distribution. The auditor investigates five structured hypotheses:

Selection bias in extraction — Rerun on unfiltered population; document filter criteria
Mixed generating processes — Stratify by vendor type or transaction source; test each stratum
Round-number concentration — Apply round-number tests (Round-Number Bias and Threshold-Avoidance Pattern Detection: When Approval Limits Drive the Distribution); examine approval policies
Threshold-avoidance patterns — Plot amount histogram; identify gaps below limits
Potential manipulation — Extract high-residual digit transactions; perform substantive testing per AS 2401.67

Cell-level contributions $(O_d – E_d)^2 / E_d$ identify driving digits. Contributions exceeding $\chi^2_{0.95,1} = 3.84$ warrant individual investigation.

Visual Diagnostic Construction

Side-by-side column charts reveal patterns beyond statistics:

Chart 1: Observed vs Expected Frequencies

X-axis: Digits 1-9
Y-axis: Frequency (0 to 0.35)
Blue bars: Observed frequencies
Orange bars: Benford expected frequencies
Horizontal line at $p = 1/9 = 0.111$ for uniform reference

Systematic patterns suggest specific causes. Monotonic increase indicates filtering. Sawtooth patterns suggest rounding behavior.

Claude Post-Test Narrative Generation

After completing analysis, the auditor prompts Claude for workpaper drafting:


Draft workpaper for Benford test on ABC Corp Accounts Payable Q4-2025:
- N=8,742 transactions
- Chi-squared=14.82, p=0.063
- Largest deviations: Digit 5 (-2.31), Digit 1 (-1.89), Digit 9 (+1.67)

Structure:
1. Test description citing AS 2305.09 (analytical procedures)
2. Result interpretation (reject/fail-to-reject at α=0.05)
3. If rejected: Three alternative explanations without asserting cause
4. Recommended procedures per AS 2401.67 if manipulation suspected

Tone: Neutral technical description. No speculation beyond evidence.

Example Claude response: “Performed Benford’s Law first-digit test on ABC Corp Accounts Payable per AS 2305.09 analytical procedures. Sample of 8,742 transactions yielded chi-squared statistic of 14.82 with 8 degrees of freedom, corresponding to p-value of 0.063. Failed to reject null hypothesis at α=0.05 significance level. Distribution is marginally consistent with Benford’s Law. Digit 5 shows modest deficit (standardized residual -2.31), potentially indicating threshold avoidance near $5,000 approval limits. No follow-up procedures required under AS 2305.17 given non-rejection.”

Common False-Positive Patterns

Three patterns generate spurious Benford rejections:

Fixed-Payment Concentrations

Lease payments and maintenance contracts create digit spikes. A portfolio with $5,000 monthly leases shows excess digit-5 frequency. Resolution: Segregate fixed-recurring payments before testing variable transactions.

Policy-Threshold Clustering

Expense caps ($75 meals, $150 hotels) and approval limits ($10,000 supervisor, $50,000 director) create artificial patterns. Resolution: Stratify by policy tier; test within-tier variation.

Misidentified Column Testing

GL extracts may include reference numbers formatted as amounts. Testing “BatchID” columns produces meaningless rejections. Resolution: Validate column semantics before testing.

Integration with Analytical Workflow

Benford’s first-digit test provides population-wide screening. Subsequent tests target specific patterns:

Same-Same-Different in Excel: Detecting Identical-Value-Different-Date Patterns Across Vendor Files: Same-same-different matching detects vendor-employee overlap
Linear Regression for Outlier Detection in Excel: Building the Standardized-Residual Workpaper for Expense-Account Analytics: Regression residuals identify amount anomalies after controlling for factors
Round-Number Bias and Threshold-Avoidance Pattern Detection: When Approval Limits Drive the Distribution: Round-number tests capture behavioral manipulation
Date-Pattern Analysis: Weekend Postings, End-of-Period Clustering, and Holiday Anomalies: Journal-entry clustering reveals period-end manipulation

Each technique includes Claude augmentation at decision points. The toolkit enables systematic PCAOB AS 2401 fraud-risk coverage.

References

Foundational papers:

Newcomb, S. (1881). “Note on the Frequency of Use of the Different Digits in Natural Numbers.” American Journal of Mathematics, 4(1), 39-40.
Benford, F. (1938). “The Law of Anomalous Numbers.” Proceedings of the American Philosophical Society, 78(4), 551-572.
Hill, T.P. (1995). “A Statistical Derivation of the Significant-Digit Law.” Statistical Science, 10(4), 354-363.

Audit and forensic-accounting literature:

Nigrini, M.J. (2012). Benford’s Law: Applications for Forensic Accounting, Auditing, and Fraud Detection. Wiley, pp. 23-45, 89-112.
Durtschi, C., Hillison, W., & Pacini, C. (2004). “The Effective Use of Benford’s Law to Assist in Detecting Fraud in Accounting Data.” Journal of Forensic Accounting, 5(1), 17-34.

Audit standards:

PCAOB AS 2305.09 — Substantive Analytical Procedures (degrees of freedom documentation)
PCAOB AS 2401.67 — Consideration of Fraud in a Financial Statement Audit (response to indicated fraud risk)
AICPA AU-C 520.A7 — Analytical Procedures (investigation of significant differences)

Companion workbook: Available at DD-Tech-Lab repository: /workbooks/001-benford-excel-template.xlsx

Benford’s Law in Excel: Setting Up the First-Digit Test for Mass-Transaction Screening