The preceding articles in this sub-series modeled discrete-state sequences in discrete time. The implicit assumption, that transactions occur at evenly spaced moments, discards the most diagnostic feature of period-end audit work: the timing itself. A transaction stream that is uniform throughout the period but spikes in the last three business days carries information that a discrete-time Markov chain cannot represent without arbitrary time-bucketing. The same is true of inter-arrival times in customer-payment streams, vendor-disbursement clusters, and journal-entry posting density at quarter-end.
Continuous-time Markov chains (CTMCs) are the natural framework. State transitions occur at random times governed by exponential waiting-time distributions; the dynamics are summarized by a generator matrix $\mathbf{Q}$ rather than a transition matrix $\mathbf{P}$; the Poisson process is the canonical single-state example for transaction-arrival modeling. Non-homogeneous extensions (where the rate parameter varies with time) handle the close-cycle period-end spike directly. This article keeps the worked example on the Poisson special case because that is the most usable audit diagnostic, but it also shows where a genuine multi-state CTMC enters once the engagement needs to model transitions among posting regimes such as routine entries, estimates, and override-driven adjustments.
The framing aligns with PCAOB AS 2401 §60-67 (journal-entry timing analysis specifically targets period-end activity) and the AICPA Forensic Accounting Practice Aid for the period-end-spike forensic-investigation context.
The continuous-time Markov chain framework
A continuous-time Markov chain $\{X_t : t \geq 0\}$ on a finite state space $\mathcal{S}$ is characterized by the generator matrix $\mathbf{Q}$ of dimension $|\mathcal{S}| \times |\mathcal{S}|$ with the constraints:
$$Q_{ij} \geq 0 \quad (i \neq j), \qquad Q_{ii} = -\sum_{j \neq i} Q_{ij}$$
The off-diagonal $Q_{ij}$ is the rate (transitions per unit time) of $i \to j$ transitions. The diagonal $Q_{ii} = -\sum_{j \neq i} Q_{ij}$ is the negative of the total exit rate from state $i$.
Two consequences follow. First, the time spent in state $i$ before the next transition (the sojourn time) is exponentially distributed with rate $\lambda_i = -Q_{ii}$:
$$P(\text{sojourn time in } i > t) = e^{Q_{ii} t}$$
Second, conditional on a transition occurring out of state $i$, the destination state $j$ is sampled with probability $Q_{ij} / |Q_{ii}|$.
The transition-probability matrix at finite time $t$ relates to $\mathbf{Q}$ via the matrix exponential — the matrix-valued generalization of $e^{x}$ that converts a generator $\mathbf{Q}$ (rates) into a transition-probability matrix (probabilities). For scalars, $e^{at}$ tells you the multiplicative growth of a continuously-compounded process. For matrices, $e^{t\mathbf{Q}}$ does the same job in multiple dimensions simultaneously — it propagates the system’s state distribution forward by $t$ time units (Norris, 1997, §2.4):
$$\mathbf{P}(t) = e^{t \mathbf{Q}} = \sum_{n=0}^{\infty} \frac{(t \mathbf{Q})^n}{n!}$$
The audit team does not need to compute this by hand: scipy.linalg.expm computes it directly from $t\mathbf{Q}$. This is the conceptual bridge between continuous-time and discrete-time formulations: a CTMC observed at sampling interval $\Delta t$ is a discrete-time chain with transition matrix $\mathbf{P}(\Delta t) = e^{\Delta t \mathbf{Q}}$, and its $n$-step matrix is $\mathbf{P}(n \Delta t) = \mathbf{P}(\Delta t)^n$. The correspondence runs in one direction only: every sampled CTMC yields a valid discrete-time chain, but not every discrete-time transition matrix can be written as $e^{\Delta t \mathbf{Q}}$ for a valid generator (the embedding problem).
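The relationship between $\mathbf{Q}$ and $\mathbf{P}(t)$ can be checked numerically. The following is a minimal sketch, assuming a hypothetical two-state generator with illustrative rates (not engagement data); `expm_series` is a helper introduced here only to show the truncated power series next to scipy.linalg.expm.

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical two-state generator (rates per day); values are illustrative only.
Q = np.array([
    [-0.2, 0.2],
    [0.5, -0.5],
])

def expm_series(A: np.ndarray, n_terms: int = 30) -> np.ndarray:
    """Truncated power series sum_{n < n_terms} A^n / n!, for illustration only."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for n in range(1, n_terms):
        term = term @ A / n          # builds A^n / n! incrementally
        result = result + term
    return result

P_half = expm(0.5 * Q)
P_one = expm(1.0 * Q)

rows_sum_to_one = np.allclose(P_one.sum(axis=1), 1.0)      # P(t) is a stochastic matrix
series_matches = np.allclose(expm_series(1.0 * Q), P_one)  # series definition agrees with expm
semigroup_holds = np.allclose(P_half @ P_half, P_one)      # P(s + t) = P(s) P(t)
print(rows_sum_to_one, series_matches, semigroup_holds)
```

The truncated series is adequate here only because the entries of $t\mathbf{Q}$ are small; for larger rates or longer horizons the library routine is the safer choice.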
Minimal multi-state CTMC example
To make the generator matrix operational rather than decorative, consider a three-state posting-regime chain:
- routine: standard operational journal flow
- estimate: accrual / reserve / close-cycle estimate entries
- override: manual management adjustments requiring heightened review
import numpy as np
from scipy.linalg import expm

STATE_NAMES = ["routine", "estimate", "override"]

# Generator matrix: off-diagonal entries are transition rates per day,
# diagonal entries are the negative total exit rate of each state.
Q = np.array([
    [-0.45, 0.35, 0.10],   # routine exits mostly into estimate traffic, occasionally into overrides
    [0.30, -0.50, 0.20],   # estimate entries often resolve back to routine but can escalate
    [0.25, 0.35, -0.60],   # override has the highest exit rate, so it has the shortest expected dwell
])

# Every row of a valid generator sums to zero.
assert np.allclose(Q.sum(axis=1), 0.0)

P_1day = expm(Q * 1.0)
print("One-day transition matrix:")
print(np.round(P_1day, 3))
This is the point where the full CTMC machinery matters. The matrix exponential turns the generator into a one-day transition matrix, and the diagonal magnitudes determine expected time spent in each posting regime before the next jump. Concretely, the expected sojourn time in state $i$ before the next transition is $1 / |Q_{ii}|$, so for the routine state with $Q_{00} = -0.45$ (indexing states from zero, as in the code), the expected dwell time is $1 / 0.45 \approx 2.22$ days before the next category change; the estimate and override states dwell for $1 / 0.50 = 2.00$ and $1 / 0.60 \approx 1.67$ days respectively. These are the turnover speeds of the routine, estimate, and override posting regimes under normal conditions; an entity whose observed sojourn times systematically differ from the documented baseline is sending a regime-shift signal worth investigating. The Poisson spike diagnostic below is the simpler special case where only the arrival count process matters.
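The sojourn-time and jump-probability rules can be exercised by simulation. This is a minimal sketch, assuming the illustrative three-state generator from this section; `simulate_ctmc` is a hypothetical helper name, and the simulation draws an exponential dwell in the current state and then samples the destination with probability $Q_{ij} / |Q_{ii}|$.

```python
import numpy as np

STATE_NAMES = ["routine", "estimate", "override"]
Q = np.array([
    [-0.45, 0.35, 0.10],
    [0.30, -0.50, 0.20],
    [0.25, 0.35, -0.60],
])

def simulate_ctmc(Q: np.ndarray, start_state: int, n_jumps: int, seed: int = 0):
    """Simulate n_jumps sojourns: exponential dwell, then a jump chosen by Q_ij / |Q_ii|."""
    rng = np.random.default_rng(seed)
    exit_rates = -np.diag(Q)
    jump_probs = Q / exit_rates[:, None]
    np.fill_diagonal(jump_probs, 0.0)     # no self-jumps in the embedded chain
    states = np.empty(n_jumps, dtype=int)
    dwell = np.empty(n_jumps)
    state = start_state
    for k in range(n_jumps):
        states[k] = state
        dwell[k] = rng.exponential(1.0 / exit_rates[state])
        state = rng.choice(len(exit_rates), p=jump_probs[state])
    return states, dwell

states, dwell = simulate_ctmc(Q, start_state=0, n_jumps=20_000)
for i, name in enumerate(STATE_NAMES):
    print(f"{name}: simulated mean dwell {dwell[states == i].mean():.2f} days, "
          f"theoretical {1.0 / -Q[i, i]:.2f} days")
```

On real data the comparison would run the other way: observed dwell times per posting regime are compared against the baseline implied by the documented generator.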
The Kolmogorov forward and backward equations describe how $\mathbf{P}(t)$ evolves:
$$\frac{d\mathbf{P}(t)}{dt} = \mathbf{P}(t) \mathbf{Q} \quad \text{(forward)}, \qquad \frac{d\mathbf{P}(t)}{dt} = \mathbf{Q} \mathbf{P}(t) \quad \text{(backward)}$$
with initial condition $\mathbf{P}(0) = \mathbf{I}$.
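Both equations can be verified numerically with a finite-difference derivative. The following sketch assumes the illustrative three-state generator from the posting-regime example and an arbitrary evaluation time of two days; the step size `h` is a choice made here for the illustration.

```python
import numpy as np
from scipy.linalg import expm

Q = np.array([
    [-0.45, 0.35, 0.10],
    [0.30, -0.50, 0.20],
    [0.25, 0.35, -0.60],
])

t, h = 2.0, 1e-5
P_t = expm(t * Q)

# Central finite difference approximating dP(t)/dt.
dP_dt = (expm((t + h) * Q) - expm((t - h) * Q)) / (2.0 * h)

forward_ok = np.allclose(dP_dt, P_t @ Q, atol=1e-6)    # dP/dt = P(t) Q
backward_ok = np.allclose(dP_dt, Q @ P_t, atol=1e-6)   # dP/dt = Q P(t)
identity_ok = np.allclose(expm(0.0 * Q), np.eye(3))    # P(0) = I
print(forward_ok, backward_ok, identity_ok)
```

Both forms agree because $\mathbf{Q}$ commutes with its own matrix exponential.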
The Poisson process
The simplest CTMC is the Poisson process: a counting process $\{N_t : t \geq 0\}$ with $N_0 = 0$, independent increments, and increment distribution:
$$P(N_t - N_s = k) = \frac{(\lambda (t - s))^k e^{-\lambda (t - s)}}{k!} \quad \text{for } k = 0, 1, 2, \ldots$$
The single parameter $\lambda$ is the arrival rate (events per unit time). Inter-arrival times are independently exponentially distributed with mean $1/\lambda$. This is the canonical model for transaction arrivals when the rate is constant over time.
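The link between exponential inter-arrival times and Poisson counts can be illustrated by simulation. This is a minimal sketch, assuming a rate of 30 arrivals per day over a one-day horizon; `simulate_poisson_counts` is a hypothetical helper name introduced for the illustration.

```python
import numpy as np

def simulate_poisson_counts(rate: float, horizon: float, n_paths: int, seed: int = 0) -> np.ndarray:
    """Count arrivals in [0, horizon] on each path by accumulating Exponential(rate) gaps."""
    rng = np.random.default_rng(seed)
    mean_count = rate * horizon
    # Generous cap on events per path so the cumulative sum almost surely passes the horizon.
    max_events = int(mean_count + 10.0 * np.sqrt(mean_count) + 20)
    gaps = rng.exponential(scale=1.0 / rate, size=(n_paths, max_events))
    arrival_times = np.cumsum(gaps, axis=1)
    return (arrival_times <= horizon).sum(axis=1)

counts = simulate_poisson_counts(rate=30.0, horizon=1.0, n_paths=5_000)
print(f"mean count = {counts.mean():.2f}, variance = {counts.var(ddof=1):.2f}, lambda * t = 30.00")
```

For a Poisson count both the mean and the variance equal $\lambda t$, so the two printed statistics should sit close to 30.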
import numpy as np
import pandas as pd
from scipy.stats import expon, kstest, poisson


def fit_homogeneous_poisson(timestamps: np.ndarray) -> dict:
    """Fit a homogeneous Poisson process to transaction arrival times.

    Returns a dict with:
        lambda_hat: estimated rate (events per unit time)
        ks_statistic: Kolmogorov-Smirnov GOF statistic on inter-arrival times
        ks_p_value: p-value for the KS test against Exponential(lambda_hat)
    """
    sorted_t = np.sort(timestamps)
    inter_arrivals = np.diff(sorted_t)
    if inter_arrivals.size < 2:
        return {"warning": "insufficient_data", "n_observations": int(sorted_t.size)}

    # Maximum-likelihood rate estimate: reciprocal of the mean inter-arrival time.
    lambda_hat = float(1.0 / inter_arrivals.mean())

    # KS test: inter-arrivals should be Exponential(lambda_hat) under H0.
    # scipy parameterizes "expon" by (loc, scale), with scale = 1 / lambda.
    ks_stat, ks_p = kstest(inter_arrivals, "expon", args=(0, 1.0 / lambda_hat))

    return {
        "lambda_hat": lambda_hat,
        "n_arrivals": int(sorted_t.size),
        "mean_inter_arrival": float(inter_arrivals.mean()),
        "ks_statistic": float(ks_stat),
        "ks_p_value": float(ks_p),
        "reject_homogeneous_poisson": bool(ks_p < 0.05),
    }
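The KS-test interpretation: failure to reject is not strong evidence of Poisson fit; it is absence of detected lack-of-fit. With small samples the test is underpowered against modest departures from exponential inter-arrivals. Because $\hat{\lambda}$ is estimated from the same inter-arrivals being tested, the standard KS p-value is also conservative, which reduces power further.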
Non-homogeneous Poisson processes
When the arrival rate varies with time (e.g., the close-cycle period-end spike), the non-homogeneous Poisson process generalizes by replacing the constant rate $\lambda$ with a rate function $\lambda(t)$:
$$P(N_t - N_s = k) = \frac{(\Lambda(s, t))^k e^{-\Lambda(s, t)}}{k!}, \quad \Lambda(s, t) = \int_s^t \lambda(u) \, du$$
The cumulative-rate function $\Lambda(s, t)$ replaces the constant $\lambda(t - s)$ from the homogeneous case. A piecewise-constant intensity model partitions the time interval into segments and estimates $\lambda$ separately within each segment.
def fit_piecewise_constant_intensity(timestamps: np.ndarray, breakpoints: list[float]) -> dict:
    """Fit a piecewise-constant intensity Poisson process.

    breakpoints: list of segment boundaries (e.g., [0, 25, 30] for a 30-day period
    split into mid-period [0, 25] and close-cycle [25, 30]).

    Returns per-segment rate estimates.
    """
    segments = []
    for i in range(len(breakpoints) - 1):
        start, end = breakpoints[i], breakpoints[i + 1]
        in_segment = timestamps[(timestamps >= start) & (timestamps < end)]
        duration = end - start
        rate = float(in_segment.size / duration) if duration > 0 else 0.0
        segments.append({
            "segment": (start, end),
            "duration": float(duration),
            "n_arrivals": int(in_segment.size),
            "rate_per_unit_time": rate,
        })
    return {"segments": segments, "n_segments": len(segments)}
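For a piecewise-constant intensity the integral defining $\Lambda(s, t)$ reduces to a sum of rate times overlap length. The following is a minimal sketch, assuming the rates used in this article's synthetic example (30 per day, rising to 120 per day in the last 3 of 90 days); `cumulative_intensity` is a hypothetical helper name.

```python
import numpy as np

def cumulative_intensity(s: float, t: float, breakpoints, rates) -> float:
    """Lambda(s, t) for a piecewise-constant rate: sum over segments of rate * overlap length."""
    edges = np.asarray(breakpoints, dtype=float)
    seg_rates = np.asarray(rates, dtype=float)
    # Length of the overlap between [s, t] and each segment [edges[i], edges[i + 1]].
    overlap = np.clip(np.minimum(t, edges[1:]) - np.maximum(s, edges[:-1]), 0.0, None)
    return float((seg_rates * overlap).sum())

breakpoints = [0.0, 87.0, 90.0]
rates = [30.0, 120.0]

expected_total = cumulative_intensity(0.0, 90.0, breakpoints, rates)    # 30 * 87 + 120 * 3
expected_last_5 = cumulative_intensity(85.0, 90.0, breakpoints, rates)  # 30 * 2 + 120 * 3
print(expected_total, expected_last_5)
```

The value of $\Lambda(s, t)$ is the expected number of arrivals in $[s, t]$, so it can be compared directly with observed counts over any window, not only the fitted segments.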
The end-of-period spike diagnostic
The most operationally useful single-state CTMC application in audit is the end-of-period spike diagnostic: under the null hypothesis that arrivals follow a homogeneous Poisson process across the full period, the fraction of transactions in the last $w$ days (the spike window) should equal $w / T$ (where $T$ is total period length). Reject if observed fraction exceeds the upper-tail of the binomial distribution.
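The binomial form of the null follows from a standard property of the homogeneous Poisson process, sketched here with $K$ (the count in the spike window), $n$ (the period total), and $k_{\text{obs}}$ (the observed window count) as notation introduced for this derivation. Conditional on $N_T = n$, the $n$ arrival times are distributed as independent uniform draws on $[0, T]$, so each lands in the last $w$ days with probability $w/T$:

$$K \mid N_T = n \;\sim\; \text{Binomial}\!\left(n, \frac{w}{T}\right), \qquad P(K \geq k_{\text{obs}}) = \sum_{k = k_{\text{obs}}}^{n} \binom{n}{k} \left(\frac{w}{T}\right)^{k} \left(1 - \frac{w}{T}\right)^{n-k}$$

For a 90-day period with a 3-day window, $w/T = 1/30 \approx 0.033$, so a stream of 2,970 transactions has an expected window count of $2970 / 30 = 99$ under the null. The upper-tail probability is the one-sided p-value of the test.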
from scipy.stats import binomtest


def end_of_period_spike_diagnostic(timestamps: np.ndarray, period_start: float,
                                   period_end: float, spike_window: float = 3.0,
                                   alpha: float = 0.05) -> dict:
    """Test whether the last spike_window days of the period concentrate
    disproportionately many transactions vs. uniform-rate expectation.

    Under H0 (homogeneous Poisson over full period), the fraction of transactions
    in the spike window equals spike_window / period_length.
    """
    period_length = period_end - period_start
    if period_length <= 0 or spike_window > period_length:
        return {"warning": "invalid_window", "period_length": period_length}

    in_window = int(((timestamps >= period_end - spike_window) & (timestamps <= period_end)).sum())
    total = int(((timestamps >= period_start) & (timestamps <= period_end)).sum())
    if total == 0:
        return {"warning": "no_transactions_in_period", "n_transactions": 0}

    expected_fraction = spike_window / period_length
    observed_fraction = in_window / total
    test_result = binomtest(in_window, total, expected_fraction, alternative="greater")

    return {
        "n_transactions_total": total,
        "n_transactions_in_spike_window": in_window,
        "spike_window_days": spike_window,
        "period_length_days": period_length,
        "expected_fraction": expected_fraction,
        "observed_fraction": observed_fraction,
        "binomial_test_statistic": float(in_window),
        "binomial_test_p_value": float(test_result.pvalue),
        "spike_detected": bool(test_result.pvalue < alpha),
    }
Worked example: synthetic 90-day transaction stream
The companion code generates a synthetic 90-day transaction-timestamp stream with a normal mid-period rate and an injected three-day end-of-period spike at 4× the baseline rate.
PERIOD_LENGTH_DAYS = 90
BASELINE_RATE_PER_DAY = 30.0
SPIKE_WINDOW_DAYS = 3
SPIKE_RATE_MULTIPLIER = 4.0


def generate_transaction_stream_with_spike(period_length: float, baseline_rate: float,
                                           spike_window: float, spike_multiplier: float,
                                           seed: int = 42) -> np.ndarray:
    """Synthetic Poisson arrivals: baseline rate over [0, period_length - spike_window),
    then spike_multiplier * baseline_rate over [period_length - spike_window, period_length]."""
    rng = np.random.default_rng(seed)

    # Baseline segment. Pass the continuous-rate expectation (rate * duration) directly
    # as the Poisson lambda; do NOT coerce to int beforehand because that biases the
    # count distribution by truncating the rate magnitude before sampling.
    baseline_duration = period_length - spike_window
    n_baseline = int(rng.poisson(baseline_rate * baseline_duration))
    baseline_arrivals = rng.uniform(0, baseline_duration, size=n_baseline)

    # Spike segment, same parameterization discipline: lambda = rate * duration.
    spike_rate = baseline_rate * spike_multiplier
    n_spike = int(rng.poisson(spike_rate * spike_window))
    spike_arrivals = rng.uniform(period_length - spike_window, period_length, size=n_spike)

    return np.sort(np.concatenate([baseline_arrivals, spike_arrivals]))


# Generate stream
timestamps = generate_transaction_stream_with_spike(
    PERIOD_LENGTH_DAYS, BASELINE_RATE_PER_DAY, SPIKE_WINDOW_DAYS, SPIKE_RATE_MULTIPLIER
)
print(f"Generated {len(timestamps)} synthetic transactions over {PERIOD_LENGTH_DAYS} days")

# Fit homogeneous Poisson and check goodness of fit
hpp_result = fit_homogeneous_poisson(timestamps)
print(f"\nHomogeneous Poisson fit: λ_hat = {hpp_result['lambda_hat']:.3f} per day")
print(f"  KS test p-value: {hpp_result['ks_p_value']:.6f}")
print(f"  Reject homogeneous: {hpp_result['reject_homogeneous_poisson']}")

# Fit piecewise-constant intensity (mid-period vs spike segment)
nhpp_result = fit_piecewise_constant_intensity(
    timestamps, breakpoints=[0, PERIOD_LENGTH_DAYS - SPIKE_WINDOW_DAYS, PERIOD_LENGTH_DAYS]
)
print("\nPiecewise-constant intensity:")
for seg in nhpp_result["segments"]:
    print(f"  Segment {seg['segment']}: {seg['n_arrivals']} arrivals, "
          f"rate = {seg['rate_per_unit_time']:.3f}/day")

# End-of-period spike diagnostic
spike_result = end_of_period_spike_diagnostic(
    timestamps, period_start=0.0, period_end=PERIOD_LENGTH_DAYS,
    spike_window=SPIKE_WINDOW_DAYS, alpha=0.05
)
print("\nEnd-of-period spike diagnostic:")
print(f"  Total transactions: {spike_result['n_transactions_total']}")
print(f"  In last {SPIKE_WINDOW_DAYS} days: {spike_result['n_transactions_in_spike_window']}")
print(f"  Expected fraction (uniform-rate null): {spike_result['expected_fraction']:.4f}")
print(f"  Observed fraction: {spike_result['observed_fraction']:.4f}")
print(f"  Binomial test p-value: {spike_result['binomial_test_p_value']:.6e}")
print(f"  Spike detected: {spike_result['spike_detected']}")
With seed=42 the run is reproducible. In approximate terms, the output shows:
- Generates ~2,610 baseline transactions (Poisson(30 × 87) ≈ 2,610) and ~360 spike transactions (Poisson(120 × 3) ≈ 360); total ≈ 2,970
- The KS test against homogeneous Poisson rejects strongly: inter-arrival distribution does not match the constant-rate exponential
- Piecewise-constant intensity correctly identifies the rate ratio (~4× elevated in the spike window)
- The end-of-period spike binomial test rejects decisively: the spike window contains ~12% of transactions vs. expected ~3.3%
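The diagnostic correctly flags the injected spike. On a clean baseline (no injected spike), the KS and binomial tests are each expected to fail to reject in most runs (a false-positive rate of at most about 5% at the chosen level), and the two piecewise rate estimates agree to within sampling noise, consistent with the homogeneous-Poisson null.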
Workpaper mapping: from spike rejection to substantive procedure
When the spike diagnostic rejects on an entity-period, the audit team’s escalation maps to specific PCAOB AS 2401 procedures and workpaper assertions:
- Occurrence and cutoff (AS 2401 §60). Select the specific transactions posted in the spike window. For each, inspect the underlying source document (shipping document, customer acknowledgment, vendor invoice, GRN) and confirm the earnings process or expense-incurrence event was complete on or before the recorded posting date.
- Existence (AS 2401 §A.5). When the spike-window concentration falls predominantly in revenue accounts, confirm receivables with a sample of customers tied to the spike-window invoices; the spike-window subset is overweighted in the confirmation sample relative to its population proportion.
- Management override of controls (AS 2401 §11). When the spike-window concentration is materially higher than the entity’s own historical pattern and the affected transactions disproportionately involve estimates or judgment-based postings (versus routine system-generated entries), escalate to management-override testing: inquire of the controller about the close-cycle review trail, inspect journal-entry approval evidence, and test the entity’s period-end reconciliation control.
The diagnostic does not by itself identify which transactions are misstated; it concentrates substantive-testing scope on the transactions where the prior probability of misstatement is materially elevated.
Calibration against entity-specific close-cycle baseline
The end-of-period spike diagnostic above compares observed activity against a uniform-rate null. For most production entities, this generates false positives — every entity has a legitimate close-cycle structural shift in the last 3-5 business days, and the diagnostic flags it as anomalous against the generic null.
The right calibration uses the entity’s own historical close-cycle pattern as the null. Compute the historical fraction of transactions in the last $w$ days across $K$ prior periods; use that empirical fraction (with its standard error) as the comparison baseline rather than the generic $w/T$. The diagnostic then flags only entities whose current close-cycle pattern is anomalous relative to their own history.
def historical_close_cycle_baseline(historical_period_data: list[dict]) -> dict:
    """Compute mean and stderr of close-cycle fraction across K prior periods.

    historical_period_data: list of {timestamps: array, period_start: float,
    period_end: float, spike_window: float} dicts.
    """
    fractions = []
    for period in historical_period_data:
        diag = end_of_period_spike_diagnostic(
            period["timestamps"], period["period_start"], period["period_end"],
            period["spike_window"]
        )
        if "observed_fraction" in diag:
            fractions.append(diag["observed_fraction"])
        else:
            # Production artifact should log the dropped period; the companion script prints a count.
            continue

    fractions = np.array(fractions)
    if fractions.size == 0:
        # No usable prior period: return a warning instead of taking the mean of an empty array.
        return {"warning": "no_usable_periods", "n_historical_periods": 0}

    return {
        "n_historical_periods": len(fractions),
        "mean_close_cycle_fraction": float(fractions.mean()),
        "std_close_cycle_fraction": float(fractions.std(ddof=1)) if len(fractions) > 1 else 0.0,
        "stderr": float(fractions.std(ddof=1) / np.sqrt(len(fractions))) if len(fractions) > 1 else 0.0,
    }
The current-period close-cycle fraction is then tested against the historical mean using a t-test or — more conservatively — an empirical-quantile test against the historical distribution. Only periods whose close-cycle fraction exceeds (say) the historical 95th percentile are flagged for investigation.
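The empirical-quantile comparison can be sketched as follows. This is a minimal illustration, not the companion script's implementation: `flag_against_history` is a hypothetical helper name, and the eight historical fractions are made-up values standing in for prior quarters.

```python
import numpy as np

def flag_against_history(current_fraction: float, historical_fractions, quantile: float = 0.95) -> dict:
    """Flag the current close-cycle fraction if it exceeds the historical empirical quantile."""
    hist = np.asarray(historical_fractions, dtype=float)
    if hist.size < 2:
        return {"warning": "insufficient_history", "n_historical_periods": int(hist.size)}
    threshold = float(np.quantile(hist, quantile))
    std = float(hist.std(ddof=1))
    # z-score against the historical mean, reported alongside the quantile flag.
    z_score = float((current_fraction - hist.mean()) / std) if std > 0 else float("nan")
    return {
        "n_historical_periods": int(hist.size),
        "threshold": threshold,
        "z_score": z_score,
        "flagged": bool(current_fraction > threshold),
    }

# Hypothetical close-cycle fractions from eight prior quarters (illustrative values only).
history = [0.071, 0.068, 0.075, 0.080, 0.066, 0.073, 0.078, 0.070]

result_normal = flag_against_history(0.074, history)
result_spike = flag_against_history(0.121, history)
print(result_normal["flagged"], result_spike["flagged"])
```

With only a handful of prior periods the 95th percentile is essentially an interpolation near the historical maximum, so the flag is best read as "above anything seen before" until more history accumulates.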
Reference points in the published prosecution record
End-of-period transaction-timing manipulation is one of the most frequently-prosecuted financial-statement-fraud patterns. Three reference points illustrate the audit relevance of the methodology:
- Lehman Brothers — Repo 105 (2010). Examiner’s Report of Anton R. Valukas in In re Lehman Brothers Holdings Inc., et al., U.S. Bankruptcy Court for the Southern District of New York, Case No. 08-13555 (March 2010 examiner’s report); subsequent SEC enforcement and 2012 settlement against Ernst & Young auditors. The Repo 105 transactions concentrated heavily in the final business days of quarter-end and year-end — exactly the spike pattern the end-of-period diagnostic in this article detects. The Valukas report documents the timing concentration in technical detail; the methodology in this article would have flagged it as a sustained anomalous spike across multiple reporting periods.
- Wirecard AG (2020). Munich I Public Prosecutor’s Office criminal proceedings against Markus Braun and co-defendants commenced December 2022 at the Munich Regional Court (Landgericht München I); BaFin administrative enforcement; KPMG special-audit report April 2020. Wirecard’s purported “escrow” balances were concentrated in quarter-end disclosures with a sharp drop-off in mid-period activity, a timing pattern the non-homogeneous Poisson framework in this article identifies directly when applied to disclosed-balance refresh events.
- Enron — quarter-end SPE transactions (2001-2002). United States v. Skilling, U.S. District Court for the Southern District of Texas, conviction May 2006 (subsequently affirmed in part at Skilling v. United States, 561 U.S. 358 (2010)); SEC parallel civil enforcement through 2008. Enron’s special-purpose-entity transactions clustered in the final days of each quarter to manage reported leverage; the end-of-period spike diagnostic applied to SPE-transaction timestamps would have produced sustained quarter-on-quarter rejections.
In each case, the audit team’s workpaper trail for transaction-timing diagnostics — even if those diagnostics did not catch the fraud in real time — would have provided documented evidence that the timing pattern was investigated and (incorrectly) explained. The methodology in this article supplies the documentary record that PCAOB AS 2401 §60-67 audits require even when fraud is not present.
Compound Poisson processes
When each transaction arrival carries a random magnitude (transaction amount), the compound Poisson process generalizes:
$$Y_t = \sum_{i=1}^{N_t} Z_i$$
where $\{N_t\}$ is a Poisson process and $\{Z_i\}$ is an i.i.d. sequence of jump magnitudes. The total amount $Y_t$ over period $[0, t]$ has expected value $E[Y_t] = \lambda t \cdot E[Z]$ and variance $\text{Var}(Y_t) = \lambda t \cdot E[Z^2]$.
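The two moment formulas can be checked by Monte Carlo. The sketch below assumes lognormal transaction amounts with illustrative parameters; the amount distribution is an assumption made here for the example, not something the compound Poisson model prescribes.

```python
import numpy as np

rng = np.random.default_rng(0)
rate, horizon, n_paths = 30.0, 1.0, 20_000

# Assumption: lognormal amounts Z with illustrative parameters mu and sigma.
mu, sigma = 6.0, 0.5

# Number of arrivals on each simulated path, then one amount per arrival.
counts = rng.poisson(rate * horizon, size=n_paths)
amounts = rng.lognormal(mean=mu, sigma=sigma, size=int(counts.sum()))

# Sum the amounts belonging to each path to get Y_t per path.
path_index = np.repeat(np.arange(n_paths), counts)
totals = np.bincount(path_index, weights=amounts, minlength=n_paths)

# Closed-form lognormal moments: E[Z] and E[Z^2].
e_z = np.exp(mu + sigma**2 / 2.0)
e_z2 = np.exp(2.0 * mu + 2.0 * sigma**2)
theory_mean = rate * horizon * e_z
theory_var = rate * horizon * e_z2

print(f"mean: simulated {totals.mean():.0f}, theory {theory_mean:.0f}")
print(f"variance: simulated {totals.var(ddof=1):.3e}, theory {theory_var:.3e}")
```

Note that the variance uses the second raw moment $E[Z^2]$, not $\text{Var}(Z)$: randomness in the number of arrivals contributes even when every amount is identical.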
For audit applications, compound Poisson framing combines arrival-time analysis (this article) with amount-distribution analysis (the Benford application in Two-Stage Screening: Benford's Law as a Stationary Distribution Combined With First-Order Markov Tests) into a unified diagnostic. The technical machinery is largely the same as homogeneous Poisson with the added jump-size distribution; deploy when both timing and amount information are diagnostic.
Failure modes and defenses
Three patterns recur in CTMC and Poisson-process deployment to audit settings.
Calibration against entity-specific baseline (revisited). The diagnostic loses operational value if calibrated against a generic null. Every entity has a real close-cycle pattern; flagging it on every entity is noise. Defense: always calibrate against the entity’s own prior periods, and report results as deviations from the entity-specific historical pattern.
Independent-arrivals assumption. Real transaction streams are not independent. End-of-day batching, payment-system cutoffs, and ERP-system upload schedules all produce arrival clustering at sub-day timescales that violate Poisson independence. Defense: aggregate to a coarser timescale (daily, hourly) before applying Poisson methods; or use Hawkes (self-exciting) point processes that explicitly model clustering.
Time-zone and business-day boundary issues. Transaction timestamps are typically recorded in the ERP system’s local time zone but compared against close-cycle dates that may be defined in a different time zone (parent-company close calendar). Defense: normalize all timestamps to a single reference time zone before analysis, and explicitly document the time zone in the workpaper.
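The time-zone defense can be sketched with the standard library. This is a minimal illustration under loud assumptions: the two UTC offsets, the posting time, and the helper name `to_close_calendar` are all hypothetical, and fixed offsets are used for brevity where production code would use IANA zone names so that daylight-saving transitions are handled.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC offsets for brevity (no daylight-saving handling).
ERP_TZ = timezone(timedelta(hours=-6))    # hypothetical subsidiary ERP local time
CLOSE_TZ = timezone(timedelta(hours=1))   # hypothetical parent-company close calendar

def to_close_calendar(naive_local: datetime) -> datetime:
    """Attach the ERP zone to a naive timestamp, then convert to the close-calendar zone."""
    return naive_local.replace(tzinfo=ERP_TZ).astimezone(CLOSE_TZ)

# An entry posted at 20:30 local ERP time on the last day of the quarter.
posted = datetime(2024, 3, 31, 20, 30)
normalized = to_close_calendar(posted)

period_end = datetime(2024, 3, 31, 23, 59, 59, tzinfo=CLOSE_TZ)
posted_after_close = normalized > period_end
print(normalized.isoformat(), posted_after_close)
```

In this example an entry that looks like a same-day posting in ERP local time falls on the following calendar day in the parent company's close calendar, which moves it out of the spike window entirely.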
When the spike diagnostic is compelling — and when a Hawkes alternative is required
The end-of-period spike diagnostic carries audit weight under one condition: the underlying transaction-generation process is approximately Poisson at the relevant timescale, meaning arrivals can be treated as independent within the period. When that assumption holds — which is approximately true for most ERP-generated mid-size-entity transaction streams at daily aggregation — an elevated last-$w$-days concentration relative to the entity’s own historical pattern is statistically meaningful evidence that warrants scoped substantive testing.
When the assumption fails — when transaction arrivals exhibit self-excitation (one large posting triggering a cascade of related entries within minutes, or end-of-day batching producing clustered timestamps that aren’t truly independent) — the spike diagnostic over-rejects on clean entities and under-rejects on entities whose anomalous activity is itself clustered. The right escalation in that case is a Hawkes self-exciting point process: model the conditional intensity $\lambda(t \mid \mathcal{H}_{t})$ directly, fit on the entity’s prior periods, and use the deviation between observed and expected clustering structure as the diagnostic.
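One common parameterization of the conditional intensity uses an exponential kernel, $\lambda(t \mid \mathcal{H}_t) = \mu + \sum_{t_i < t} \alpha e^{-\beta (t - t_i)}$. The sketch below evaluates that form; the kernel choice, the parameter values, and the helper name `hawkes_intensity` are assumptions made for illustration, and fitting the parameters to prior-period data is not shown.

```python
import numpy as np

def hawkes_intensity(t: float, history, mu: float, alpha: float, beta: float) -> float:
    """Exponential-kernel Hawkes intensity: mu + sum over past events of alpha * exp(-beta * (t - t_i))."""
    past = np.asarray(history, dtype=float)
    past = past[past < t]                 # only events strictly before t contribute
    return float(mu + alpha * np.exp(-beta * (t - past)).sum())

# Illustrative parameters (time in days): baseline 30/day, each event adds 20/day,
# decaying at rate 50/day. The branching ratio alpha / beta = 0.4 is below 1 (stable process).
mu, alpha, beta = 30.0, 20.0, 50.0

# Three postings in quick succession on day 10.
history = [10.00, 10.01, 10.02]

intensity_burst = hawkes_intensity(10.03, history, mu, alpha, beta)   # shortly after the burst
intensity_quiet = hawkes_intensity(12.00, history, mu, alpha, beta)   # two days later
print(f"{intensity_burst:.2f} {intensity_quiet:.2f}")
```

Immediately after the burst the intensity is well above baseline, and it decays back toward $\mu$ as the excitation fades; a Poisson model would hold the intensity at $\mu$ throughout.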
The operational rule for the audit team: deploy the Poisson spike diagnostic by default at daily aggregation, and escalate to Hawkes only when the entity has documented batching practices (high-frequency trading desks, payment-system cutoffs that concentrate posting at specific intra-day moments) that materially violate the independence assumption. At daily granularity for most engagements, the simpler diagnostic is calibrated and the more complex one offers no incremental signal.
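A quick screen for whether daily aggregation has removed the clustering is the dispersion index of daily counts (variance divided by mean), which is close to 1 for a homogeneous Poisson stream. The following is a sketch on synthetic data; `daily_dispersion_index` is a hypothetical helper name, and the batching pattern (uploads of five entries within about a minute) is an assumption chosen for the illustration.

```python
import numpy as np

def daily_dispersion_index(timestamps: np.ndarray, period_start: float, period_end: float) -> float:
    """Variance-to-mean ratio of daily counts (timestamps measured in days)."""
    n_days = int(np.ceil(period_end - period_start))
    edges = period_start + np.arange(n_days + 1)
    counts, _ = np.histogram(timestamps, bins=edges)
    mean = counts.mean()
    return float(counts.var(ddof=1) / mean) if mean > 0 else float("nan")

rng = np.random.default_rng(0)
T = 90.0

# Homogeneous stream: about 30 independent arrivals per day.
independent = rng.uniform(0.0, T, size=rng.poisson(30.0 * T))

# Batched stream (synthetic assumption): about 6 upload events per day, each posting
# 5 entries within roughly a minute, so the average rate is still about 30 per day.
batch_times = rng.uniform(0.0, T, size=rng.poisson(6.0 * T))
batched = np.repeat(batch_times, 5) + rng.uniform(0.0, 0.001, size=batch_times.size * 5)

index_independent = daily_dispersion_index(independent, 0.0, T)
index_batched = daily_dispersion_index(batched, 0.0, T)
print(f"independent: {index_independent:.2f}, batched: {index_batched:.2f}")
```

A rate shift such as a close-cycle spike also inflates this index, so it is best computed on mid-period days only when the aim is to detect batching rather than the spike itself.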
Authority:
CTMC and Poisson-process theory:
- Norris, J.R. (1997). Markov Chains. Cambridge Series in Statistical and Probabilistic Mathematics, Ch. 2-3 (continuous-time framework).
- Ross, S.M. (2019). Introduction to Probability Models (12th ed.). Academic Press, Ch. 5 (Poisson processes), Ch. 6 (continuous-time Markov chains).
- Çinlar, E. (1975). Introduction to Stochastic Processes. Prentice-Hall.
- Daley, D.J., & Vere-Jones, D. (2003). An Introduction to the Theory of Point Processes, Volume I (2nd ed.). Springer.
- Cox, D.R., & Lewis, P.A.W. (1966). The Statistical Analysis of Series of Events. Methuen. (Classical reference for arrival-time goodness-of-fit testing.)
- Hawkes, A.G. (1971). “Spectra of Some Self-Exciting and Mutually Exciting Point Processes.” Biometrika, 58(1), 83-90. (Self-exciting point processes for the clustering case.)
Audit standards and forensic literature:
- PCAOB AS 2401 — Consideration of Fraud in a Financial Statement Audit, §60-67 (journal-entry timing analysis).
- AICPA. Forensic Accounting Practice Aid (period-end-spike forensic-investigation framing).
Companion code on GitHub
Runnable Python artifact reproducing this article’s worked example end-to-end under seed=42: stochastic_markov/010_continuous_time_markov.py in noahrgreen/dd-tech-lab-companion.
Clone the repo and run with python stochastic_markov/010_continuous_time_markov.py.
