Markov Mixture Models for Round-Tripping and Lapping Detection

Round-tripping and lapping share a structural feature that breaks both the first-order Markov framework of “First-Order Markov Modeling for Transaction-Stream Analysis in Audit” and the two-regime Hidden Markov Model of “Hidden Markov Models for Earnings-Management Regime Detection in Public-Company Financials.” The transaction population is heterogeneous: most counterparties (or customer accounts, or transaction days) follow ordinary business-cycle dynamics, while a small subset follows a different generative process, one that produces cycles, lapping chains, or amount-matching patterns. Fitting a single transition matrix to the whole population averages the two processes together and dilutes both signals. Fitting a single HMM treats heterogeneity as a temporal property of one sequence, which it isn’t.

The right framework is a finite mixture of Markov chains: model the population as $K$ latent components, each with its own transition matrix, and let the data assign each counterparty (or account, or transaction day) a posterior probability of belonging to each component. Components that emerge with cyclic transition structure correspond to round-tripping candidates. Components with strict-sequencing structure across customer accounts correspond to lapping candidates. The posterior probabilities convert a diffuse anomaly screen into a review queue the engagement team can actually scope.

This framing aligns with PCAOB AS 2410 (Related Parties) for the round-tripping context and AS 2401 §A.5 for the lapping reference.

The mixture model

Let $\{s_i\}_{i=1}^N$ denote $N$ observed sequences (one per counterparty), each $s_i = (s_{i,1}, s_{i,2}, \ldots, s_{i,T_i})$ a discrete-state Markov sequence. The mixture-of-Markov-chains model assumes each sequence is drawn from one of $K$ latent components:

$$P(s_i \mid \theta) = \sum_{k=1}^K \pi_k \cdot P(s_i \mid P^{(k)})$$

where $\pi_k$ is the prior probability of component $k$ ($\sum_k \pi_k = 1$) and $P^{(k)}$ is the transition matrix specific to component $k$. The likelihood under component $k$ factorizes via the Markov property:

$$P(s_i \mid P^{(k)}) = P(s_{i,1}) \prod_{t=2}^{T_i} P^{(k)}_{s_{i,t-1}, s_{i,t}}$$

The full parameter set is $\theta = (\pi, \{P^{(k)}\}_{k=1}^K)$ with $K(|S|^2 - |S|) + (K-1)$ free parameters.
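
As a quick check on the parameter count, here is a minimal sketch that evaluates the formula for the five-state encodings used later in the article. The helper name `n_free_parameters` is illustrative, not part of the companion code, and it follows the text in leaving initial-state parameters uncounted.

```python
def n_free_parameters(n_states: int, K: int) -> int:
    """Free parameters in a K-component mixture of first-order Markov chains.

    Each transition-matrix row has n_states entries that must sum to 1, so
    n_states - 1 of them are free. The K mixture weights lose one degree of
    freedom to their own sum-to-one constraint. Initial-state parameters are
    not counted, matching the formula in the text.
    """
    per_matrix = n_states * (n_states - 1)
    return K * per_matrix + (K - 1)


# Five states, K = 1, 2, 3 components
for K in (1, 2, 3):
    print(K, n_free_parameters(5, K))
```

With five states the count grows by 21 for every added component (20 transition entries plus one mixture weight), which is the quantity the BIC penalty later multiplies by $\log N$.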

EM estimation

The latent component assignments $z_i \in \{1, \ldots, K\}$ are unobserved — for each counterparty we don’t know which of the $K$ types they belong to. Expectation-Maximization (Dempster, Laird & Rubin, 1977) iterates between two steps that converge to a locally-best guess. The E-step computes, for each counterparty, the probability that they belong to each component given the current parameter estimates. The M-step then updates the parameters as if those probabilities were the truth, weighted appropriately. Each iteration is guaranteed to improve the overall fit (no worse than the prior iteration), and the loop continues until improvement stalls.

E-step. Compute the responsibility $\gamma_{i,k}$ — the probability that counterparty $i$ belongs to component $k$, given the data and the current parameter guess. The audit interpretation: a $\gamma_{i, \text{dirty}}$ of 0.93 means the model is 93% confident this counterparty’s posting pattern looks like the round-tripping component. The formula is just Bayes’ rule applied to the mixture:

$$\gamma_{i,k} = \frac{\pi_k^{(\text{old})} \cdot P(s_i \mid P^{(k), \text{old}})}{\sum_{j=1}^K \pi_j^{(\text{old})} \cdot P(s_i \mid P^{(j), \text{old}})}$$

M-step. Re-estimate parameters via responsibility-weighted maximum likelihood:

$$\hat{\pi}_k = \frac{1}{N} \sum_{i=1}^N \gamma_{i,k}, \qquad \hat{P}^{(k)}_{ab} = \frac{\sum_{i=1}^N \gamma_{i,k} \cdot N_i(a, b)}{\sum_{i=1}^N \gamma_{i,k} \cdot N_i(a, \cdot)}$$

where $N_i(a, b)$ is the count of $a \to b$ transitions in sequence $i$ and $N_i(a, \cdot) = \sum_b N_i(a, b)$.
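
To make the two update formulas concrete, the following is a small sketch of one E-step and one M-step on a toy problem. Everything in it is hypothetical: two states, two hand-picked components (one "sticky", one "alternating"), and two short sequences. It is meant only to show the shapes and the arithmetic, not to stand in for the full estimator.

```python
import numpy as np

def counts(seq, n_states):
    """Transition-count matrix N[a, b] for one sequence."""
    N = np.zeros((n_states, n_states))
    for a, b in zip(seq[:-1], seq[1:]):
        N[a, b] += 1
    return N

# Hypothetical current parameter guess: K = 2 components over 2 states.
P = np.stack([
    np.array([[0.9, 0.1], [0.1, 0.9]]),   # component 0: sticky
    np.array([[0.1, 0.9], [0.9, 0.1]]),   # component 1: alternating
])
pi = np.array([0.5, 0.5])

seqs = [[0, 0, 0, 0, 1, 1, 1, 1],          # looks sticky
        [0, 1, 0, 1, 0, 1, 0, 1]]          # looks alternating
N = np.stack([counts(s, 2) for s in seqs])  # shape (n_seq, S, S)

# E-step: log(pi_k) + sum_ab N_i(a,b) log P^(k)_ab, normalized per sequence.
log_joint = np.log(pi)[None, :] + np.einsum("nab,kab->nk", N, np.log(P))
log_norm = np.logaddexp.reduce(log_joint, axis=1, keepdims=True)
gamma = np.exp(log_joint - log_norm)        # responsibilities, rows sum to 1

# M-step: responsibility-weighted counts, then row-normalize.
pi_new = gamma.mean(axis=0)
weighted = np.einsum("nk,nab->kab", gamma, N)
P_new = weighted / weighted.sum(axis=2, keepdims=True)

print(gamma.round(4))
print(P_new.round(3))
```

Working in log space before normalizing matters because the product over 60-100 transitions underflows quickly; the subtraction of `log_norm` is the log-sum-exp form of the denominator in the responsibility formula.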

EM monotonically increases the observed-data log-likelihood but converges only to a local maximum. Multi-restart initialization (10-20 random restarts) is the standard mitigation. The implementation below is pedagogical: it uses explicit Python loops over sequences and components so the responsibility math is auditable line-by-line. A production implementation would vectorize the E-step via np.einsum over a stacked counts tensor, reusing the per-sequence transition counts that the code below already computes once, outside the restart loop; in practice that produces a material speedup on larger counterparty populations, but the algebra is identical. The likelihood expression also suppresses the initial-state factor $P(s_{i,1})$ inside log_likelihood_under_P. Because that factor is shared across components in the model above, it multiplies the numerator and every term of the denominator of the E-step equally and cancels out of the responsibilities; it only shifts the reported log-likelihood by a constant that is the same for every $K$. If an engagement needs component-specific initial-state behavior or exact finite-sample likelihoods, include an empirical or Dirichlet-smoothed initial-state vector alongside each component.


import numpy as np

def transition_counts(sequence: list[int], n_states: int) -> np.ndarray:
    """Return the transition-count matrix N[a, b] for a single sequence."""
    N = np.zeros((n_states, n_states), dtype=int)
    for prev, curr in zip(sequence[:-1], sequence[1:]):
        N[prev, curr] += 1
    return N

def log_likelihood_under_P(N: np.ndarray, P: np.ndarray) -> float:
    """log P(sequence | P) from transition counts; the initial-state factor is omitted."""
    observed = N > 0  # only observed transitions contribute; avoids 0 * -inf = nan
    if np.any(P[observed] <= 0):
        return -np.inf  # an observed transition is impossible under P
    return float((N[observed] * np.log(P[observed])).sum())

def fit_markov_mixture(sequences: list[list[int]], n_states: int, K: int,
                        max_iter: int = 100, tol: float = 1e-5,
                        n_restarts: int = 10, seed: int = 42) -> dict:
    """EM for a finite mixture of K first-order Markov chains."""
    rng = np.random.default_rng(seed)
    N_seq = len(sequences)
    counts = [transition_counts(s, n_states) for s in sequences]

    best_ll, best_params = -np.inf, None
    for restart in range(n_restarts):
        # Random init: pi uniform; P[k] random row-stochastic
        pi = np.ones(K) / K
        P = rng.dirichlet(np.ones(n_states), size=(K, n_states))  # K x n_states x n_states
        prev_ll = -np.inf

        for it in range(max_iter):
            # E-step: log-responsibility (numerically stable)
            log_resp = np.zeros((N_seq, K))
            for k in range(K):
                for i in range(N_seq):
                    log_resp[i, k] = np.log(pi[k] + 1e-300) + log_likelihood_under_P(counts[i], P[k])
            # Normalize per row via log-sum-exp
            log_norm = np.logaddexp.reduce(log_resp, axis=1, keepdims=True)
            log_resp -= log_norm
            resp = np.exp(log_resp)

            ll = float(log_norm.sum())
            if abs(ll - prev_ll) < tol:
                break
            prev_ll = ll

            # M-step
            pi = resp.mean(axis=0)
            for k in range(K):
                weighted_counts = sum(resp[i, k] * counts[i] for i in range(N_seq))
                row_sums = weighted_counts.sum(axis=1, keepdims=True)
                P[k] = np.where(row_sums > 0, weighted_counts / row_sums, 1.0 / n_states)

        if ll > best_ll:
            best_ll, best_params = ll, {"pi": pi, "P": P, "responsibilities": resp,
                                          "log_likelihood": ll, "n_iterations": it + 1}

    return best_params

The bundle’s companion artifact supplies the production-style vectorized implementation (np.einsum over the stacked counts tensor) in one executable script, so the article itself keeps only the slower pedagogical version needed to audit the algebra line by line. The contraction at the heart of that vectorized version is:


# Floor P before taking logs so that zero-probability, zero-count cells
# contribute 0 to the contraction instead of 0 * -inf = nan.
log_P = np.log(np.clip(P, 1e-300, None))   # K x S x S
log_resp = np.log(pi + 1e-300)[None, :] + np.einsum("nab,kab->nk", counts, log_P)
weighted_counts = np.einsum("nk,nab->kab", resp, counts)

counts is the stacked sequence-by-state-by-state tensor (N x S x S), so the two einsum calls reproduce the same E-step and M-step numerators as the explicit loops above without changing the estimator itself.
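
The equivalence claimed here is easy to check numerically. The sketch below builds a small random problem (the sizes, seed, and mixture weights are arbitrary choices for illustration) and compares the explicit loops against the two contractions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_seq, K, S = 6, 2, 4
counts = rng.integers(0, 5, size=(n_seq, S, S)).astype(float)  # N x S x S
P = rng.dirichlet(np.ones(S), size=(K, S))                      # K x S x S, rows sum to 1
pi = np.array([0.3, 0.7])
log_P = np.log(P)

# E-step numerator, loop form
loop_log_resp = np.zeros((n_seq, K))
for i in range(n_seq):
    for k in range(K):
        loop_log_resp[i, k] = np.log(pi[k]) + (counts[i] * log_P[k]).sum()

# E-step numerator, einsum form
vec_log_resp = np.log(pi)[None, :] + np.einsum("nab,kab->nk", counts, log_P)

# Normalize to responsibilities, then compare the M-step numerators
resp = np.exp(vec_log_resp - np.logaddexp.reduce(vec_log_resp, axis=1, keepdims=True))
loop_weighted = np.zeros((K, S, S))
for k in range(K):
    for i in range(n_seq):
        loop_weighted[k] += resp[i, k] * counts[i]
vec_weighted = np.einsum("nk,nab->kab", resp, counts)

print(np.allclose(loop_log_resp, vec_log_resp), np.allclose(loop_weighted, vec_weighted))
```

The subscript strings read directly off the formulas: `"nab,kab->nk"` sums over the state pair $(a, b)$ for every sequence-component pair, and `"nk,nab->kab"` sums over sequences for every component.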

Component-count selection

The number of components $K$ is chosen via the Bayesian Information Criterion (Schwarz, 1978):

$$\text{BIC}(K) = -2 \log L_K + p_K \log N$$

where $p_K = K(|S|^2 - |S|) + (K - 1)$ is the parameter count and $N$ is the number of sequences. The parameter count comes from two sources. Each of the $K$ transition matrices has $|S| \times (|S| - 1)$ free entries: each row has $|S|$ entries but must sum to 1, so only $|S| - 1$ are free. The mixture weights add $K - 1$ more, because the $K$-th weight is fixed by the constraint that they sum to 1. The BIC formula trades model fit (the $-2 \log L_K$ term, which rewards bigger $K$ values that fit the data better) against complexity (the $p_K \log N$ term, which penalizes bigger $K$ values that add parameters). The audit context strongly favors parsimony: a false-positive component flagged on a clean entity triggers wasted substantive procedures and possibly an over-scoped AS 2410 related-party review, both of which carry real engagement-hour cost, so the BIC penalty is appropriate. For most audit applications, $K \in \{2, 3\}$ covers the practical range; $K = 2$ (clean + anomalous) is the default. For smaller counterparty populations, that penalty can be aggressive. Observing a collapse to $K = 1$ requires including 1 in the candidate list, which the select_K default below does not do; if the engagement team then sees BIC repeatedly collapsing borderline two-component fits into $K = 1$, the companion artifact is the place to compare BIC to ICL or holdout log-likelihood before changing the workpaper default.


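def select_K(sequences: list[list[int]], n_states: int,
              K_candidates: tuple[int, ...] = (2, 3, 4, 5)) -> tuple[int, dict]:
    """Fit mixtures for each K and return the BIC-selected fit.

    The default is a tuple to avoid a mutable default argument; any
    iterable of ints (for example the lists used below) is accepted.
    """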
    N_seq = len(sequences)
    best_K, best_bic, best_fit = None, np.inf, None
    for K in K_candidates:
        fit = fit_markov_mixture(sequences, n_states, K)
        p_K = K * (n_states ** 2 - n_states) + (K - 1)
        bic = -2 * fit["log_likelihood"] + p_K * np.log(N_seq)
        if bic < best_bic:
            best_K, best_bic, best_fit = K, bic, fit
    return best_K, best_fit
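
To see how the penalty plays out in numbers, here is a worked BIC comparison. The log-likelihood values are hypothetical, chosen only to illustrate the trade-off for a 500-sequence, five-state population; they are not output from the fits in this article.

```python
import math

def bic(log_likelihood: float, n_states: int, K: int, n_sequences: int) -> float:
    """BIC(K) = -2 log L_K + p_K log N, with p_K = K(|S|^2 - |S|) + (K - 1)."""
    p_K = K * (n_states ** 2 - n_states) + (K - 1)
    return -2.0 * log_likelihood + p_K * math.log(n_sequences)

# Hypothetical maximized log-likelihoods for K = 1, 2, 3 (illustrative only).
hypothetical_ll = {1: -61000.0, 2: -59500.0, 3: -59460.0}

scores = {K: bic(ll, n_states=5, K=K, n_sequences=500)
          for K, ll in hypothetical_ll.items()}
best_K = min(scores, key=scores.get)

for K, s in scores.items():
    print(K, round(s, 1))
print("selected K:", best_K)
```

In this illustration, moving from $K = 2$ to $K = 3$ improves $-2 \log L$ by 80 but adds 21 parameters at a cost of about $21 \times \log 500 \approx 130.5$, so the third component is rejected.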

Round-tripping signature: cycle detection from $P^{(k)}$

A round-tripping component’s transition matrix has elevated probability mass on cycles: paths in the transition graph that return to the starting state in at most $|S|$ steps. The second-largest eigenvalue modulus of $P^{(k)}$ encodes the mixing rate, that is, how quickly a chain forgets its starting state and settles into long-run behavior. The plain-English version: if you start tracking a counterparty’s posting pattern at some state and ask “how many steps until I lose track of where they started?”, a fast-mixing chain forgets quickly (you’re effectively randomly placed within a handful of steps); a slow-mixing chain remembers the starting state for a long time, which is what a persistent cycle that keeps revisiting the same sequence looks like. Numerically, every row-stochastic matrix has an eigenvalue exactly equal to 1, associated with the stationary distribution, and no eigenvalue of larger modulus. The audit signal lives in the next eigenvalue: how close the second-largest modulus sits to 1. For a fast-mixing baseline chain it sits well below 1; for a chain with persistent cyclic structure it approaches 1 (Norris, 1997, §1.8). Slow mixing is not unique to cycles: a chain with sticky, strongly self-looping states also has a second modulus near 1, with a positive real eigenvalue where a cycle produces complex or negative ones, so a high score is a triage trigger rather than proof of a cycle. A full spectral analysis would refine the period of the dominant cycle and could distinguish multi-period from single-period cyclic structure, but the audit-triage threshold uses the second-largest modulus alone. The code below returns exactly that quantity; the prose throughout this article reflects that scope.


def cycle_dominance_score(P: np.ndarray) -> float:
    """Returns the magnitude of the second-largest eigenvalue of P.

    For a baseline first-order chain, this is typically well below 1 (fast mixing).
    For a chain with strong cyclic structure, this is near 1 (slow mixing,
    persistent cycles). Audit interpretation: > 0.9 warrants investigation.
    """
    eigenvalues = np.linalg.eigvals(P)
    sorted_mag = np.sort(np.abs(eigenvalues))[::-1]
    return float(sorted_mag[1])  # second-largest by modulus

The interpretation is operational: components with cycle_dominance_score > 0.9 are flagged for round-tripping investigation. Components with the cycle-dominance score below the baseline-chain reference receive standard analytical procedures.
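
One way to build intuition for the 0.9 cutoff is an idealized chain whose spectrum is known in closed form. The sketch below is an illustration under stated assumptions, not part of the companion code: it mixes a pure $n$-state cycle with uniform noise, $P = \alpha C + (1 - \alpha) \tfrac{1}{n} \mathbf{1}\mathbf{1}^\top$, for which every non-unit eigenvalue has modulus exactly $\alpha$. It also contrasts that with a "sticky" chain of the same modulus, where the eigenvalue is real and positive rather than complex.

```python
import numpy as np

def second_eigenvalue(P):
    """Eigenvalue of P with the second-largest modulus (may be complex)."""
    eig = np.linalg.eigvals(P)
    order = np.argsort(np.abs(eig))[::-1]
    return eig[order[1]]

n, alpha = 5, 0.8
cycle = np.roll(np.eye(n), 1, axis=1)        # state i moves to state (i + 1) mod n
uniform = np.full((n, n), 1.0 / n)

P_cycle = alpha * cycle + (1 - alpha) * uniform       # noisy 5-state loop
P_sticky = alpha * np.eye(n) + (1 - alpha) * uniform  # noisy self-loops, no cycle

lam_cycle = second_eigenvalue(P_cycle)
lam_sticky = second_eigenvalue(P_sticky)

print("cycle :", abs(lam_cycle), "imag part:", lam_cycle.imag)
print("sticky:", abs(lam_sticky), "imag part:", lam_sticky.imag)
```

Under this idealization the score equals the per-step probability of staying on the loop, so a component only clears 0.9 when roughly nine in ten transitions follow the cycle. The sticky chain reaches the same modulus without any cycle, which is why checking whether the eigenvalue is complex or negative is a useful companion test before labeling a component as cyclic.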

Worked example 1: synthetic 500-counterparty revenue-cycle round-tripping

Two worked examples follow. The first uses an abstract revenue-cycle round-tripping pattern to demonstrate the mixture-of-Markov-chains mechanics on a clean five-state ledger. The second walks a more textured real-world scheme — merchant cash advance round-tripping — where the same machinery applies and the cyclic signature has direct enforcement and accounting-policy consequences.

The companion code generates 500 synthetic counterparties, 480 of which follow a clean revenue-cycle baseline and 20 of which exhibit round-tripping (cyclic Revenue → AR → Cash → COGS → Revenue → …).


STATES = ['Cash', 'AR', 'Revenue', 'COGS', 'Inventory']
N_STATES = len(STATES)
N_CLEAN = 480
N_DIRTY = 20
SEQ_LENGTH = 100

# Clean baseline transition matrix (matches First-Order Markov Modeling for Transaction-Stream Analysis in Audit worked example)
P_clean = np.array([
    [0.10, 0.10, 0.05, 0.05, 0.70],
    [0.60, 0.05, 0.30, 0.03, 0.02],
    [0.10, 0.70, 0.05, 0.10, 0.05],
    [0.05, 0.02, 0.03, 0.15, 0.75],
    [0.20, 0.05, 0.10, 0.55, 0.10],
])
# Round-tripping component: heavy on Revenue → AR → Cash → COGS → Revenue cycle
P_dirty = np.array([
    [0.05, 0.05, 0.05, 0.80, 0.05],
    [0.85, 0.02, 0.10, 0.02, 0.01],
    [0.05, 0.85, 0.05, 0.03, 0.02],
    [0.05, 0.05, 0.80, 0.05, 0.05],
    [0.20, 0.20, 0.20, 0.20, 0.20],
])
assert np.allclose(P_clean.sum(axis=1), 1.0) and np.allclose(P_dirty.sum(axis=1), 1.0)

def sample_sequence(P: np.ndarray, length: int, rng) -> list[int]:
    n_states = P.shape[0]
    seq = [int(rng.integers(n_states))]
    for _ in range(length - 1):
        seq.append(int(rng.choice(n_states, p=P[seq[-1]])))
    return seq

rng = np.random.default_rng(42)
sequences = [sample_sequence(P_clean, SEQ_LENGTH, rng) for _ in range(N_CLEAN)] + \
            [sample_sequence(P_dirty, SEQ_LENGTH, rng) for _ in range(N_DIRTY)]
truth = [0] * N_CLEAN + [1] * N_DIRTY  # ground-truth component label

# BIC-select K, fit, and rank counterparties by round-tripping component posterior
selected_K, fit = select_K(sequences, N_STATES, K_candidates=[2, 3])
print(f"BIC-selected K: {selected_K}")
print(f"Cycle-dominance scores per component: {[cycle_dominance_score(P) for P in fit['P']]}")

# Identify the round-tripping component (highest cycle-dominance score)
cycle_scores = np.array([cycle_dominance_score(P) for P in fit['P']])
dirty_component = int(np.argmax(cycle_scores))
posterior_dirty = fit["responsibilities"][:, dirty_component]

# Top-25 ranked focus list
ranked = np.argsort(posterior_dirty)[::-1]
top_25 = ranked[:25]
true_dirty_in_top_25 = sum(1 for idx in top_25 if truth[idx] == 1)
print(f"True round-trippers in top 25 ranked: {true_dirty_in_top_25} / 20 actual dirty")

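With seed=42, BIC selects $K = 2$ (recovering the true component count). The cycle-dominance score for the round-tripping component should land near 0.8, consistent with the 0.80-0.85 per-step probabilities on the loop in P_dirty; the clean component is near 0.40. That is below the 0.9 absolute cutoff, which is why the code identifies the round-tripping component by the argmax over components rather than by the threshold. The top-25 ranked focus list captures most of the 20 true round-trippers (typically 17-19 out of 20), with the few false positives being marginal-posterior cases that warrant individual investigation.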

Operationally, the posterior-ranked list becomes a workpaper triage table rather than an abstract anomaly score. A counterparty with posterior round-tripping probability above 0.80 moves into the primary substantive-testing set; 0.50-0.80 becomes the secondary review set that receives counterparty confirmation, cash-settlement tracing, and related-party screening under AS 2410; below 0.50 stays in baseline analytics unless another risk signal overrides. The same logic applies to lapping accounts under AS 2401: the posterior does not conclude fraud, but it changes which accounts cross the threshold for expanded cash-receipt testing and sequence-of-posting review.
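
The triage rule in the paragraph above can be written down in a few lines. This is a sketch only: the function name, the bucket labels, and the example posteriors are hypothetical, and boundary values of exactly 0.80 and 0.50 are assigned to the secondary set as one reasonable reading of the ranges in the text.

```python
import numpy as np

def triage_bucket(posterior: float) -> str:
    """Map a round-tripping posterior to a workpaper triage set."""
    if posterior > 0.80:
        return "primary"      # primary substantive-testing set
    if posterior >= 0.50:
        return "secondary"    # confirmation, cash tracing, related-party screening
    return "baseline"         # baseline analytics unless another signal overrides

# Hypothetical posteriors for six counterparties (illustrative values).
hypothetical_posteriors = np.array([0.97, 0.80, 0.62, 0.50, 0.31, 0.02])
buckets = [triage_bucket(p) for p in hypothetical_posteriors]

for p, b in zip(hypothetical_posteriors, buckets):
    print(f"{p:.2f} -> {b}")
```

Keeping the thresholds in one small function makes the cutoffs easy to document in the workpaper and to change per engagement.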

Worked example 2: merchant cash advance round-tripping

The revenue-cycle pattern above is the textbook case. The version that shows up in real enforcement actions is structurally similar but operationally specific: it cycles through a merchant cash advance.

Merchant cash advance (MCA) — the purchase of a business’s future receivables at a discount. A funder advances, say, $100,000 today; the business agrees to remit a holdback — typically 10-20% of every credit-card sale — until a factor amount, commonly 1.30-1.45× the advance, is repaid. MCAs are legally structured as receivables purchases rather than loans, which keeps them outside state usury caps. The economic substance is short-duration high-rate financing; effective APRs at a 1.40× factor over a four-to-six-month payback period commonly exceed 100%.
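
The APR claim can be checked with a short cash-flow calculation. The sketch below rests on simplifying assumptions that are not in the text: equal remittances every business day (a real holdback varies with sales), 21 business days per month, and a nominal annualization of the per-day rate over 252 business days. The function name is illustrative.

```python
def mca_effective_apr(advance: float, factor: float, n_payments: int,
                      periods_per_year: int = 252) -> float:
    """Nominal annualized rate implied by equal periodic remittances.

    Solves for the per-period rate at which the present value of the
    remittances equals the advance, using bisection.
    """
    payment = advance * factor / n_payments

    def present_value(rate: float) -> float:
        return sum(payment / (1.0 + rate) ** t for t in range(1, n_payments + 1))

    lo, hi = 0.0, 1.0  # bracket for the per-period rate
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if present_value(mid) > advance:
            lo = mid   # present value too high, so the rate must be higher
        else:
            hi = mid
    return periods_per_year * (lo + hi) / 2.0

# $100,000 advance at a 1.40x factor, repaid over about five and six months.
apr_5m = mca_effective_apr(100_000, 1.40, n_payments=105)
apr_6m = mca_effective_apr(100_000, 1.40, n_payments=126)
print(f"5-month payback: {apr_5m:.0%}   6-month payback: {apr_6m:.0%}")
```

The rate comes out well above the naive "40% over five months" figure because the balance is paid down daily, so the average amount outstanding is roughly half the advance.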

The product itself is legal. The accounting fraud the audit team needs to detect lives in how the borrower records the proceeds and the repayments. Three patterns recur:

Reclassification of proceeds. The borrower books the inbound MCA wire as sales revenue rather than as a financing liability, inflating topline and obscuring leverage.
Reclassification of holdback. The daily ACH debits to the funder get coded as cost of goods sold or operating expense, hiding debt service inside otherwise unremarkable income-statement lines.
Related-party laundering. Before recording the MCA proceeds, the borrower routes the cash through a controlled affiliate or related-party processor — a shell payment-processor or friendly vendor that converts the inbound wire into something that looks like ordinary customer payments. The related-party hop is what produces the tight cyclic transition signature the mixture model is built to find.

A fourth pattern, stacking — multiple concurrent MCAs from different funders, each with its own daily holdback hitting the same operating cash account — compounds the others and amplifies the cyclic signature.

The published prosecution and enforcement record on the funder side is extensive and well-attributed; the borrower-side accounting fraud lives in the legal environment those cases established. FTC v. RCG Advances, LLC and Yellowstone Capital, LLC (U.S. District Court for the Southern District of New York, complaint filed June 2020, FTC press release June 10, 2020; settlement and approximately $59 million in consumer relief announced via FTC press release August 3, 2023) produced a permanent injunction against deceptive MCA marketing and collection practices. New York Senate Bill S6395 (signed by Governor Cuomo, August 2019; codified at N.Y. C.P.L.R. § 3218) banned out-of-state confession-of-judgment clauses in MCA contracts; subsequent amendments expanded the restriction. SEC v. Complete Business Solutions Group, Inc., d/b/a Par Funding, et al. (U.S. District Court for the Eastern District of Pennsylvania, civil enforcement filed July 2020; SEC Litigation Release No. 24852) is an approximately $550 million unregistered-securities matter built on MCA receivables as the underlying asset, with parallel DOJ criminal indictments brought in the same district against the principal officers. The New York Attorney General’s office has brought separate consumer-protection actions against multiple smaller MCA funders 2020-2024 (see press releases issued by the Office of the New York State Attorney General, Letitia James, on the OAG website). The DD lesson is that any borrower with material MCA exposure deserves scrutiny on both how it sourced the financing and how it accounted for the cash.

State encoding for the MCA example. Five posting-flow states capture the cycle:

mca_proceeds — inbound wire from MCA funder
related_party — payment to or from a controlled affiliate or shell processor
revenue — sales-revenue postings
cash_op — operating cash accounts
daily_holdback — outbound ACH to the MCA funder

Clean MCA users have a sparse, hub-and-spoke transition pattern centered on cash_op — proceeds land in operating cash and holdback debits exit from it, with related_party rarely touched. Round-tripping users have a tight five-state cycle: proceeds → related_party → revenue → cash_op → daily_holdback → proceeds.
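
Before any fitting, each entity's ledger postings have to be turned into a sequence of state indices. The following is a minimal sketch of that step, assuming an upstream process has already tagged each posting with one of the five labels; the posting list shown is hypothetical.

```python
import numpy as np

MCA_STATES = ["mca_proceeds", "related_party", "revenue", "cash_op", "daily_holdback"]
STATE_INDEX = {name: i for i, name in enumerate(MCA_STATES)}

def encode_postings(labels: list[str]) -> list[int]:
    """Map posting-flow labels to integer states; unknown labels raise KeyError."""
    return [STATE_INDEX[label] for label in labels]

# Hypothetical time-ordered postings for one entity: one full round-trip loop.
hypothetical_postings = ["mca_proceeds", "related_party", "revenue",
                         "cash_op", "daily_holdback", "mca_proceeds"]
encoded = encode_postings(hypothetical_postings)

# Transition counts for this one sequence: each hop of the loop appears once.
N = np.zeros((len(MCA_STATES), len(MCA_STATES)), dtype=int)
for a, b in zip(encoded[:-1], encoded[1:]):
    N[a, b] += 1

print(encoded)
print(N)
```

Failing loudly on an unmapped label is a deliberate choice in this sketch: a silently dropped posting would splice two unrelated transitions together and distort the counts.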


MCA_STATES = ['mca_proceeds', 'related_party', 'revenue', 'cash_op', 'daily_holdback']
N_MCA_STATES = len(MCA_STATES)

# Clean: MCA recorded as financing. Proceeds land in cash_op; holdback debits cash_op directly.
P_mca_clean = np.array([
    [0.05, 0.05, 0.05, 0.85, 0.00],  # mca_proceeds → cash_op
    [0.10, 0.30, 0.10, 0.30, 0.20],  # related_party rarely touched; mostly returns to cash_op
    [0.00, 0.05, 0.15, 0.75, 0.05],  # revenue → cash_op
    [0.10, 0.05, 0.20, 0.45, 0.20],  # cash_op hub
    [0.30, 0.05, 0.05, 0.55, 0.05],  # daily_holdback resets toward mca_proceeds or cash_op
])
# Dirty: proceeds laundered through related_party, booked as revenue, holdback misclassified.
P_mca_dirty = np.array([
    [0.02, 0.85, 0.05, 0.05, 0.03],  # mca_proceeds → related_party (laundering hop)
    [0.02, 0.05, 0.80, 0.10, 0.03],  # related_party → revenue (the misclassification)
    [0.02, 0.05, 0.05, 0.85, 0.03],  # revenue → cash_op (looks normal)
    [0.02, 0.05, 0.05, 0.05, 0.83],  # cash_op → daily_holdback (debt service masked as expense)
    [0.85, 0.05, 0.02, 0.05, 0.03],  # daily_holdback → mca_proceeds (next advance restarts the cycle)
])
assert np.allclose(P_mca_clean.sum(axis=1), 1.0) and np.allclose(P_mca_dirty.sum(axis=1), 1.0)

N_MCA_TOTAL = 200
N_MCA_DIRTY = 20
SEQ_LEN_MCA = 80  # ~one quarter of posting events on the relevant flow
rng_mca = np.random.default_rng(44)
mca_sequences = (
    [sample_sequence(P_mca_clean, SEQ_LEN_MCA, rng_mca) for _ in range(N_MCA_TOTAL - N_MCA_DIRTY)]
    + [sample_sequence(P_mca_dirty, SEQ_LEN_MCA, rng_mca) for _ in range(N_MCA_DIRTY)]
)
truth_mca = [0] * (N_MCA_TOTAL - N_MCA_DIRTY) + [1] * N_MCA_DIRTY

selected_K_mca, fit_mca = select_K(mca_sequences, N_MCA_STATES, K_candidates=[2, 3])
cycle_scores_mca = np.array([cycle_dominance_score(P) for P in fit_mca['P']])
dirty_component_mca = int(np.argmax(cycle_scores_mca))
posterior_mca = fit_mca['responsibilities'][:, dirty_component_mca]
ranked_mca = np.argsort(posterior_mca)[::-1][:30]
print(f"MCA cycle-dominance per component: {cycle_scores_mca.round(3)}")
print(f"True MCA round-trippers in top 30 ranked: "
      f"{sum(1 for idx in ranked_mca if truth_mca[idx] == 1)} / 20 actual")

With seed=44, BIC selects $K = 2$. The dirty component’s cycle-dominance score lands near 0.85 (the five-state loop pushes the complex eigenvalues out toward the unit circle); the clean component sits in the 0.30-0.45 range. The dirty score is still under the 0.9 absolute cutoff, so the component is identified by the argmax over components, as in the code. The top-30 ranked focus list typically captures 17-19 of the 20 true MCA round-trippers. The handful of false positives are clean entities whose related_party activity is unusually high for legitimate business reasons (shared-services cost allocations, intercompany cash sweeps), and those are exactly the cases the engagement team would want flagged for follow-up classification review anyway.

What the engagement team does with the result. A borrower in the top of the ranked list gets three substantive procedures added to the audit scope. First, trace inbound wires labeled as customer payments back to the originating bank, looking for MCA-funder wire patterns: regular round-dollar amounts, daily or weekly ACH cadence, originator names that resolve to recognizable MCA funders (RCG, Yellowstone, Kalamata, Kabbage, Forward Financing, and several dozen smaller funders are all matters of public record from FTC, NY DFS, and bankruptcy filings). Second, test the operating-expense classification of daily ACH debits against the entity’s reasonable COGS and opex profile; debt-service-sized debits hitting expense accounts on a predictable daily cadence are an analytical-procedure red flag. Third, confirm related-party transactions under AS 2410, with particular attention to payment-processor entities whose ownership traces back to the same principals as the audited entity. The mixture model’s posterior probability is the directional signal that triggers these procedures; the procedures themselves are what produce substantive audit evidence.

Lapping signature: strict-sequencing structure

Lapping (covering a stolen customer receipt by misapplying a later receipt from another account, then rolling the shortfall forward) produces a different signature: a strict-sequencing component where the transition matrix has high off-diagonal mass concentrated on a specific cyclic ordering of customer-account states. The same mixture-of-Markov-chains framework applies; the diagnostic shifts from cycle-dominance score (round-tripping) to transition-entropy score:


def transition_entropy_score(P: np.ndarray) -> float:
    """Average row-entropy of P; low values indicate strict-sequencing structure."""
    with np.errstate(divide="ignore", invalid="ignore"):
        row_entropy = -(P * np.where(P > 0, np.log(P), 0.0)).sum(axis=1)
    return float(row_entropy.mean())

Lapping components have low transition entropy (each state transitions to a small number of next states with high probability). Clean components have higher transition entropy. In this article’s 4-state AR-lifecycle encoding, transition_entropy_score < log(2) ≈ 0.69 is used as a conservative diagnostic threshold: it means the average row behaves as though probability mass is concentrated on fewer than two materially likely next states. That threshold is a screening heuristic, not a theorem; the engagement team should recalibrate it if the clean baseline itself is unusually deterministic.

Lapping mini-example: synthetic 300-account population

The framework supports lapping detection symmetrically to round-tripping. Here is the compact worked example: 300 customer accounts, of which 270 follow a clean AR-aging-and-write-off baseline and 30 follow a strict-sequencing lapping rotation (account $i$'s shortfall is covered by account $i+1$'s subsequent posting, in repeating cycles).


# Lapping signature: customer-account state-transitions encode the AR lifecycle.
# 4 states represent the AR lifecycle for each customer account:
LAPPING_STATES = ['current', 'aged_30', 'aged_60', 'cleared']
N_L_STATES = len(LAPPING_STATES)

# Clean baseline: aging progresses naturally; most accounts clear from current/aged_30.
P_lap_clean = np.array([
    [0.20, 0.35, 0.05, 0.40],   # current → aged or cleared
    [0.05, 0.20, 0.30, 0.45],   # aged_30 → aged_60 or cleared
    [0.02, 0.05, 0.18, 0.75],   # aged_60 → cleared dominates (write-off or payment)
    [0.60, 0.10, 0.05, 0.25],   # cleared → new posting cycle starts
])
# Lapping: deterministic rotation through aging buckets (mask theft via rolling debits).
P_lap_dirty = np.array([
    [0.02, 0.90, 0.05, 0.03],   # current → aged_30 forced
    [0.02, 0.02, 0.90, 0.06],   # aged_30 → aged_60 forced
    [0.02, 0.02, 0.02, 0.94],   # aged_60 → cleared forced (theft masked by next debit)
    [0.94, 0.02, 0.02, 0.02],   # cleared → current restarts the rotation
])
assert np.allclose(P_lap_clean.sum(axis=1), 1.0) and np.allclose(P_lap_dirty.sum(axis=1), 1.0)

rng_l = np.random.default_rng(43)
lap_sequences = [sample_sequence(P_lap_clean, 60, rng_l) for _ in range(270)] + \
                [sample_sequence(P_lap_dirty, 60, rng_l) for _ in range(30)]
truth_lap = [0] * 270 + [1] * 30

selected_K_lap, fit_lap = select_K(lap_sequences, N_L_STATES, K_candidates=[2, 3])
entropy_scores = np.array([transition_entropy_score(P) for P in fit_lap['P']])
lapping_component = int(np.argmin(entropy_scores))  # lowest entropy = strict-sequencing
posterior_lap = fit_lap["responsibilities"][:, lapping_component]
ranked_lap = np.argsort(posterior_lap)[::-1][:35]
print(f"Transition-entropy per component: {entropy_scores.round(3)}")
print(f"True lappers in top 35 ranked: {sum(1 for idx in ranked_lap if truth_lap[idx] == 1)} / 30 actual")

With seed=43, BIC selects $K = 2$, the lapping component's transition entropy is near 0.36 (the value implied by P_lap_dirty itself, and well below the $\log(2) \approx 0.69$ threshold), and the top-35 ranked focus list typically captures 27-30 of the 30 true lapping accounts. The handful of false positives are accounts at the edge of the aging-bucket distribution, to which the clean component assigns lower posterior; that is a known and acceptable noise floor.
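
The entropy levels to expect from a well-recovered fit can be read off the generating matrices directly, without running EM. The sketch below recomputes the average row entropy (natural log) for the two lapping matrices defined above; the helper name is illustrative, and a fitted component will differ from these values by sampling noise.

```python
import numpy as np

def mean_row_entropy(P: np.ndarray) -> float:
    """Average Shannon entropy (nats) of the rows of a row-stochastic matrix."""
    safe = np.where(P > 0, P, 1.0)   # log(1) = 0, so zero entries contribute nothing
    return float(-(P * np.log(safe)).sum(axis=1).mean())

P_lap_clean = np.array([
    [0.20, 0.35, 0.05, 0.40],
    [0.05, 0.20, 0.30, 0.45],
    [0.02, 0.05, 0.18, 0.75],
    [0.60, 0.10, 0.05, 0.25],
])
P_lap_dirty = np.array([
    [0.02, 0.90, 0.05, 0.03],
    [0.02, 0.02, 0.90, 0.06],
    [0.02, 0.02, 0.02, 0.94],
    [0.94, 0.02, 0.02, 0.02],
])

h_clean = mean_row_entropy(P_lap_clean)
h_dirty = mean_row_entropy(P_lap_dirty)
print(f"clean: {h_clean:.3f}  dirty: {h_dirty:.3f}  threshold log(2): {np.log(2):.3f}")
```

The generating lapping matrix sits at about 0.36 nats and the clean baseline above 1.0, so the $\log(2)$ threshold separates the two with room on both sides in this synthetic setup.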

EM failure modes and defenses

Three patterns recur in EM-based mixture estimation that warrant explicit defenses.

Local maxima. EM converges to a local maximum of the log-likelihood, not necessarily the global one. Multi-restart initialization (10-20 random restarts) and reporting the best-of-restarts result is the standard mitigation. The implementation above does this, with a default of 10 restarts. For high-stakes applications, deterministic initialization via $K$-means on transition-count features can be added as one further restart.

Label switching. Across restarts, the latent component indices are arbitrary (component 0 in restart 1 may correspond to component 1 in restart 2). The post-hoc identification step in the worked example (assigning component labels by cycle-dominance score, transition-entropy score, or another substantive criterion) handles this. Without it, ranked focus lists become incoherent across runs.
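
A simple way to make labels comparable across runs is to reorder the fitted components by the diagnostic score before anything downstream reads them. The sketch below assumes a fit stored as three arrays (transition matrices, mixture weights, responsibilities); the function names and the two-state example values are hypothetical.

```python
import numpy as np

def second_modulus(P_k: np.ndarray) -> float:
    """Second-largest eigenvalue modulus of one transition matrix."""
    return float(np.sort(np.abs(np.linalg.eigvals(P_k)))[::-1][1])

def relabel_by_score(P, pi, resp, score_fn):
    """Reorder components so index 0 has the lowest score and index K-1 the highest."""
    scores = np.array([score_fn(P_k) for P_k in P])
    order = np.argsort(scores)
    return P[order], pi[order], resp[:, order], scores[order]

# Hypothetical fit in which the cyclic component happened to come out as index 0.
P_fit = np.array([
    [[0.1, 0.9], [0.9, 0.1]],   # alternating chain, eigenvalues 1 and -0.8
    [[0.6, 0.4], [0.4, 0.6]],   # fast-mixing chain, eigenvalues 1 and 0.2
])
pi_fit = np.array([0.1, 0.9])
resp_fit = np.array([[0.95, 0.05],
                     [0.02, 0.98]])

P_s, pi_s, resp_s, scores_s = relabel_by_score(P_fit, pi_fit, resp_fit, second_modulus)
print(scores_s.round(3), pi_s)
```

After relabeling, the last component is always the highest-scoring one, so a ranked focus list built from `resp_s[:, -1]` refers to the same substantive component in every run.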

Degenerate components. When a component's total responsibility weight becomes very small, the M-step transition-matrix update can leave rows of $P^{(k)}$ empty or estimated from almost no data. The np.where(row_sums > 0, ...) guard in the M-step implementation above falls back to a uniform row when a row becomes empty. In production, additionally monitor per-component effective sample size $\sum_i \gamma_{i,k}$; components with effective sample size below 5 are likely artifacts and should trigger a warning.
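
The effective-sample-size check is a one-line reduction over the responsibility matrix. The following sketch shows one way to wire it to a warning; the function name and the example responsibilities are hypothetical, and the threshold of 5 is the one suggested above.

```python
import warnings
import numpy as np

def effective_sample_sizes(resp: np.ndarray, min_size: float = 5.0) -> np.ndarray:
    """Per-component effective sample size: the column sums of the responsibilities.

    Emits a warning for every component whose effective sample size falls
    below min_size.
    """
    ess = resp.sum(axis=0)
    for k, n_k in enumerate(ess):
        if n_k < min_size:
            warnings.warn(
                f"component {k} has effective sample size {n_k:.2f} < {min_size}; "
                "it may be an artifact"
            )
    return ess

# Hypothetical responsibilities: 100 sequences, second component nearly empty.
hypothetical_resp = np.array([[0.99, 0.01]] * 100)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ess = effective_sample_sizes(hypothetical_resp)

print(ess, "warnings raised:", len(caught))
```

Because the effective sample sizes sum to the number of sequences, the same vector doubles as a quick summary of how the population splits across components.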

Bridge to Random-Walk and Stationarity Tests on Account Reconciliations

The mixture framework handles heterogeneity across counterparties for a single time slice. “Random-Walk and Stationarity Tests on Account Reconciliations” handles heterogeneity across time for a single account: its random-walk and stationarity tests diagnose drift over time without requiring multiple sequences. Combined, the two articles cover the cross-sectional and temporal dimensions of the heterogeneity problem.

Authority:

Mixture-model theory:

Dempster, A.P., Laird, N.M., & Rubin, D.B. (1977). "Maximum Likelihood from Incomplete Data via the EM Algorithm." Journal of the Royal Statistical Society, Series B, 39(1), 1-38.
McLachlan, G., & Peel, D. (2000). Finite Mixture Models. Wiley. (Comprehensive reference.)
Smyth, P. (1997). "Clustering Sequences with Hidden Markov Models." Advances in Neural Information Processing Systems 9, 648-654. (Direct methodological precedent.)
Schwarz, G. (1978). "Estimating the Dimension of a Model." The Annals of Statistics, 6(2), 461-464. (BIC.)

Markov-chain spectral theory:

Norris, J.R. (1997). Markov Chains. §1.8 (eigenvalue spectrum and mixing).

Audit standards and forensic literature:

PCAOB AS 2410 — Related Parties.
PCAOB AS 2401 — Consideration of Fraud in a Financial Statement Audit, §A.5 (lapping reference).
ACFE Fraud Examiners Manual, §1.7 (data-driven fraud testing).
Bonner, S.E. (2008). Judgment and Decision Making in Accounting. Pearson, Ch. 7 (ranked focus lists in audit judgment).

Merchant cash advance enforcement and policy record:

FTC v. RCG Advances, LLC, and Yellowstone Capital, LLC. Complaint filed June 2020; settlement announced August 2023 (approximately $59 million in consumer relief, plus permanent injunctive relief against deceptive practices in MCA marketing and collection).
New York S6395 (2019). Banned confession-of-judgment clauses in MCA contracts as applied to non-New-York borrowers; later amendments expanded the restriction.
SEC v. Complete Business Solutions Group ("Par Funding") / Cash4Cases, et al. Filed July 2020. Approximately $550 million unregistered-securities matter built on MCA receivables as the underlying asset.

Companion code on GitHub

Runnable Python artifact reproducing this article's worked example end-to-end under seed=42: stochastic_markov/004_mixture_round_tripping.py in noahrgreen/dd-tech-lab-companion.

Clone the repo and run with python stochastic_markov/004_mixture_round_tripping.py.
