The Cypher Patterns for Transaction-Graph Anomaly Detection article in this sub-series introduced cycle detection on transaction graphs for round-tripping — a payment that returns to its originator through one or more intermediaries. This article extends the cycle-detection apparatus to two adjacent forensic patterns where the time dimension is the diagnostic rather than the convenience. Wash sales under IRC §1091 disallow loss recognition on the sale of a security if the same security (or a substantially-identical one) is purchased within a 30-day window before or after the sale; sophisticated wash-sale arrangements cycle the trade through related accounts to obscure the substantial-identity link, and graph-based detection over the trade-account network is the right framework for finding them. Layering under the FATF AML typology moves funds through a chain of accounts and asset classes specifically to obscure the chain’s beginning from its end; the longer the chain and the shorter the time window between hops, the more typology-fit the pattern.
Both patterns are time-windowed cycles or chains on a transaction-and-account graph. This article walks the Cypher idioms for each, the worked example on a synthetic securities-trading graph, and the operational integration with the audit-defensibility discipline that converts a graph-flagged pattern into a confirmed wash-sale disallowance or a SAR-filing recommendation.
Wash sales beyond same-account detection
IRC §1091 prohibits the deduction of a loss on the sale of “stock or securities” if “substantially identical” stock or securities are acquired within a 61-day window — 30 days before the sale through 30 days after. The disallowed loss is added to the basis of the replacement securities, deferred rather than denied. The mechanics are straightforward when the sale and purchase happen in the same account: the broker’s 1099-B reports the wash-sale adjustment automatically, the taxpayer’s return reflects the deferral, the audit trail is the brokerage statement.
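The window test and the basis deferral described above can be sketched in a few lines of Python. This is a minimal illustration rather than a tax computation: the function names and dollar figures are hypothetical, and the sketch ignores share-lot matching, partial replacement, and the substantial-identity determination.

```python
from datetime import date

WASH_WINDOW_DAYS = 30  # 30 days before through 30 days after the sale


def in_wash_window(sale_date: date, purchase_date: date) -> bool:
    """True when the purchase falls inside the 61-day window around the sale."""
    return abs((purchase_date - sale_date).days) <= WASH_WINDOW_DAYS


def adjusted_replacement_basis(purchase_cost: float, disallowed_loss: float) -> float:
    """Deferral mechanic: the disallowed loss is added to the replacement basis."""
    return purchase_cost + disallowed_loss


sale = date(2024, 3, 15)
print(in_wash_window(sale, date(2024, 2, 14)))      # purchase 30 days before the sale
print(in_wash_window(sale, date(2024, 4, 15)))      # purchase 31 days after the sale
print(adjusted_replacement_basis(4_000.0, 1_000.0))  # hypothetical cost and loss
```

The absolute-value comparison is what makes the check bidirectional: a purchase before the sale is treated the same as one after it.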
The easy 20% of wash-sale detection is the same-account case. The structural 80% is the cross-account case: sales in one account, replacement purchases in a related account controlled by the same beneficial owner. Rev. Rul. 2008-5 confirmed that §1091 applies across accounts: a sale at a loss in a taxable account followed by a purchase of substantially identical securities in the taxpayer’s IRA within the wash-sale window triggers disallowance, and the ruling holds that the basis in the IRA is not increased under §1091(d), so in that case the loss is forfeited rather than deferred. The broker cannot detect this on a single 1099-B; the IRS expects the taxpayer to self-report. Sophisticated arrangements push further: sales in a personal account, replacement purchases in a spouse’s account or a grantor trust, with the substantial-identity link obscured by the entity boundary.
The detection problem becomes a graph problem: identify pairs of trades (sell at loss, buy of substantially identical security) that fall within 30 days of each other in either direction, in accounts that share a control relationship through ownership, officer-role, or beneficial-ownership overlap.
The trade-account-security graph schema
The schema for wash-sale and layering work has four primary node types and three edge types.
// Node types
(:Trade {trade_uid, trade_date, direction, quantity, price, realized_loss})
(:Account {account_number, account_type, opened_date})
(:Security {cusip, ticker, security_type, isin})
(:Person {full_name, ssn_hash, tax_id_hash})
// Edge types
(:Trade)-[:IN_ACCOUNT]->(:Account)
(:Trade)-[:OF_SECURITY]->(:Security)
(:Account)-[:CONTROLLED_BY]->(:Person)
The :CONTROLLED_BY edge from :Account to :Person is the cross-account linkage that turns same-account wash-sale detection into cross-account detection. Control can be direct (the account holder is the person), indirect (the account is held by an LLC the person owns, a trust of which the person is grantor, or a custodial account where the person is custodian), or beneficiary-based (the account is held by a trust of which the person is a beneficiary). The graph schema accommodates all three with a single edge type and a relationship_type property documenting the specific control mechanism.
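The shared-control test can be illustrated outside the database with a bounded breadth-first search. The sketch below is an assumption-laden toy: the node names are invented, and it assumes indirect control is modeled with intermediate entity nodes (an LLC, a trust) that carry their own control edges, which the four-node schema above does not spell out.

```python
from collections import deque

# Hypothetical control edges: each account or entity maps to its controllers.
CONTROLLED_BY = {
    "ACCT-1": ["person-A"],
    "ACCT-2": ["llc-X"],
    "llc-X": ["trust-T"],
    "trust-T": ["person-A"],
    "ACCT-3": ["person-B"],
}


def controllers_within(start: str, max_hops: int) -> set:
    """Every node reachable from start by 1..max_hops control edges."""
    seen = set()
    frontier = deque([(start, 0)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue  # do not expand past the hop limit
        for parent in CONTROLLED_BY.get(node, []):
            if parent not in seen:
                seen.add(parent)
                frontier.append((parent, hops + 1))
    return seen


def share_controller(a: str, b: str, max_hops: int = 3) -> bool:
    """True when the two accounts reach a common controller within max_hops."""
    return bool(controllers_within(a, max_hops) & controllers_within(b, max_hops))


print(share_controller("ACCT-1", "ACCT-2", max_hops=3))
print(share_controller("ACCT-1", "ACCT-2", max_hops=2))
```

With the toy data, ACCT-2 reaches person-A only on the third hop, so the link to ACCT-1 appears at a limit of three hops and disappears at two. The hop limit is therefore a detection parameter, not just a performance setting.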
The substantial-identity test from IRC §1091 is the harder modeling problem. Two securities are “substantially identical” if they represent essentially the same economic interest: same issuer, same class, same rights. ETFs from different issuers tracking the same index are commonly treated as NOT substantially identical, but the IRS has issued no definitive guidance on the point and aggressively similar ETFs remain a gray area. Bonds with different maturity or different issue dates are generally NOT substantially identical. The schema handles this with a :SUBSTANTIALLY_IDENTICAL relationship between :Security nodes, populated by reference data from the brokerage’s security master, with each link carrying a basis property documenting the substantial-identity determination.
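One way to picture the :SUBSTANTIALLY_IDENTICAL link is as a symmetric lookup that returns the documented basis for the determination. The sketch below is illustrative only: the security identifiers are placeholders rather than real CUSIPs, and the basis text stands in for whatever the security master actually records.

```python
# Hypothetical reference data keyed by an unordered pair of security IDs.
# Each entry stores the documented basis for the determination.
IDENTITY_LINKS = {
    frozenset({"SEC-001", "SEC-002"}): "placeholder: determination recorded in security master",
}


def substantially_identical(a: str, b: str):
    """Return the documented basis if a and b are linked, else None."""
    if a == b:
        return "same security"
    # frozenset makes the lookup order-independent, mirroring an undirected edge
    return IDENTITY_LINKS.get(frozenset({a, b}))


print(substantially_identical("SEC-001", "SEC-002"))
print(substantially_identical("SEC-002", "SEC-001"))
print(substantially_identical("SEC-001", "SEC-003"))
```

Returning the basis rather than a bare boolean keeps the reason for each link available to the reviewer, which is the point of carrying the property on the edge.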
Time-windowed cycle queries in Cypher
The canonical wash-sale candidate query joins a sell-at-loss trade with a buy-of-substantially-identical-security trade within the 30-day window, across accounts that share a control relationship.
// Loss sales paired with buys of the same or a substantially identical security.
// The *0..1 hop matches the same :Security node (zero hops) or a linked one (one hop).
MATCH (sell:Trade {direction: 'SELL'})-[:OF_SECURITY]->(sec:Security)
      -[:SUBSTANTIALLY_IDENTICAL*0..1]-(buy_sec:Security)
      <-[:OF_SECURITY]-(buy:Trade {direction: 'BUY'})
// realized_loss is assumed to be stored as a positive magnitude
WHERE sell.realized_loss > 0
  // Bidirectional window: buy within 30 days before or after the sale
  AND sell.trade_date <= buy.trade_date + duration('P30D')
  AND buy.trade_date <= sell.trade_date + duration('P30D')
MATCH (sell)-[:IN_ACCOUNT]->(sell_acct:Account)
MATCH (buy)-[:IN_ACCOUNT]->(buy_acct:Account)
WHERE sell_acct = buy_acct
   OR EXISTS {
        MATCH (sell_acct)-[:CONTROLLED_BY*1..3]->(ctrl)<-[:CONTROLLED_BY*1..3]-(buy_acct)
        WHERE NOT sell_acct = buy_acct
      }
RETURN sell.trade_uid, buy.trade_uid, sec.cusip, sec.ticker,
       sell.trade_date, buy.trade_date,
       sell.realized_loss AS disallowed_loss_candidate,
       sell_acct.account_number, buy_acct.account_number,
       CASE WHEN sell_acct = buy_acct THEN 'same_account'
            ELSE 'related_account_via_control' END AS pattern_type
ORDER BY disallowed_loss_candidate DESC
LIMIT 100;
Three clauses deserve attention. The bidirectional 30-day check matters because IRC §1091 covers both pre-sale and post-sale windows — a buy 25 days before the sale triggers the rule just as a buy 25 days after. The variable-length CONTROLLED_BY*1..3 traversal allows up to three hops of control linkage; tighter limits (1..2) are faster but miss cross-account patterns that go through a grantor-trust-of-LLC structure; looser limits (1..5) catch more patterns but produce more false positives at the boundary. The realized_loss > 0 filter restricts to candidate wash-sale events; gains and break-even sales aren’t subject to §1091.
The output column is deliberately named disallowed_loss_candidate: the query queues candidates, and the practitioner verifies each one against the substantial-identity test and the §1091(d) basis-adjustment mechanic before making the determination.
Layering detection
The FATF (2018) Concealment of Beneficial Ownership typology characterizes layering as the movement of funds through a chain of accounts and asset classes specifically to obscure the chain’s beginning from its end. The hallmark patterns: chain length of 3+ hops, short time windows between hops (hours to days, not weeks), and asset-class transitions (cash → securities → cash → digital-asset → cash) that don’t serve any economic purpose other than complexity for its own sake.
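The graph query for layering detection matches a multi-hop path on the transaction graph with a time-window predicate on each hop. The version here fixes the chain at four hops; shorter or longer chains need the pattern adjusted accordingly: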
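MATCH path = (origin:Account)-[t1:TRANSFERRED_TO]->(:Account)
             -[t2:TRANSFERRED_TO]->(:Account)
             -[t3:TRANSFERRED_TO]->(:Account)
             -[t4:TRANSFERRED_TO]->(destination:Account)
// Each hop must follow the previous one, in order, within 5 days.
// Cypher cannot subtract one instant from another, so compare against instant + duration.
WHERE t1.transferred_at <= t2.transferred_at
  AND t2.transferred_at <= t1.transferred_at + duration('P5D')
  AND t2.transferred_at <= t3.transferred_at
  AND t3.transferred_at <= t2.transferred_at + duration('P5D')
  AND t3.transferred_at <= t4.transferred_at
  AND t4.transferred_at <= t3.transferred_at + duration('P5D')
  AND origin <> destination
  AND NOT EXISTS { MATCH (origin)-[:OWNS|CONTROLLED_BY]-(destination) }
RETURN path,
       // Sum the hop amounts (the accumulator must be added on each step), then average
       reduce(amt = 0.0, r IN relationships(path) | amt + r.amount_usd) / 4 AS avg_hop_amount,
       duration.inDays(t1.transferred_at, t4.transferred_at).days AS chain_duration_days
ORDER BY avg_hop_amount DESC
LIMIT 50;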
The 5-day inter-hop window is calibrated to the FATF typology’s “rapid layering” pattern; longer windows are typical of legitimate trade chains and are filtered out. The NOT EXISTS clause excludes legitimate intra-organization transfers where the origin and destination are linked by common ownership — those have economic purpose and shouldn’t be flagged.
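The per-hop timing rule is easy to state independently of the graph. The following Python sketch checks a list of hop timestamps against it; the function name, the sample timestamps, and the default thresholds are illustrative assumptions, not calibrated values.

```python
from datetime import datetime, timedelta


def is_rapid_chain(hop_times, max_gap=timedelta(days=5), min_hops=3):
    """True when there are enough hops, in chronological order,
    and every consecutive pair is separated by at most max_gap."""
    if len(hop_times) < min_hops:
        return False
    for earlier, later in zip(hop_times, hop_times[1:]):
        gap = later - earlier
        if gap < timedelta(0) or gap > max_gap:
            return False
    return True


# Hypothetical transfer timestamps for two four-hop chains.
fast_chain = [datetime(2024, 1, 2), datetime(2024, 1, 4),
              datetime(2024, 1, 8), datetime(2024, 1, 9)]
slow_chain = [datetime(2024, 1, 2), datetime(2024, 1, 4),
              datetime(2024, 1, 12), datetime(2024, 1, 13)]

print(is_rapid_chain(fast_chain))
print(is_rapid_chain(slow_chain))
print(is_rapid_chain(slow_chain, max_gap=timedelta(days=10)))
```

The third call shows the calibration effect in miniature: the chain rejected at a 5-day gap is accepted at a 10-day gap, which is how looser windows start to admit chains that may be legitimate.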
Structuring detection
31 USC §5324 prohibits structuring transactions to evade the Bank Secrecy Act’s reporting requirements. The classic pattern: multiple cash deposits just under the $10,000 currency-transaction-report threshold, designed to avoid triggering the report. The graph-based detection projects deposit transactions and clusters them by sender, account, and time-window.
// Identify deposit clustering below the $10K CTR threshold
MATCH (depositor:Person)-[:DEPOSITED]->(d:Deposit)-[:INTO]->(acct:Account)
WHERE d.amount_usd >= 9000 AND d.amount_usd < 10000
// min(), max(), and sum() are aggregating functions, so compute them here,
// grouped by depositor and account, rather than on a collected list
WITH depositor, acct,
     count(d) AS deposit_count,
     min(d.deposited_at) AS first_deposit,
     max(d.deposited_at) AS last_deposit,
     sum(d.amount_usd) AS total
WHERE deposit_count >= 3
// Span from the first to the last qualifying deposit for the pair
WITH depositor, acct, deposit_count, total,
     duration.inDays(first_deposit, last_deposit).days AS window_days
WHERE window_days <= 30
RETURN depositor.full_name, acct.account_number,
       deposit_count,
       window_days,
       total AS aggregate_amount
ORDER BY aggregate_amount DESC
LIMIT 50;
Three or more sub-CTR-threshold deposits within a 30-day window aggregating to a substantial amount is the canonical structuring pattern. The graph query identifies candidates; the BSA officer reviews each candidate against the institution’s structuring-determination criteria. Filing a Suspicious Activity Report is a determination the BSA officer makes, not the algorithm.
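The Cypher query measures the span from the first to the last qualifying deposit for a depositor and account, so a 30-day burst followed by one stray deposit months later falls outside the filter. A sliding-window variant avoids that. The Python sketch below is one hedged way to express it; the function name, thresholds, and sample deposits are all hypothetical.

```python
from datetime import date


def find_clusters(deposits, low=9_000.0, high=10_000.0, window_days=30, min_count=3):
    """deposits: iterable of (date, amount) for one depositor and account.
    Returns (window_start, count, total) for each window, anchored at a
    qualifying deposit, that holds at least min_count qualifying deposits."""
    band = sorted((d, amt) for d, amt in deposits if low <= amt < high)
    clusters = []
    for i, (start, _) in enumerate(band):
        window = [(d, amt) for d, amt in band[i:] if (d - start).days <= window_days]
        if len(window) >= min_count:
            clusters.append((start, len(window), sum(amt for _, amt in window)))
    return clusters


# Hypothetical deposits: a three-deposit burst in January, one outlier in June,
# and one deposit above the band that is ignored.
SAMPLE = [
    (date(2024, 1, 3), 9_500.0),
    (date(2024, 1, 10), 9_800.0),
    (date(2024, 1, 15), 12_000.0),
    (date(2024, 1, 24), 9_200.0),
    (date(2024, 6, 1), 9_900.0),
]

print(find_clusters(SAMPLE))
```

On the sample data, the first-to-last span is about five months, yet the January burst still surfaces as a cluster. As with the query, the output is a candidate list for the BSA officer, not a determination.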
Performance constraints
Variable-length paths on large trade graphs are the canonical Cypher performance hotspot. Three patterns earn their weight on production graphs.
PROFILE every variable-length query. Neo4j’s PROFILE keyword runs the query and returns the operator-by-operator execution plan with database-hit counts. Variable-length-path queries that produce 10x more hits than necessary are almost always missing a selective predicate that should be applied earlier in the plan.
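Narrow with selective predicates first. Neo4j’s planner, not the textual order of predicates within a WHERE clause, decides evaluation order; what the query author controls is where each predicate is attached. Attach the most selective predicates to the earliest MATCH so they prune rows before any expansion. For wash-sale detection, the sell.realized_loss > 0 predicate is highly selective (loss sales are a small fraction of all sales), so it should constrain the result set before the variable-length CONTROLLED_BY*1..3 traversal expands it.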
Use EXISTS { } subqueries to short-circuit. Cypher in Neo4j 4.0 and later supports the EXISTS { ... } syntax, which returns true as soon as the first match is found, avoiding the full expansion that a MATCH + RETURN pattern would compute. For the cross-account control check, EXISTS is materially faster than the equivalent MATCH because the algorithm only needs to know whether a control path exists, not enumerate all of them.
The forthcoming Production-Scale DD Graph Operations article in this sub-series covers performance tuning in depth, including the index strategy that makes variable-length-path queries performant at 10M+ relationships.
Worked example
The companion repository ships a synthetic 200-account / 50-security / 100,000-trade graph with 8 seeded wash-sale patterns (4 same-account at varying loss magnitudes, 4 cross-account-via-related-party at varying control-path lengths) and 3 seeded layering chains (varying chain lengths from 3 to 5 hops with inter-hop windows under 5 days). The trade graph is constructed with realistic patterns — normal trading activity provides background noise, the seeded patterns are deliberately placed against that noise to test detection robustness.
The wash-sale query recovers all 8 seeded patterns. The cross-account patterns are correctly identified through the variable-length CONTROLLED_BY*1..3 traversal, with the pattern_type field correctly distinguishing same-account from related-account-via-control. The query also flags 4 plausible false positives: trades that meet all the structural criteria but where document review reveals legitimate reasons (deliberate portfolio rebalancing that just happens to cross the wash-sale window without intent, or two trades with the same security under different beneficial owners who happen to share a controlled entity). The article walks the analyst-resolution of each false positive.
The layering query recovers all 3 seeded chains. No layering-pattern false positives surface against the synthetic graph at the 5-day inter-hop window, but at looser windows (10-day, 14-day) the query begins to flag legitimate trade-and-cash-management chains that share the structural pattern. The article walks the threshold-calibration trade-off.
Operational integration
Three workflow artifacts attach to graph-based wash-sale and layering detection.
Wash-sale candidate review. The query output is a queue of disallowed-loss candidates; the tax professional verifies each candidate against the substantial-identity test, applies the §1091(d) basis-adjustment mechanic, and produces the §6662 accuracy-related-penalty risk assessment. The query queues; the credentialed practitioner determines.
Layering candidate review. The query output is a queue of potential layering chains; the BSA officer or AML investigator verifies each candidate against the institution’s typology criteria and the customer’s known business purpose. The credentialed practitioner makes the SAR-filing determination; the query supports the determination but does not make it.
Structuring candidate review. Same pattern. The query identifies sub-CTR-threshold deposit clusters; the BSA officer evaluates against the institution’s structuring criteria and the customer’s known activity pattern; the SAR-filing decision is the officer’s.
These framings need to be in the operational-integration section of any deployment of the methodology — not buried in a closing disclaimer. The methodology accelerates the investigator’s work by surfacing candidates that pattern-fit the typologies; the final determination remains with the credentialed professional whose license is on the line.
The forthcoming articles in this sub-series — Production-Scale DD Graph Operations and Architecture Choices for the Mid-Size DD Practice — cover the deployment layer for these queries when the trade-graph crosses the operational thresholds where naive deployment hits performance walls.
References
Tax authority:
- IRC §1091 — Loss from Wash Sales of Stock or Securities.
- Treas. Reg. §1.1091-1 — Losses from wash sales of stock or securities.
- IRS Rev. Rul. 2008-5 — Wash sale rules and IRA accounts.
AML / BSA framework:
- FATF (2018). Concealment of Beneficial Ownership. Financial Action Task Force.
- FATF (2009). Money Laundering through the Football Sector. Financial Action Task Force.
- 31 USC §5324 — Structuring transactions to evade reporting requirement prohibited.
- FinCEN. Suspicious Activity Report Filing Manual.
Industry self-regulation:
- FINRA Rule 4530 — Reporting Requirements.
Temporal-graph theory:
- Holme, P., & Saramäki, J. (2012). “Temporal Networks.” Physics Reports, 519(3), 97-125.
Implementation reference:
- Neo4j Cypher Manual: variable-length paths, EXISTS subqueries, PROFILE query planning.
Reproducible code: Companion repository at github.com/noahrgreen/dd-tech-lab-companion ships the wash-sale / layering / structuring detection query pack: trade-account-security graph schema, the time-windowed cycle queries with bidirectional checking, the control-graph projection scripts, and the synthetic 200-account / 100,000-trade test graph with seeded pattern variants for benchmark testing.
