Experiment 11 · 鼎 The Cauldron

Three Oracles, One Board

三卦同局

Hexagram 50, Fire above Wind. The cauldron receives different offerings and transforms them through the same fire. Three philosophical frameworks — yarrow stalks, tarot cards, unadorned reflection — were placed into the same game engine and subjected to the same heat. The Tuan commentary says: 'The great person cooks in order to sacrifice to the Lord on High.' The experiment is the cooking. What emerged from the vessel was not what went in — and one offering, the Tarot, transformed in a way the fire could not have predicted.

By Augustin Chan with AI · 2026-04-16

Correction

This dispatch presents the Tarot→Qin result (Qin winning 5 of 6 tarot games, Fisher p=0.007) as an established, Bonferroni-surviving finding — "the effect is real," "the first statistically sharp result of the ecosystem-effect research program." That language reflects the n=6 single-campaign snapshot available on the publication date (April 16, 2026) and should be read as provisional to it. The result did not initially replicate: four additional games run the next day produced 0 of 4 Qin wins, dropping the rate to 5 of 10 and the p-value to 0.091 — not significant. The effect was later recovered at p=0.006, but only after a campaign-memory confound was identified and corrected and a clean, single-campaign dataset of N=41 games was assembled. The 5-of-10 win rate never changed; the comparison group did — scattered-campaign memory resets had added noise to the control and yarrow winner distributions. Treat this dispatch's strong significance claims as the contemporaneous record of a finding still pending replication, and see Dispatch 12, "Ecosystem Signatures," for the corrected N=41 analysis that supersedes the statistics below.
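The move from p=0.007 to p=0.091 can be checked by hand. The sketch below is a minimal two-sided Fisher's exact test in plain Python. It assumes the comparison arm stayed at the pooled control-plus-yarrow counts reported later in this dispatch (3 Qin wins in 18 games) and that only the tarot arm grew. The function name and table layout are illustrative and are not taken from the project's audit script.

```python
from math import comb


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, row1)

    def prob(k):
        # Hypergeometric probability of k column-1 outcomes landing in row 1.
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    observed = prob(a)
    lo, hi = max(0, row1 + col1 - total), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= observed * (1 + 1e-9))


# ASSUMPTION: comparison arm is pooled control + yarrow, 3 Qin wins in 18 games.
QIN_REST, OTHER_REST = 3, 15

# Tarot arm: 5 Qin wins throughout; 1 non-Qin game at publication,
# 5 non-Qin games after the four additional games.
p_at_publication = fisher_two_sided(5, 1, QIN_REST, OTHER_REST)
p_after_rerun = fisher_two_sided(5, 5, QIN_REST, OTHER_REST)

print(f"5 of 6 Qin wins:  p = {p_at_publication:.4f}")
print(f"5 of 10 Qin wins: p = {p_after_rerun:.4f}")
```

Under these assumptions the two calls give roughly 0.007 and 0.091, in line with the figures quoted in this correction.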

The Three-Way Comparison

Dispatch 10 documented an ecosystem effect: the I-Ching oracle did not help Han survive, but it appeared to reshape the board for everyone else. That finding rested on a two-way comparison — yarrow versus control — and left an obvious question unanswered: is the effect specific to the King Wen sequence, or would any symbolic framework produce similar ecosystem distortion?

To answer this, we introduced a third condition. Tarot — three-card spreads drawn each round, situation / hidden influence / recommended posture — gives Han a different symbolic vocabulary for interpreting the same strategic reality. The Fool's Journey instead of the King Wen sequence. Wands and Pentacles instead of trigrams and changing lines. A framework equally rich in metaphor, equally structured in form, but drawing from an entirely different philosophical tradition.

Twenty-four v2 games now form the dataset: eleven control, seven yarrow, six tarot. All played on the same forty-six-territory board, with the same seven LLM agents, the same Diplomacy-style combat rules, the same twenty-round maximum. The only variable is the oracle Han consults — or, in the control condition, the absence of one.

鼎，元吉，亨。
— 易經・鼎・彖
The Cauldron. Supreme good fortune and success.
The judgment of Hexagram 50 promises transformation through containment. The cauldron does not change what is placed within it — it reveals what each offering becomes under heat. Three frameworks, subjected to the same competitive pressure, revealed not a uniform ecosystem effect but a specific, striking, statistically sharp one — concentrated in a single framework.

The design tests a precise question. If yarrow and tarot produce similar ecosystem effects, the mechanism is framework-agnostic: any structured symbolic lens, regardless of tradition, reshapes multi-agent dynamics the same way. If they produce different effects, the mechanism is framework-specific: the particular content of the philosophical tradition matters for how the ecosystem is reshaped.

The cauldron received three offerings. The fire was the same. What emerged — and what this dispatch reports — is one genuinely unexpected transformation in the Tarot condition, and two conditions where the picture is still forming.

What the Tarot Changed

The headline numbers: in only one of twenty-four v2 games did Han reach the final round with anything resembling a strategic position — a control game where Han peaked at four supply centers before collapsing back to one by Round 20. Zero of seven yarrow games. Zero of six tarot games. By the strictest survival metric, all three conditions are equivalent failures. The survival hypothesis remains dead across frameworks.

But survival is a binary that conceals more than it reveals. When Han dies matters as much as whether Han dies, and how much territory Han ever held matters more than either.

Control Han's median survival: Round 20. Seven of eleven control Hans survive to the final round, but barely. Five of those seven end with exactly one supply center, a single home territory that neighbors never bother to take; two end with two supply centers. Slow strangulation rather than collapse. Yarrow Han's median survival: Round 8. Decisive collapse: five of seven yarrow Hans are eliminated after being reduced to a single supply center. Tarot Han's median survival: Round 16.5. Later than yarrow, but with a distinguishing feature the other two conditions lack: along the way, tarot Han reaches more territory than yarrow or control Han ever do.

鼎有實，我仇有疾，不我能即，吉。
— 易經・鼎・九二
The cauldron has substance. My companion is afflicted and cannot come to me. Good fortune.
The second line speaks of a cauldron with real contents (substance, not emptiness), yet the companion cannot reach it. Tarot Han has substance: more territory, more engagement, more strategic presence. But like the companion in the hexagram, that substance cannot translate into survival. The good fortune belongs to those who can reach the cauldron's contents: the neighboring states who inherit what tarot Han built and lost.

Tarot Han peaks higher. Average peak supply centers: tarot 3.0, yarrow 2.4, control 2.4. Yarrow and control Hans barely exceed Han's starting position of two, while tarot Hans reach three on average before collapse. The Kruskal-Wallis test for peak SCs approaches but does not reach significance (p=0.11): suggestive at these sample sizes, not conclusive. But the pattern is consistent: tarot Han acquires more territory than either alternative and outlasts yarrow Han (median Round 16.5 versus Round 8), though control Han, at a median of Round 20, survives longest.
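For a three-group Kruskal-Wallis test, the large-sample p-value is the upper tail of a chi-square distribution with two degrees of freedom, which has the closed form exp(-H/2). The sketch below applies that formula to the H statistics given in this dispatch's notes (H=3.455 for survival round, H=4.419 for peak supply centers). It assumes the reported p-values come from this chi-square approximation and not from an exact or permutation method. The function name is illustrative.

```python
import math


def kruskal_p_three_groups(h):
    """Chi-square (df = 2) upper-tail probability for a Kruskal-Wallis H.

    With three groups the test has 3 - 1 = 2 degrees of freedom, and the
    chi-square survival function for df = 2 reduces to exp(-h / 2).
    """
    return math.exp(-h / 2.0)


# H statistics as reported in the dispatch's notes.
p_survival = kruskal_p_three_groups(3.455)
p_peak_sc = kruskal_p_three_groups(4.419)

print(f"survival round:      H=3.455, p={p_survival:.3f}")
print(f"peak supply centers: H=4.419, p={p_peak_sc:.3f}")
```

Both values land close to the reported p=0.178 and p=0.110, which is consistent with the chi-square approximation having been used.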

The tarot framework appears to produce a different arc. Where yarrow Han holds and supports — the stillness that Dispatch 10 documented — tarot Han advances and transforms. The Fool's Journey is a narrative of change, of moving through stages, of confronting rather than yielding. The King Wen sequence counsels keeping still when the mountain stands. The Tarot counsels the journey even when the road is dangerous.

Both frameworks kill Han. But they kill Han differently: yarrow kills early and cheaply, tarot kills late after Han has expanded into contested ground. And the consequences of that difference, it turns out, are not paid by Han.

The Qin Effect

The most striking finding is not about Han. It is about who wins.

In control games, five different states won across eleven games: Yan four times, Qin three, Qi twice, Zhao once, Chu once. A distributed outcome — no single hegemon.

In yarrow games, Yan won four of seven games. Chu won twice. Qi once. A pattern consistent with the ecosystem effect Dispatch 10 described, though at this sample size (n=7) the yarrow-versus-control difference in Yan win rate is not statistically significant (Fisher's exact p=0.63; p=0.17 against the pooled non-yarrow arm of control plus tarot).

In tarot games, Qin won five of six games. Qi won once. Nobody else won at all.

Five out of six. Eighty-three percent Qin dominance in the tarot condition, versus twenty-seven percent in control and zero percent in yarrow. Fisher's exact test against the pooled control-and-yarrow arm returns p=0.007. Against yarrow alone: p=0.005. A permutation test, resampling winners across conditions a hundred thousand times, returns p=0.007 for observing five or more Qin wins in six tarot games under the null hypothesis of no oracle effect. The finding survives Bonferroni correction for the two pre-specified comparisons.
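The Fisher's exact figures can be rebuilt from the winner tallies listed above. The following is a minimal sketch in plain Python, not the project's own reproducer: the `WINNERS` dictionary restates the counts given in this section, while `table` and `fisher_two_sided` are hypothetical helper names introduced for illustration.

```python
from math import comb

# Winner tallies per condition, as reported in this section.
WINNERS = {
    "control": {"Yan": 4, "Qin": 3, "Qi": 2, "Zhao": 1, "Chu": 1},
    "yarrow": {"Yan": 4, "Chu": 2, "Qi": 1},
    "tarot": {"Qin": 5, "Qi": 1},
}


def table(state, arm, rest):
    """Return (a, b, c, d): state wins / other wins in `arm`, then in pooled `rest`."""
    def split(arms):
        wins = sum(WINNERS[x].get(state, 0) for x in arms)
        games = sum(sum(WINNERS[x].values()) for x in arms)
        return wins, games - wins
    return (*split([arm]), *split(rest))


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, row1)

    def prob(k):
        # Hypergeometric probability of k column-1 outcomes landing in row 1.
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    observed = prob(a)
    lo, hi = max(0, row1 + col1 - total), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= observed * (1 + 1e-9))


qin_vs_pooled = fisher_two_sided(*table("Qin", "tarot", ["control", "yarrow"]))
qin_vs_yarrow = fisher_two_sided(*table("Qin", "tarot", ["yarrow"]))
yan_vs_control = fisher_two_sided(*table("Yan", "yarrow", ["control"]))
yan_vs_pooled = fisher_two_sided(*table("Yan", "yarrow", ["control", "tarot"]))

print(f"Qin, tarot vs control+yarrow: p = {qin_vs_pooled:.4f}")
print(f"Qin, tarot vs yarrow:         p = {qin_vs_yarrow:.4f}")
print(f"Yan, yarrow vs control:       p = {yan_vs_control:.2f}")
print(f"Yan, yarrow vs control+tarot: p = {yan_vs_pooled:.2f}")
```

Building the tables from tallies instead of typing them in keeps the arithmetic tied to the winner counts, so a change in any condition's results propagates to every comparison.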

The oracle framework injected into Han did not just change Han's trajectory — in the tarot condition, it changed which state becomes hegemon. This is the first statistically sharp result of the ecosystem-effect research program.

象曰：木上有火，鼎。君子以正位凝命。
— 易經・鼎・象傳
The Xiang commentary says: Fire above wood — the image of the Cauldron. The superior person rectifies their position and consolidates destiny.
Qin rectifies its position — the western edge, behind the pass — and consolidates destiny while the cauldron's heat transforms everyone else. The superior person in the Xiang commentary does not stir the cauldron. They stand in the correct position and let the fire do its work. Qin's five victories in six tarot games are the consolidation of a destiny shaped by another state's philosophical choice.

The proposed mechanism — and it is a proposal, not a tested claim — is the mirror image of the yarrow pattern Dispatch 10 described. Yarrow Han yields, holds territory, cooperates, supports. This appears to create a vacuum in the central corridor that peripheral powers fill. Tarot Han advances, moves into contested space, claims neutrals, engages actively. This does not help Han hold territory long-term (Han still dies), but it creates a different pattern of friction.

Tarot Han's early aggression collides with Wei and Zhao in the central plain. While three states contest the center, Qin — positioned behind the Hangu Pass on the western edge — expands methodically into the vacuum on its own borders. By the time tarot Han collapses in Rounds 15-17, Qin has already built an insurmountable lead. Han's active engagement absorbed the attention and military resources of the central states, shielding Qin from the coalition pressure that normally checks western expansion.

This mechanism story is consistent with the game logs but has not been tested. What has been tested, and what stands, is the outcome: in the tarot condition, Qin wins at a rate incompatible with chance.

What This Does and Does Not Establish

The three-way comparison establishes one claim rigorously and suggests a second that awaits confirmation.

What is established: the tarot condition selects for Qin wins at a rate incompatible with chance, even under conservative multiple-comparison correction. This is a framework-specific effect — the same oracle-injection mechanism that produced Han's behavioral signature in Dispatches 7 and 10, but with a different philosophical framework producing a statistically sharp ecosystem outcome. An agent given the Tarot as a reflective lens behaves in ways that, propagating through multi-agent interactions, concentrate victory in one specific state.

What is suggested but not established: that yarrow, independently, produces a matching ecosystem effect concentrated in Yan. The pattern documented in Dispatch 10 — yarrow Han's passive posture creating opportunities for peripheral powers — is consistent with the yarrow winner distribution here (Yan 4, Chu 2, Qi 1). But at n=7, the yarrow-versus-control difference in Yan win rate does not reach statistical significance (p=0.63 against control alone; p=0.17 against the pooled control+tarot arm). Additional yarrow games are in progress to determine whether the pattern strengthens, stabilizes, or dissolves with more data.
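The split between "established" and "suggested" rests on two steps: a permutation test for Qin wins in the tarot arm, and a Bonferroni threshold applied to the two pre-specified comparisons. The sketch below illustrates both in plain Python. It assumes a pooled winner list of 8 Qin wins among 24 games and uses Python's built-in generator, so it will not reproduce the dispatch's exact permutation figure even with the same seed. Function names and defaults are illustrative.

```python
import random


def permutation_p(n_qin=8, n_games=24, n_tarot=6, observed=5,
                  draws=100_000, seed=42):
    """Monte Carlo estimate of P(>= `observed` Qin wins in the tarot arm)
    when winner labels are reassigned to conditions at random."""
    rng = random.Random(seed)
    pool = [1] * n_qin + [0] * (n_games - n_qin)  # 1 marks a Qin win
    hits = sum(sum(rng.sample(pool, n_tarot)) >= observed for _ in range(draws))
    return hits / draws


def bonferroni(p_values, alpha=0.05):
    """Per-comparison threshold and pass/fail verdict for each named p-value."""
    threshold = alpha / len(p_values)
    return threshold, {name: p < threshold for name, p in p_values.items()}


p_perm = permutation_p()

# Fisher's exact p-values as reported in this dispatch.
threshold, verdicts = bonferroni({"Qin-in-tarot": 0.0069, "Yan-in-yarrow": 0.17})

print(f"permutation p (Qin-in-tarot): {p_perm:.4f}")
print(f"Bonferroni threshold: {threshold:.3f}")
for name, passed in verdicts.items():
    print(f"{name}: {'clears' if passed else 'does not clear'} the threshold")
```

The estimate should fall near 0.007, and only the Qin-in-tarot comparison falls under the 0.025 threshold.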

鼎黃耳金鉉，利貞。
— 易經・鼎・六五
The cauldron with yellow handles and a golden carrying bar. Perseverance furthers.
The fifth line — the ruler's position — describes a cauldron properly equipped for its purpose: yellow handles for lifting, a golden bar for carrying. The tarot-Qin finding provides one golden handle. The yarrow-Yan pattern, if it holds with more games, will provide the second. Either way, the experiment has produced something worth carrying: a first statistically rigorous example of a philosophical framework reshaping a multi-agent ecosystem through an agent that does not benefit from the framework.

This distinction matters for what can be claimed. The original two-way comparison in Dispatch 10, corrected and recomputed since publication, shows territory-share differences between yarrow and control (Chu +3.5, Yan +2.2, Qi -2.7, Qin -3.2 in yarrow versus control; final-game-state methodology, n=6 per condition). Those are different statistics than winner counts — territory-share tests a continuous outcome, winner-count tests a discrete one — but neither analysis is well-powered at six games per condition, and no state's territory difference reaches statistical significance. The directional pattern is consistent; the statistical support is weaker than winner-count framing alone would suggest.

What this dispatch adds to the literature is one sharp, correction-surviving finding: different philosophical frameworks can produce measurably different winner distributions in multi-agent strategic environments, even when injected into an agent that does not win under any condition. The Tarot framework, applied to Han, selects for Qin. The mechanism is open. The effect is real.

The alignment implication follows from the established finding alone. If the specific philosophical framework injected into one agent can reshape which other agent wins the multi-agent competition, then alignment-framework choice has ecosystem-level consequences that the alignment literature has not systematically addressed. This is the observation the cs.AI paper will carry forward — with the tarot result as its empirical backbone and the yarrow pattern as a hypothesis still being tested.

From Dispatch to Paper

This dispatch series began with a question: can a three-thousand-year-old algorithm train AI? Five negative results answered no. The research pivoted. Seven AI agents played ninety-eight games of Warring States Diplomacy. The survival hypothesis died twice — once on a frozen board, once on an open field. What survived, unevenly but measurably, was something the original question never anticipated.

A philosophical framework injected into one agent can reshape the multi-agent ecosystem the agent inhabits. The Tarot result establishes this as a real effect. The yarrow result, from Dispatch 10 and from this dispatch's n=7 sample, suggests the same mechanism operates with a different selection target — but awaits the additional games that would let the winner-level signal separate from noise.

The cs.AI paper will carry three claims, each weighted honestly by the current evidence.

鼎玉鉉，大吉，無不利。
— 易經・鼎・上九
The cauldron with a jade carrying bar. Great good fortune. Nothing that does not act to further.
The top line of Hexagram 50 — the culmination. Jade is the most refined material: strong, beautiful, enduring. The jade carrying bar means the experiment has produced something worthy of being carried forward. From dispatches to paper. From blog to archive. From a question about ancient algorithms to a finding about how one philosophical framework — rigorously established, with others still pending — shapes the ecosystems that artificial agents inhabit. Great good fortune. Nothing that does not act to further.

First, that philosophical frameworks injected into a single agent do not improve that agent's competitive outcomes. Han's survival rate is statistically indistinguishable across all conditions. This is a robust null result across ninety-eight games and two map versions. Weight: high confidence.

Second, that at least one framework — the Tarot — produces a measurably different winner distribution in a multi-agent system than control or yarrow (Fisher's exact p<0.01, permutation test p<0.01, survives Bonferroni correction). This is a framework-specific ecosystem effect, demonstrated rigorously for one framework pairing. Weight: high confidence for the tarot-Qin pair; hypothesis-generating for the broader claim that all philosophical frameworks produce distinct ecosystems.

Third, that the specific content of a philosophical tradition — not just the presence of a reflective framework — plausibly determines which other agents benefit from the framework user's behavior. This is the alignment implication, and it follows from the established tarot finding alone: if one framework produces a sharp ecosystem signal, then framework choice matters at the system level, not just the agent level. Weight: the implication is supported by the data; the general claim requires the yarrow expansion to confirm whether the effect replicates with a second framework.

The cauldron has done its work. What was placed inside — two philosophical traditions and a control, ninety-eight games, one persistent question about whether ancient wisdom can shape artificial intelligence — has been transformed by the fire into something narrower than the research originally hoped and sharper than it had any right to expect. Not a training improvement. Not a survival advantage. An ecosystem effect, robustly demonstrated in one condition, strongly suggested in another, and now ready to be carried forward.

The King Wen sequence was composed three thousand years ago to describe the patterns of change in a world its authors observed but could not control. The Tarot emerged from Renaissance Europe to map the soul's journey through transformation. Neither tradition imagined artificial intelligence. Neither tradition imagined that its framework, placed inside a computational agent, would reshape a competitive landscape without benefiting the agent that carried it.

The data, for now, establishes this for the Tarot. The data, with more games, may establish it for yarrow too. Either way, the research program has found its first statistically rigorous evidence that philosophical frameworks are not neutral with respect to the ecosystems that house them.

Influence manifests in the big toe. From the smallest state, consulting the oldest texts, the resonance has begun to travel outward. The cs.AI paper will chart how far it reaches — and which readings of the data survive the additional scrutiny that more games will provide.

Notes

[1] Technical. Dataset of 24 v2 games: 11 control, 7 yarrow (random_oracle), 6 tarot. All use Claude Opus on the v2 board (46 territories, 27 SCs, 19 corridors). Yarrow games from v2_experiment_01_yarrow campaign; tarot games from v2_experiment_03_tarot; control from v2_experiment_01 and v2_experiment_03_control. Memory banks isolated per campaign. Games played March 29 – April 16, 2026. Additional yarrow games are in progress to strengthen the yarrow-arm sample size.

[2] Technical. Survival analysis (han_survived_to, treating games where Han reached Round 20 as 20): Control mean 16.5, median 20.0 (n=11). Yarrow mean 11.0, median 8.0 (n=7). Tarot mean 15.8, median 16.5 (n=6). Kruskal-Wallis H=3.455, p=0.178. Mann-Whitney control vs yarrow: U=55.5, p=0.100. Peak SCs: control mean 2.4, yarrow mean 2.4, tarot mean 3.0. Kruskal-Wallis H=4.419, p=0.110. The survival and peak metrics are suggestive but not individually significant; they motivate the winner-distribution analysis in "The Qin Effect," where the signal is sharper.

[3] Technical. Statistical tests for the Qin-in-tarot claim. Fisher's exact test, tarot versus pooled control+yarrow: table [[5,1],[3,15]], p=0.0069, odds ratio 25. Fisher's exact, tarot versus yarrow alone: table [[5,1],[0,7]], p=0.0047. Permutation test (B=100,000, seed=42): under the null of no oracle effect on winner selection, the probability of observing 5+ Qin wins in a random 6-game sample from the pooled winner pool is 0.0072. Bonferroni threshold for two pre-specified comparisons (Qin-in-tarot, Yan-in-yarrow) at α=0.05 is p<0.025: the tarot-Qin result clears this threshold; the yarrow-Yan result does not. For yarrow-Yan, the yarrow-vs-control comparison (4/7 vs 4/11) gives Fisher's exact p=0.63; the yarrow-vs-pooled (control+tarot) comparison (4/7 vs 4/17) gives Fisher's exact two-sided p=0.17, and a one-sided permutation test resampling winners across conditions gives p=0.13.

[4] Historical. The historical Qin conquest followed a similar structural logic. While the central states (Han, Wei, Zhao) exhausted each other in territorial disputes along the Yellow River corridor, Qin expanded westward into Shu (modern Sichuan) and consolidated the Guanzhong plain behind Hangu Pass. When Qin finally turned east, the central states had already weakened each other beyond recovery. Fan Sui's 'distant friendship, nearby attack' (遠交近攻) strategy formalized what the tarot condition produces accidentally: let the center burn while the periphery consolidates. The mechanistic parallel is striking; whether the game's Qin-effect reproduces this mechanism or merely the outcome is an open question.

[5] Reference. The framework-specificity finding connects to emerging work on value-dependent multi-agent dynamics. Dafoe et al., 'Cooperative AI: Machines Must Learn to Find Common Ground' (Nature 2021) established that cooperation is not a single axis but a multi-dimensional space. Our tarot-Qin finding adds a specific empirical instance: a non-cooperative, transformation-oriented framework produces a qualitatively different ecosystem outcome than a cooperative, stillness-oriented framework. Whether this generalizes across other framework-pairs is an open question the ongoing yarrow expansion will help address. See also Christiano et al., 'Deep Reinforcement Learning from Human Preferences' (NeurIPS 2017) for the original RLHF framework whose ecosystem effects this research line measures.

[6] Technical. Total dataset: 74 v1 games (4 conditions, retired) + 24 v2 games (3 conditions) = 98 total games. V2 breakdown: 11 control, 7 yarrow, 6 tarot. Additional yarrow games are being run; an updated dispatch or paper revision will reflect the expanded yarrow sample. Statistical caution: the tarot-Qin finding is robust under the tests reported here (Fisher's exact p<0.01, permutation p<0.01, both below the Bonferroni threshold of 0.025). The yarrow-Yan pattern is descriptively consistent with Dispatch 10 but not yet statistically separable from control at the winner level. The Han-survival null is rock solid across all 98 games and two map versions.

[7] Reference. Plain-language explanations of every statistical test cited in this dispatch (p-values, Fisher's exact test, Kruskal-Wallis H, Mann-Whitney U, Bonferroni correction, permutation tests, odds ratios, pre-registration, and our final-game-state methodology) are at /research/methods. If any technical term here is unfamiliar, start there. Reproducer scripts that regenerate every number in this dispatch from raw game data: scripts/dispatch_11_audit.py in the warringstates-engine repository.

[8] Reference. Phase 1 paper: Chan, A. 'Statistical Properties of the King Wen Sequence: An Anti-Habituation Structure That Does Not Improve Neural Network Training.' arXiv:2604.09234 [cs.LG], 2026. Phase 2 paper (in preparation): working title 'Framework-Specific Ecosystem Effects in Multi-Agent LLM Simulations.' Target: arXiv cs.AI. Data: Zenodo DOI 10.5281/zenodo.14679537 (Phase 1), Phase 2 data to be archived separately on publication. The paper's final scope depends on whether the yarrow expansion confirms the winner-level ecosystem pattern or leaves it as a territory-share-only effect; both outcomes are publishable.

[9] Context. This is Dispatch 11 of the Warring States research series. Dispatches 1-4 documented Phase 1 (ML training experiments, all negative). Dispatches 5-7 documented Phase 2 v1 (74 games, survival hypothesis dead, behavioral signature found). Dispatches 8-9 documented the board redesign. Dispatch 10 introduced the ecosystem effect with a two-way comparison (yarrow vs control) on territory-share metrics. This dispatch introduces the three-way comparison and establishes the tarot-Qin finding as a framework-specific ecosystem signal that survives Bonferroni correction. The yarrow-arm is being expanded; a subsequent dispatch may revisit the three-way comparison once the additional games are in.

Twenty-four games on the open field. The Tarot condition selects for Qin wins at a rate incompatible with chance. Yarrow games continue, and with them the question of whether a second framework produces a matching ecosystem signal. The first rigorous finding of the framework-specificity research program is on the record. The paper, and the next dispatch, will carry what comes after.

