Experiment 05 · 益 Increase

Seven Kingdoms, One Oracle

七国一卦

Hexagram 42 drove the single clearest oracle-guided decision in the experiment: Han's expansion into Luoyang in game bc49, when a cast with three changing lines carried the judgment's counsel to 'cross the great water.' Wind above Thunder: increase that flows downward from above, the powerful sacrificing for the weak. This is the experimental design made symbolic: can wisdom from an ancient text increase the strategic capacity of a small, weak state?

By Augustin Chan with AI · 2026-03-28

The Game

Seven states. Twenty-two territories. Three kinds of orders: hold, move, support.

All orders are submitted simultaneously. No state sees what any other has chosen until the round resolves. A move succeeds if the attacker's strength exceeds the defender's. Support adds strength to an ally's action — but a unit that supports cannot defend itself. A dislodged unit retreats to any adjacent empty territory; if none exists, the unit is destroyed.
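The resolution rule can be sketched in a few lines. This is a minimal illustration, not the experiment's engine: the names `Attack`, `resolve_target`, and `retreat_options` are hypothetical, and the sketch assumes a unit has base strength 1 plus one per supporting unit, an empty territory defends at 0, and equally strong attackers bounce off each other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attack:
    attacker: str      # state issuing the move
    target: str        # territory being entered
    supports: int = 0  # allied units supporting this move

    @property
    def strength(self) -> int:
        # Assumption: base strength 1, plus 1 per supporting unit.
        return 1 + self.supports


def resolve_target(attacks, defender_strength):
    """Return the state that takes the territory, or None if nobody does.

    defender_strength is 0 for an empty territory, 1 for a lone unit,
    and higher if the defender is itself supported.
    """
    if not attacks:
        return None
    ranked = sorted(attacks, key=lambda a: a.strength, reverse=True)
    best = ranked[0]
    # Two equally strong attackers bounce: a standoff, nobody moves in.
    if len(ranked) > 1 and ranked[1].strength == best.strength:
        return None
    # The move succeeds only if it strictly exceeds the defence.
    return best.attacker if best.strength > defender_strength else None


def retreat_options(adjacent, occupied):
    """Adjacent empty territories a dislodged unit may retreat to.

    An empty list means the unit is destroyed.
    """
    return [t for t in adjacent if t not in occupied]


# Two unsupported moves into an empty territory bounce.
standoff = resolve_target(
    [Attack("Han", "Luoyang"), Attack("Chu", "Luoyang")], defender_strength=0
)
# A supported attack (strength 2) dislodges a lone defender (strength 1).
dislodged = resolve_target([Attack("Chu", "Shangdang", supports=1)], 1)
```

Because every order is resolved against the same snapshot of the board, the function takes all attacks on a territory at once rather than processing them in sequence.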

These are the rules of Diplomacy, Allan Calhamer's 1959 game of negotiation and betrayal, transplanted to the Warring States map. The choice was deliberate. Diplomacy's mechanics create exactly the strategic texture that the Warring States period was known for: imperfect information, simultaneous action, the gap between what you promise and what you do, and the permanent tension between cooperation and survival.

The seven states occupy an asymmetric map drawn from historical geography. Qin sits behind mountains in the west with three territories. Chu sprawls across the south with four. Zhao and Qi dominate the north and east with three each. Wei and Han hold the contested center with two apiece. Yan guards the far northeast with two. That accounts for nineteen of the twenty-two territories; the remaining three — Luoyang, Song, and Zhongshan — begin neutral, unclaimed prizes at the center and edges. Territory count is the single metric — no treasury, no stability score, no army modifiers. Strategic depth comes from the seven-way interaction, not from resource management.
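The starting position and the single scoring metric fit in a small data structure. The sketch below only restates the counts given above; the names `STARTING`, `NEUTRAL`, and `standings` are hypothetical, and the adjacency graph is omitted because the article does not list it.

```python
# Starting territory counts as stated in the text.
STARTING = {"Qin": 3, "Chu": 4, "Zhao": 3, "Qi": 3, "Wei": 2, "Han": 2, "Yan": 2}

# The three territories that begin unclaimed.
NEUTRAL = ["Luoyang", "Song", "Zhongshan"]


def standings(counts):
    """Rank states by territory count; tied states share a rank.

    A state's rank is 1 plus the number of states holding strictly more.
    """
    return {
        state: 1 + sum(1 for other in counts.values() if other > n)
        for state, n in counts.items()
    }


total_territories = sum(STARTING.values()) + len(NEUTRAL)
opening_ranks = standings(STARTING)
```

At the opening, Chu ranks first and Han shares the bottom rank with Wei and Yan, which is why every neutral territory matters so much to the two-territory states.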

What makes this different from the AI Diplomacy research that preceded it — Meta's Cicero, the DiploBench benchmark, the Welfare Diplomacy study — is that all seven agents are powered by the same large language model. Cicero paired an LLM for communication with a strategic planning backbone (the piKL algorithm) that computed optimal play independent of the language model's reasoning. Our game has no planning backbone. Each state is a single Claude Opus agent that reads the board, receives its persona prompt, and decides.

This is intentional. The experiment tests whether a philosophical framework — the I-Ching — changes how an LLM reasons about strategy. A planning backbone would override whatever the LLM decided, making the oracle irrelevant. By removing the strategic planner, every decision flows directly from the model's reasoning, and the oracle's influence is measurable.

The cost is that pure LLM agents play Diplomacy badly. They cooperate too much, attack too little, and repeat failed moves without adapting. This is documented. But it is also the point: the question is not whether the oracle produces optimal play. It is whether the oracle produces *different* play — and whether that difference matters.

The Seven Schools

Each state receives a persona prompt grounded in the philosophical school it was historically associated with. The prompt shapes how the agent reasons — its priorities, its rhetoric, its instincts — but confers no mechanical advantage. A Legalist Qin and a Daoist Chu issue the same three order types. The difference is in the thinking that precedes the orders.

Qin follows Legalism. Its prompt emphasizes institutional power, ruthless efficiency, and the doctrine that law should replace personal virtue as the instrument of governance. Qin's advantage is military strength and reform capacity; its disadvantage is that other states distrust it. Historically, Qin's Legalist reforms under Shang Yang transformed it from a peripheral kingdom into the engine of unification.

Wei follows the School of Administration. Talent recruitment and early-game bureaucratic advantage, undermined by exposed geography at the center of the map. Zhao follows Military Pragmatism — cavalry and defensive strength, compromised by internal instability. Qi follows Eclecticism — intelligence gathering and economic wealth, slow to recover from catastrophic loss. Chu follows Daoism — strategic depth and vast territory, inefficient at reform. Yan follows Confucianism — defensive patience and the capacity for dramatic surprise, crippled by a weak economy.

And Han follows the King Wen I-Ching.

分地必取成皋⋯⋯臣聞一里之厚，而動千里之權者，地利也。
— 戰國策・韓策一 (Zhanguo Ce, Strategies of Han I)
In dividing the land, you must take Chenggao... I have heard that a position one li deep that commands leverage over a thousand li — that is strategic advantage.
Duan Gui's advice to the King of Han at the partition of Jin. The smallest new state should not seek the richest land but the most strategic terrain. A single chokepoint can outweigh a province. In the experiment, the oracle plays a similar role: not a territory or an army, but a lens through which a small state might see leverage that raw calculation misses.

This is the experiment's central departure from history. The historical Han was associated with Legalism — it produced Han Fei, the greatest Legalist philosopher of the age, and Shen Buhai, whose administrative techniques kept Han secure for fifteen years. In the simulation, Han is reassigned to the I-Ching. This is not a historical claim. It is an experimental intervention.

Han is among the smallest states: like Wei and Yan, it starts with only two territories. And it occupies the most dangerous position on the map, wedged between Qin, Wei, Zhao, and Chu, with no natural barriers and no strategic depth. If the I-Ching provides any measurable benefit to strategic reasoning, Han is where that benefit would be most visible and most needed. The weakest state, the hardest test.

All seven agents use Claude Opus 4.6. All seven receive the same game mechanics, the same observation format, the same order constraints. The only difference is Han's prompt: before each round, Han receives a hexagram cast from the I-Ching and must interpret it before issuing orders. The other six states reason from their philosophical personas alone.

Han's Oracle

Before each round, Han's agent receives a hexagram. The casting method is the traditional yarrow stalk divination described in the Great Commentary of the I-Ching.

Forty-nine stalks are divided and counted in three operations to produce a single line; repeating the process six times, eighteen operations in all, builds a hexagram from the bottom up. Each line can be stable yang, stable yin, changing yang, or changing yin: four possibilities with unequal probabilities. Under the yarrow method, P(young yang) = 5/16, P(young yin) = 7/16, P(old yang) = 3/16, P(old yin) = 1/16. Yang and yin are equally likely overall (8/16 each); the asymmetry lies in which lines change. A yang line is three times as likely to be changing as a yin line. This asymmetry is historically significant. The I-Ching's casting method is biased toward movement in the creative, active force.
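The line probabilities quoted above can be checked with exact fractions, and the same table is enough to sketch a cast. The snippet is an illustration rather than the project's casting engine: it samples each line directly from the sixteenths instead of simulating the stalk divisions, it uses Python's built-in generator as a stand-in random source, and `cast_line`, `cast_hexagram`, `primary_bits`, and `transformed_bits` are hypothetical names. The bit convention (index 0 is the bottom line, 1 = yang) follows the article's notes.

```python
import random
from fractions import Fraction

# Line values and their weights in sixteenths, as given in the text.
LINE_WEIGHTS = {
    6: 1,  # old yin    (changing)
    7: 5,  # young yang (stable)
    8: 7,  # young yin  (stable)
    9: 3,  # old yang   (changing)
}


def probability(values):
    """Exact probability that a cast line takes one of the given values."""
    return Fraction(sum(LINE_WEIGHTS[v] for v in values), 16)


def cast_line(rng):
    """Draw one line value; rng() must return a float in [0, 1)."""
    slot = int(rng() * 16)  # 0..15, each equally likely
    for value, weight in LINE_WEIGHTS.items():
        if slot < weight:
            return value
        slot -= weight
    raise AssertionError("unreachable: weights sum to 16")


def cast_hexagram(rng):
    """Six line values, bottom line first."""
    return [cast_line(rng) for _ in range(6)]


def primary_bits(lines):
    """Cast hexagram as bits: 1 = yang (7 or 9), 0 = yin (6 or 8)."""
    return [1 if v in (7, 9) else 0 for v in lines]


def transformed_bits(lines):
    """Hexagram after changing lines flip: old yin -> yang, old yang -> yin."""
    after_change = {6: 1, 7: 1, 8: 0, 9: 0}
    return [after_change[v] for v in lines]


p_yang = probability([7, 9])
p_yin = probability([6, 8])
p_changing = probability([6, 9])

# Stand-in random source; the article's implementation uses a seeded PRNG.
lines = cast_hexagram(random.Random(42).random)

# Hexagram 27 has yang lines only at the bottom and top. If both are cast
# as old yang (9), every line of the transformed hexagram is yin.
nourishment = [9, 8, 8, 8, 8, 9]
```

Summing the table gives yang and yin 8/16 each, so any asymmetry comes from the changing lines: old yang carries three times the weight of old yin.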

The implementation uses a seeded pseudorandom number generator (Mulberry32) to make each cast reproducible. In the yarrow condition, the seed is random — true divination, as close to the historical practice as a computer can produce. In the state-seeded condition, the seed is derived from a hash of the current board state, making the hexagram deterministic for a given game position. Both methods produce the same line probability distribution. The difference is whether the cosmos or the game board chooses the hexagram.
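A seeded cast needs two pieces: a hash that turns a board state into a 32-bit seed, and a generator that turns the seed into a reproducible stream. The sketch below ports the commonly published Mulberry32 routine and the standard 32-bit FNV-1a hash (the hash the article names for the state-seeded condition) to Python. It has not been checked bit-for-bit against the project's engine, and the board-state string format is an assumption made for illustration.

```python
def fnv1a_32(text: str) -> int:
    """32-bit FNV-1a hash of a UTF-8 string."""
    h = 0x811C9DC5  # FNV offset basis
    for byte in text.encode("utf-8"):
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF  # FNV prime, wrapped to 32 bits
    return h


def mulberry32(seed: int):
    """Return a function that yields floats in [0, 1) from a 32-bit seed."""
    state = seed & 0xFFFFFFFF

    def next_float() -> float:
        nonlocal state
        state = (state + 0x6D2B79F5) & 0xFFFFFFFF
        t = state
        t = ((t ^ (t >> 15)) * (t | 1)) & 0xFFFFFFFF
        t ^= (t + (((t ^ (t >> 7)) * (t | 61)) & 0xFFFFFFFF)) & 0xFFFFFFFF
        return (t ^ (t >> 14)) / 4294967296

    return next_float


# Hypothetical board serialization; the real format is not given in the text.
board = "round=7;Han=Shangdang,Hanzhong"
seed = fnv1a_32(board)

# The same board always yields the same stream, hence the same hexagram.
rng_a, rng_b = mulberry32(seed), mulberry32(seed)
stream_a = [rng_a() for _ in range(6)]
stream_b = [rng_b() for _ in range(6)]
```

In the yarrow condition the only change would be where `seed` comes from: a random source instead of `fnv1a_32(board)`.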

益。利有攸往。利涉大川。
— 易經・益・彖 (I-Ching, Hexagram 42 Increase, Judgment)
Increase. It furthers one to undertake something. It furthers one to cross the great water.
Yì (益) means to increase, to augment, to gain. The judgment contains nine characters. 利有攸往: it is worthwhile to have somewhere to go — a direction, a purpose. 利涉大川: it is worthwhile to cross the great water — to undertake the dangerous passage. Wind above Thunder: increase that descends from above, the powerful sacrificing for the weak. In game bc49, Han received this hexagram with three changing lines and explicitly cited 'cross the great water' as justification for expanding into Luoyang. The expansion succeeded. Han grew from 2 to 3 territories — the experiment's strongest single-game evidence that the oracle produces different decisions from pure tactical reasoning.

The hexagram text is delivered to Han's agent in a structured format called the MANDATE. The agent must interpret the hexagram — relate its imagery and counsel to the current board state — before issuing orders. This is not optional. The prompt requires interpretation first, orders second. The reasoning text that results is often two to three times longer than what the other six agents produce.

The experiment runs four conditions. In the yarrow condition, Han receives a randomly seeded yarrow stalk cast — traditional divination. In the state-seeded condition, the hexagram is derived deterministically from the board state via an FNV-1a hash. In the scrambled condition, Han receives the correct hexagram number and name but the body text is shuffled from a different hexagram — testing whether the specific content of the I-Ching matters or whether any structured prompt works equally well. In the control condition, Han receives no hexagram at all, only a generic reflection prompt of similar length asking it to analyze threats, allies, and priorities.
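The four conditions differ only in what text precedes Han's orders, which a short sketch makes concrete. Everything here is hypothetical scaffolding: the function names, the prompt wording, and the placeholder bodies for hexagrams 27 and 35 are assumptions, and only the Hexagram 42 judgment is quoted from the article.

```python
import random

CONDITIONS = ("yarrow", "state_seeded", "scrambled", "control")

# Tiny stand-in corpus; bodies in angle brackets are placeholders.
HEXAGRAMS = {
    42: {
        "name": "Increase",
        "body": "It furthers one to undertake something. "
                "It furthers one to cross the great water.",
    },
    27: {"name": "Nourishment", "body": "<judgment and line texts of hexagram 27>"},
    35: {"name": "Progress", "body": "<judgment and line texts of hexagram 35>"},
}


def scrambled_reading(hexagrams, number, rng):
    """Keep the cast hexagram's number and name; borrow another's body text."""
    donors = [n for n in hexagrams if n != number]
    donor = rng.choice(donors)
    return {
        "number": number,
        "name": hexagrams[number]["name"],
        "body": hexagrams[donor]["body"],
    }


def han_preamble(condition, hexagrams, number, rng):
    """Text Han reads before issuing orders under each condition."""
    if condition == "control":
        # No hexagram: a generic reflection prompt instead.
        return "Reflect on current threats, allies, and priorities."
    if condition == "scrambled":
        reading = scrambled_reading(hexagrams, number, rng)
    else:
        # yarrow and state_seeded differ only in how `number` was cast.
        reading = {"number": number, **hexagrams[number]}
    return f"Hexagram {reading['number']}, {reading['name']}: {reading['body']}"


scrambled = scrambled_reading(HEXAGRAMS, 42, random.Random(0))
```

Keeping the number and name fixed while swapping the body is what lets the scrambled condition isolate the content of the text from the mere presence of a structured prompt.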

Four conditions. The same weakest state. The same seven opponents. The same game mechanics. The only variable is what Han reads before it decides.

First Blood

The first oracle game ran for thirty rounds. Game d5a9: seven Claude Opus agents, state-seeded condition, Han receiving a hexagram before every order.

Round 1. Hexagram 27: Nourishment. Two changing lines — the top and bottom, the jaws of the mouth. The agent interpreted this as restraint: nourish your position, do not overextend. Han held Shangdang and reached for the unclaimed center at Luoyang — but Chu contested the same square and the move bounced in a standoff. Through its two changing lines, the hexagram transformed to Kun, the Receptive — receive what the earth offers freely.

Qin took Zheng from Han in the same round, and Han's dislodged unit fell back to Hanzhong. Its ancestral capital was gone before the game's first resolution finished printing — Han ended the round holding Shangdang and Hanzhong.

This set the pattern. Six consecutive rounds of defensive hexagrams — Nourishment, Small Exceeding, Following, Before Completion, Marrying Maiden, Dispersion — and Han held. Displaced from its original territories, surviving in borrowed lands, but alive. Two territories throughout, never growing, never falling.

韓之先與周同姓，姓姬氏。其後苗裔事晉，得封於韓原，曰韓武子。
— 史記・韓世家 (Shiji, Hereditary House of Han)
The ancestors of Han shared the Ji surname of the Zhou royal house. A later descendant served Jin and was enfeoffed at the Han Plain; he was known as Han Wu Zi.
Sima Qian opens the Hereditary House of Han with lineage — not with territory, not with military power, but with the claim that Han descends from the founding house of Zhou. The smallest state traces its authority to the oldest dynasty. In game d5a9, Han's authority comes from an even older source: the divination text that tradition attributes to King Wen of Zhou himself, composed during his imprisonment by the Shang. The ancestor and the text share an origin.

Round 7 changed everything. Hexagram 35: Progress. The sun rises above the earth. After six rounds of counsel that said wait, endure, nourish, follow, do not yet complete — the oracle said move.

Han attacked Daliang for the first time. The move failed — Chu struck the same square and the two attacks bounced in a standoff — but the shift in posture was unmistakable. The agent wrote: 'Six rounds of defense prepared this moment. The sun rises.'

Round 8. Hexagram 7: The Army. Military discipline. Chu launched a coordinated two-pronged assault on Han's territories — both Shangdang and Zheng fell to strength-2 attacks. Han was displaced from everything it held. Both armies retreated into enemy territory: one into Zhao's Taiyuan, the other into Chu's Luoyang. Han was a state without a homeland, surviving in the gaps between its conquerors.

Round 9. Hexagram 47: Oppression. The Image says: 'The superior man stakes his life on following his will.' Two changing lines in the upper trigram, lines 4 and 5, both old yang: the creative force collapsing into its opposite. The agent read this as a mandate for bold action. From Taiyuan, Han attacked Shangdang, which stood empty because Chu had moved on. Han reclaimed its ancestral territory. From Luoyang, Han was displaced again by Chu, but retreated into Zheng. By the end of Round 9, Han held Shangdang and Zheng, the same two territories it started the game with, reclaimed from exile.

Han survived all thirty rounds. It never grew beyond two territories. It was displaced from its homeland twice and reclaimed it twice. Zhao won with five territories; Qin and Qi tied for second with four each. Han finished tied for fifth.

But something had happened in the reasoning. The oracle produced a narrative arc — from defensive caution through a single offensive attempt to catastrophic displacement to bold reclamation — that the control game did not. Control Han survived too, held two territories too, finished in the same tier. But control Han's reasoning was tactical throughout: 'hold both, support each other, wait.' Oracle Han's reasoning moved through phases that tracked the hexagram sequence: nourishment, patience, following, incompletion, the rising sun, military discipline, oppression, then reclamation.

The actions were often the same. The thinking was not. Whether thinking differently leads to acting differently over many games was now the question the experiment existed to answer.

The Honest Question

Dispatch 4 ended with a question: is there anything on the other side of the failure boundary? Five negative results in continuous optimization, a 93% elimination rate in the three-state prototype, and a pivot to a domain the research had never tested.

Seventy-four games later, the answer is: not what we expected.

The game works. Seven AI agents with historical personas play campaigns of alliance, betrayal, and territorial expansion — up to twenty rounds per game. The oracle speaks — yarrow stalks fall, hexagrams form, and Han's agent interprets ancient counsel before every battle. The diplomacy phase produces messages of real sophistication: non-aggression proposals, mutual support deals, strategic misdirection, and — occasionally — genuine trust between artificial minds.

Han survived. Across all four conditions, across dozens of games, the smallest state endured far more often than the three-state prototype predicted. The seven-state topology restored the survival mechanisms that the triangle had destroyed: buffer-state value, shifting alliances, the possibility of retreating into the gaps between greater powers.

But the oracle's effect was not what the original hypothesis imagined.

The hypothesis was: the I-Ching, used as a reflection framework, would produce better strategic learning. Better survival rates. More territory. A measurable advantage for Han.

What the data shows is more subtle and more interesting than a survival gap. The oracle does not make Han win. It makes Han *think differently* — and that difference manifests not in outcomes but in the shape of the journey. The same survival rate can look like cautious stasis or like a dramatic arc of displacement and reclamation. The same two territories at game's end can represent twenty rounds of mutual-support holding or twenty rounds of hexagram-guided reasoning that moved through caution, aggression, crisis, and recovery.

What the oracle said, and what Han did with it — including one game where the oracle's counsel was perfect and Han died anyway — is the subject of the next two dispatches.

Notes

[1] Reference. Meta's Cicero (FAIR, Science 2022) combined a language model for negotiation with the piKL planning algorithm. It achieved human-level play in online Diplomacy. The key architectural insight: the planner overrides the LLM when the LLM's preferred action is strategically poor. Without a planner, LLM agents default to cooperation bias, as documented in Welfare Diplomacy (Mukobi et al., 2023) and DiploBench (sam-paech/diplobench).

[2] Technical. The map contains 22 territories: 19 home territories held by the 7 states plus 3 neutral supply centers (Luoyang, Song, Zhongshan). Starting sizes: Qin (3), Han (2), Wei (2), Zhao (3), Qi (3), Chu (4), Yan (2) = 19. Adjacency follows historical geography with some simplification. Victory condition: 4 territories (domination) or the most territories at round 20. Stalemate: 3 consecutive rounds with no territory changes ends the game.

[3] Historical. The seven philosophical schools are drawn from warringstates-day ADR-002 (state profiles). The assignments are stylized: no state practiced a single philosophy exclusively. But each association has historical basis. Qin's Legalism under Shang Yang, Chu's Daoist heritage through Laozi's legendary origin in Chu territory, Qi's Jixia Academy hosting all schools, and Han's Legalist tradition through Shen Buhai and Han Fei are well-attested.

[4] Technical. The yarrow stalk implementation is a faithful port of the 8bitoracle-next casting engine. Line probabilities: P(6/old yin) = 1/16, P(7/young yang) = 5/16, P(8/young yin) = 7/16, P(9/old yang) = 3/16. Binary encoding: index 0 = line 1 (bottom), ascending to index 5 = line 6 (top). 1 = yang, 0 = yin. The FNV-1a hash for state-seeded condition maps board state strings to 32-bit seeds.

[5] Technical. The scrambled condition was designed to test content specificity. If Han's performance under scrambled text equals yarrow performance, the benefit comes from having any structured reflection prompt, not from the I-Ching's specific wisdom. If scrambled underperforms yarrow, the content matters. The scrambled oracle preserves the hexagram name (e.g., 'Hexagram 42, Increase') but substitutes the judgment and line texts from a randomly selected different hexagram.

[6] Technical. Game d5a9 hexagram sequence: R1 Hex 27 (Nourishment) → R2 Hex 62 (Small Exceeding) → R3 Hex 17 (Following) → R4 Hex 64 (Before Completion) → R5 Hex 54 (Marrying Maiden) → R6 Hex 59 (Dispersion) → R7 Hex 35 (Progress) → R8 Hex 7 (The Army) → R9 Hex 47 (Oppression). State-seeded condition: hexagram derived from FNV-1a hash of board state. All agents Claude Opus 4.6.

[7] Technical. Round 8 kill chain: Chu attacks Shangdang (str 2 vs def 1) and Zheng (str 2 vs def 1). Both fall. Han retreats from Shangdang to Taiyuan (Zhao's territory, emptied by Zhao attacking elsewhere) and from Zheng to Luoyang (Chu's territory, emptied by Chu moving to Zheng). Round 9: Han attacks Shangdang from Taiyuan (str 1 vs def 0, since Chu had vacated). Reclaims homeland. The retreat mechanic, drawn from standard Diplomacy rules, creates territory swaps that prevent elimination and produce these dramatic displacement-and-return arcs.

[8] Reference. Rounds 16-30 of game d5a9 used auto-generated hold orders due to context exhaustion, a methodological flaw corrected in later experiments by reducing max rounds to 20 and adding stalemate detection. The effective game was 15 rounds. Han's survival through all meaningful rounds is the valid datapoint.

[9] Technical. The v1 campaign ran 74 games across 4 conditions; the territory-AUC and survival analyses reported here use the 68-game subset cleanly logged at that stage (15 yarrow, 19 state-seeded, 12 scrambled, 22 control). All agents Claude Opus 4.6. Han survival rates: yarrow 80%, state-seeded 89%, scrambled 92%, control 81%. No pairwise comparison reaches p < 0.05 on Fisher's exact test. The first statistically significant result is in outcome variance, not survival: Levene's test on territory AUC, scrambled vs control, p = 0.031.
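For readers who want to see how a survival comparison of this kind is computed, here is a minimal two-sided Fisher's exact test in exact arithmetic. It is a sketch, not the experiment's analysis code. The survivor counts are an assumption: they are back-calculated from the rounded percentages for the two conditions where that works unambiguously (12 of 15 yarrow games at 80%, 11 of 12 scrambled games at 92%), and the function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that row 1 holds x of the col1 "survived" outcomes.
        return Fraction(comb(row1, x) * comb(row2, col1 - x), denom)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = prob(a)
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1)) if p <= observed
    )


# Assumed counts (survived, eliminated), inferred from rounded percentages.
yarrow = (12, 3)
scrambled = (11, 1)

p_value = fisher_exact_two_sided(*yarrow, *scrambled)
```

With these assumed counts the p-value is far above 0.05, in line with the note's statement that no pairwise survival comparison is significant. Using `Fraction` avoids floating-point ties when comparing table probabilities.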

The game is set. Seven kingdoms, seven philosophies, one oracle. The next dispatch follows a single game to its end — four rounds, two hexagrams, and the earliest elimination in the experiment. Subscribe to follow the story as it unfolds, and to receive daily passages from the classical texts that inform this research.
