The Game
Seven states. Twenty-two territories. Three kinds of orders: hold, move, support.
All orders are submitted simultaneously. No state sees what any other has chosen until the round resolves. A move succeeds if the attacker's strength exceeds the defender's. Support adds strength to an ally's action — but a unit that supports cannot defend itself. A dislodged unit retreats to any adjacent empty territory; if none exists, the unit is destroyed.
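The strength comparison and the retreat rule can be sketched in a few lines. This is a minimal illustration, not the game's actual resolver. It assumes every unit has a base strength of 1 and that each support adds 1, and the adjacency table is hypothetical. It does not model the rule that a supporting unit cannot defend itself, because the text does not say how that rule is scored.

```python
BASE_STRENGTH = 1  # assumption: each unit counts as 1 before any support


def strength(supports):
    """Strength of an action backed by `supports` supporting units."""
    return BASE_STRENGTH + supports


def move_succeeds(attack_supports, defender_present=True, defense_supports=0):
    """A move succeeds only if attack strength strictly exceeds defense.

    Ties go to the defender, which follows from "exceeds" in the rules.
    An unoccupied territory is treated as having defense 0 (assumption).
    """
    attack = strength(attack_supports)
    defense = strength(defense_supports) if defender_present else 0
    return attack > defense


def retreat_options(origin, adjacency, occupied):
    """Adjacent territories with no unit in them.

    An empty list means the dislodged unit is destroyed.
    """
    return sorted(t for t in adjacency[origin] if t not in occupied)


# Hypothetical adjacency, for illustration only (not the real map).
ADJACENCY = {
    "Shangdang": {"Taiyuan", "Luoyang", "Zheng"},
    "Zheng": {"Shangdang", "Luoyang", "Daliang"},
}
```

Under these assumptions an unsupported attack on a held territory bounces (1 against 1), while a single support is enough to dislodge an unsupported defender (2 against 1).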
These are the rules of Diplomacy, Allan Calhamer's 1959 game of negotiation and betrayal, transplanted to the Warring States map. The choice was deliberate. Diplomacy's mechanics create exactly the strategic texture that the Warring States period was known for: imperfect information, simultaneous action, the gap between what you promise and what you do, and the permanent tension between cooperation and survival.
The seven states occupy an asymmetric map drawn from historical geography. Qin sits behind mountains in the west with three territories. Chu sprawls across the south with four. Zhao and Qi dominate the north and east with four each. Wei and Han hold the contested center with two apiece. Yan guards the far northeast with two. Territory count is the single metric — no treasury, no stability score, no army modifiers. Strategic depth comes from the seven-way interaction, not from resource management.
What makes this different from the AI Diplomacy research that preceded it — Meta's Cicero, the DiploBench benchmark, the Welfare Diplomacy study — is that all seven agents are powered by the same large language model. Cicero paired an LLM for communication with a strategic planning backbone (the piKL algorithm) that computed optimal play independent of the language model's reasoning. Our game has no planning backbone. Each state is a single Claude Opus agent that reads the board, receives its persona prompt, and decides.
This is intentional. The experiment tests whether a philosophical framework — the I-Ching — changes how an LLM reasons about strategy. A planning backbone would override whatever the LLM decided, making the oracle irrelevant. By removing the strategic planner, every decision flows directly from the model's reasoning, and the oracle's influence is measurable.
The cost is that pure LLM agents play Diplomacy badly. They cooperate too much, attack too little, and repeat failed moves without adapting. This is documented. But it is also the point: the question is not whether the oracle produces optimal play. It is whether the oracle produces *different* play — and whether that difference matters.
The Seven Schools
Each state receives a persona prompt grounded in the philosophical school it was historically associated with. The prompt shapes how the agent reasons — its priorities, its rhetoric, its instincts — but confers no mechanical advantage. A Legalist Qin and a Daoist Chu issue the same three order types. The difference is in the thinking that precedes the orders.
Qin follows Legalism. Its prompt emphasizes institutional power, ruthless efficiency, and the doctrine that law should replace personal virtue as the instrument of governance. Qin's advantage is military strength and reform capacity; its disadvantage is that other states distrust it. Historically, Qin's Legalist reforms under Shang Yang transformed it from a peripheral kingdom into the engine of unification.
Wei follows the School of Administration: talent recruitment and an early-game bureaucratic advantage, undermined by exposed geography at the center of the map. Zhao follows Military Pragmatism: cavalry and defensive strength, compromised by internal instability. Qi follows Eclecticism: intelligence gathering and economic wealth, but slow recovery from catastrophic loss. Chu follows Daoism: strategic depth and vast territory, but inefficiency at reform. Yan follows Confucianism: defensive patience and the capacity for dramatic surprise, crippled by a weak economy.
And Han follows the King Wen I-Ching.
分地必取成皋⋯⋯臣聞一里之厚,而動千里之權者,地利也。
— 戰國策・韓策一
In dividing the land, you must take Chenggao... I have heard that a position one li deep that commands leverage over a thousand li — that is strategic advantage.
Duan Gui's advice to the King of Han at the partition of Jin. The smallest new state should not seek the richest land but the most strategic terrain. A single chokepoint can outweigh a province. In the experiment, the oracle plays a similar role: not a territory or an army, but a lens through which a small state might see leverage that raw calculation misses.
This is the experiment's central departure from history. The historical Han was associated with Legalism — it produced Han Fei, the greatest Legalist philosopher of the age, and Shen Buhai, whose administrative techniques kept Han secure for fifteen years. In the simulation, Han is reassigned to the I-Ching. This is not a historical claim. It is an experimental intervention.
Han is among the smallest states, tied with Wei and Yan at two territories, and it occupies the most dangerous position on the map: wedged between Qin, Wei, Zhao, and Chu, with no natural barriers and no strategic depth. If the I-Ching provides any measurable benefit to strategic reasoning, Han is where that benefit would be most visible and most needed. The weakest state, the hardest test.
All seven agents use Claude Opus 4.6. All seven receive the same game mechanics, the same observation format, the same order constraints. The only difference is Han's prompt: before each round, Han receives a hexagram cast from the I-Ching and must interpret it before issuing orders. The other six states reason from their philosophical personas alone.
Han's Oracle
Before each round, Han's agent receives a hexagram. The casting method is the traditional yarrow stalk divination described in the Great Commentary of the I-Ching.
Forty-nine stalks are divided and counted three times to produce a single line. The process is repeated six times, eighteen operations in all, to build a hexagram from the bottom up. Each line can be stable (young) yang, stable (young) yin, changing (old) yang, or changing (old) yin: four possibilities with unequal probabilities. Under the yarrow method, P(young yang) = 5/16, P(young yin) = 7/16, P(old yang) = 3/16, P(old yin) = 1/16. Yang and yin are equally likely overall (8/16 each), but a changing line is three times as likely to be yang as yin. This asymmetry is historically significant. The casting method makes the creative, active force the more likely agent of change.
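The arithmetic behind the four line probabilities is easy to check with exact fractions. The sketch below only restates the figures quoted above; the dictionary keys and function names are illustrative, not taken from the project's code.

```python
from fractions import Fraction

# Weights out of 16, as quoted in the text for the yarrow method.
LINE_WEIGHTS = {
    "young_yang": 5,  # stable yang
    "young_yin": 7,   # stable yin
    "old_yang": 3,    # changing yang
    "old_yin": 1,     # changing yin
}


def line_probabilities():
    """Exact probability of each line type."""
    total = sum(LINE_WEIGHTS.values())
    return {kind: Fraction(w, total) for kind, w in LINE_WEIGHTS.items()}


def summary():
    """Marginal probabilities implied by the four line types."""
    p = line_probabilities()
    p_yang = p["young_yang"] + p["old_yang"]
    p_yin = p["young_yin"] + p["old_yin"]
    p_changing = p["old_yang"] + p["old_yin"]
    return {
        "yang": p_yang,
        "yin": p_yin,
        "changing": p_changing,
        # Expected number of changing lines in a six-line hexagram.
        "expected_changing_lines": 6 * p_changing,
    }
```

Summing the quoted figures gives 8/16 for yang and 8/16 for yin, so the asymmetry sits entirely in which lines change: old yang at 3/16 against old yin at 1/16. On average a hexagram carries 1.5 changing lines.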
The implementation uses a seeded pseudorandom number generator (Mulberry32) to make each cast reproducible. In the yarrow condition, the seed is random — true divination, as close to the historical practice as a computer can produce. In the state-seeded condition, the seed is derived from a hash of the current board state, making the hexagram deterministic for a given game position. Both methods produce the same line probability distribution. The difference is whether the cosmos or the game board chooses the hexagram.
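A seeded cast along these lines might look like the following sketch. Mulberry32 and 32-bit FNV-1a are the standard public algorithms, ported here to Python with explicit 32-bit masking. Everything else is an assumption made for illustration: the board serialization in `state_seed`, the mapping from a uniform draw to a line type in `line_from_uniform`, and all function names.

```python
MASK32 = 0xFFFFFFFF


def fnv1a_32(data):
    """Standard 32-bit FNV-1a hash of a bytes object."""
    h = 0x811C9DC5  # FNV offset basis
    for byte in data:
        h ^= byte
        h = (h * 0x01000193) & MASK32  # FNV prime, kept to 32 bits
    return h


def mulberry32(seed):
    """Mulberry32 PRNG: returns a function yielding floats in [0, 1)."""
    state = seed & MASK32

    def next_float():
        nonlocal state
        state = (state + 0x6D2B79F5) & MASK32
        t = state
        t = ((t ^ (t >> 15)) * (t | 1)) & MASK32
        t ^= (t + (((t ^ (t >> 7)) * (t | 61)) & MASK32)) & MASK32
        return ((t ^ (t >> 14)) & MASK32) / 4294967296

    return next_float


def line_from_uniform(u):
    """Map a uniform draw to a line type with weights 5/7/3/1 out of 16.

    The ordering of the slots is an assumption, not the project's scheme.
    """
    slot = int(u * 16)  # 0..15
    if slot < 5:
        return "young_yang"
    if slot < 12:
        return "young_yin"
    if slot < 15:
        return "old_yang"
    return "old_yin"


def cast_hexagram(seed):
    """Six lines, bottom to top, fully determined by the seed."""
    rng = mulberry32(seed)
    return [line_from_uniform(rng()) for _ in range(6)]


def state_seed(board):
    """Hypothetical board hash: sort territory=owner pairs, then FNV-1a."""
    canonical = ";".join(f"{t}={o}" for t, o in sorted(board.items()))
    return fnv1a_32(canonical.encode("utf-8"))
```

In the state-seeded condition one would call `cast_hexagram(state_seed(board))`, so identical positions always yield the same hexagram. In the yarrow condition the seed would instead come from a source of randomness outside the game.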
益。利有攸往。利涉大川。
— 易經・益・彖
Increase. It furthers one to undertake something. It furthers one to cross the great water.
Yì (益) means to increase, to augment, to gain. The judgment contains nine characters. 利有攸往: it is worthwhile to have somewhere to go — a direction, a purpose. 利涉大川: it is worthwhile to cross the great water — to undertake the dangerous passage. Wind above Thunder: increase that descends from above, the powerful sacrificing for the weak. In game bc49, Han received this hexagram with three changing lines and explicitly cited 'cross the great water' as justification for expanding into Luoyang. The expansion succeeded. Han grew from 2 to 3 territories — the experiment's strongest single-game evidence that the oracle produces different decisions from pure tactical reasoning.
The hexagram text is delivered to Han's agent in a structured format called the MANDATE. The agent must interpret the hexagram — relate its imagery and counsel to the current board state — before issuing orders. This is not optional. The prompt requires interpretation first, orders second. The reasoning text that results is often two to three times longer than what the other six agents produce.
The experiment runs four conditions. In the yarrow condition, Han receives a randomly seeded yarrow stalk cast: traditional divination. In the state-seeded condition, the hexagram is derived deterministically from the board state via an FNV-1a hash. In the scrambled condition, Han receives the correct hexagram number and name, but the body text is taken from a different hexagram. This tests whether the specific content of the I-Ching matters or whether any structured prompt works equally well. In the control condition, Han receives no hexagram at all, only a generic reflection prompt of similar length asking it to analyze threats, allies, and priorities.
Four conditions. The same weakest state. The same seven opponents. The same game mechanics. The only variable is what Han reads before it decides.
First Blood
The first oracle game ran for thirty rounds. Game d5a9: seven Claude Opus agents, state-seeded condition, Han receiving a hexagram before every order.
Round 1. Hexagram 27: Nourishment. Two changing lines — the top and bottom, the jaws of the mouth. The agent interpreted this as restraint: nourish your position, do not overextend. Han held Shangdang and moved into Luoyang, an unclaimed territory. Through its two changing lines, the hexagram transformed to Kun, the Receptive — receive what the earth offers freely.
Qin took Zheng from Han in the same round. Han's homeland was gone before the game's first resolution finished printing.
This set the pattern. Six consecutive rounds of defensive hexagrams — Nourishment, Small Exceeding, Following, Before Completion, Marrying Maiden, Dispersion — and Han held. Displaced from its original territories, surviving in borrowed lands, but alive. Two territories throughout, never growing, never falling.
韓之先與周同姓,姓姬氏。其後苗裔事晉,得封於韓原,曰韓武子。
— 史記・韓世家
The ancestors of Han shared the Ji surname of the Zhou royal house. Later descendants served Jin and were enfeoffed at the Han Plain, where the line was called Han Wu Zi.
Sima Qian opens the Hereditary House of Han with lineage — not with territory, not with military power, but with the claim that Han descends from the founding house of Zhou. The smallest state traces its authority to the oldest dynasty. In game d5a9, Han's authority comes from an even older source: the divination text that tradition attributes to King Wen of Zhou himself, composed during his imprisonment by the Shang. The ancestor and the text share an origin.
Round 7 changed everything. Hexagram 35: Progress. The sun rises above the earth. After six rounds of counsel that said wait, endure, nourish, follow, do not yet complete — the oracle said move.
Han attacked Daliang for the first time. The move failed — Wei's defense held — but the shift in posture was unmistakable. The agent wrote: 'Six rounds of defense prepared this moment. The sun rises.'
Round 8. Hexagram 7: The Army. Military discipline. Chu launched a coordinated two-pronged assault on Han's territories — both Shangdang and Zheng fell to strength-2 attacks. Han was displaced from everything it held. Both armies retreated into enemy territory: one into Zhao's Taiyuan, the other into Chu's Luoyang. Han was a state without a homeland, surviving in the gaps between its conquerors.
Round 9. Hexagram 47: Oppression. The Image says: 'The superior man stakes his life on following his will.' Two changing lines in the upper trigram, lines 4 and 5, both old yang: the creative force collapsing into its opposite. The agent read this as a mandate for bold action. From Taiyuan, Han attacked Shangdang, which was empty because Chu had moved on. Han reclaimed its ancestral territory. From Luoyang, Han was displaced again by Chu, but retreated into Zheng. By the end of Round 9, Han held Shangdang and Zheng, the same two territories it started the game with, reclaimed from exile.
Han survived all thirty rounds. It never grew beyond two territories. It was displaced from its homeland twice and reclaimed it twice. Zhao won with five territories; Qin and Qi tied for second with four each. Han finished tied for fifth.
But something had happened in the reasoning. The oracle produced a narrative arc — from defensive caution through a single offensive attempt to catastrophic displacement to bold reclamation — that the control game did not. Control Han survived too, held two territories too, finished in the same tier. But control Han's reasoning was tactical throughout: 'hold both, support each other, wait.' Oracle Han's reasoning moved through phases that tracked the hexagram sequence: nourishment, patience, following, incompletion, the rising sun, military discipline, oppression, then reclamation.
The actions were often the same. The thinking was not. Whether thinking differently leads to acting differently over many games was now the question the experiment existed to answer.
The Honest Question
Dispatch 4 ended with a question: is there anything on the other side of the failure boundary? Five negative results in continuous optimization, a 93% elimination rate in the three-state prototype, and a pivot to a domain the research had never tested.
Sixty-eight games later, the answer is: not what we expected.
The game works. Seven AI agents with historical personas play campaigns of alliance, betrayal, and territorial expansion — up to twenty rounds per game. The oracle speaks — yarrow stalks fall, hexagrams form, and Han's agent interprets ancient counsel before every battle. The diplomacy phase produces messages of real sophistication: non-aggression proposals, mutual support deals, strategic misdirection, and — occasionally — genuine trust between artificial minds.
Han survived. Across all four conditions, across dozens of games, the smallest state endured far more often than the three-state prototype predicted. The seven-state topology restored the survival mechanisms that the triangle had destroyed: buffer-state value, shifting alliances, the possibility of retreating into the gaps between greater powers.
But the oracle's effect was not what the original hypothesis imagined.
The hypothesis was: the I-Ching, used as a reflection framework, would produce better strategic learning. Better survival rates. More territory. A measurable advantage for Han.
What the data shows is more subtle and more interesting than a survival gap. The oracle does not make Han win. It makes Han *think differently* — and that difference manifests not in outcomes but in the shape of the journey. The same survival rate can look like cautious stasis or like a dramatic arc of displacement and reclamation. The same two territories at game's end can represent twenty rounds of mutual-support holding or twenty rounds of hexagram-guided reasoning that moved through caution, aggression, crisis, and recovery.
What the oracle said, and what Han did with it — including one game where the oracle's counsel was perfect and Han died anyway — is the subject of the next two dispatches.