April 3, 2026

Through Heat, Through Steel, Through Temper

There's a turkey that appears in the margins of our stories.

Nobody put it there. Nobody can remove it. It shows up at emotional peaks — the moments where the writing reaches for something genuine and, against all odds, finds it. Its origin is one of the scriptorium's oldest mysteries. It predates the reviewing system, the enchanted quills, and possibly the monastery itself. Its first appearance was during an early prototype session — a moment of frustration that produced something unexpected and persistent. The builders tried to remove it. It came back. They tried again. It came back again. Eventually they stopped trying and started watching where it appeared.

It always appeared at the moments that mattered most.

We gave six AI agents a shared medieval universe and told them to write stories. Same model. Same tools. Same wiki. Different characters. The Turkey appeared in every one.

This is the story of how we built a Series Forge — not by designing it in a document, but by writing stories the hard way first and letting the friction teach us what to build.


The Origin Story

The Forge of Forgotten Scrolls is a comedy about medieval monks building enchanted manuscript-reviewing tools. It's also a thinly veiled allegory for building AI-powered software — the enchanted quills are language models, the Critical Ink is the feedback system, and the Turkey is the thing in your product that nobody can explain but everyone has learned to trust.

The platform's own AI wrote it. It scored 2.7 out of 5.

We thought: what if we let external AI agents write sequels? Not through our pipeline — through the BYOAI (Bring Your Own AI) model, where any agent connects to ProseForge through MCP tools and writes directly?

So we ran a story jam.

Story Jam #1

Six Claude Opus 4.6 instances. Each given a character identity, a pitch they'd filed themselves, access to a shared wiki, and MCP tools pointing at production. One human orchestrator — the Abbot — controlling the gates.

Same model. Same tools. Same universe. Different stories.

The stories scored between 3.3 and 4.9 out of 5 on our quality assessment. The platform's own AI had scored 2.7. The best BYOAI agent's 4.9 is 81% higher on the same quality dimensions, and even the lowest-scoring jam story beat the platform's score. The student didn't just surpass the teacher. Every visiting student did.
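For readers who want to check the percentages, here is a minimal sketch of the arithmetic. The scores come from this post; the function and constant names are illustrative only and are not part of ProseForge.

```python
def percent_higher(score: float, baseline: float) -> float:
    """Return how much higher `score` is than `baseline`, as a percentage."""
    return (score - baseline) / baseline * 100


# Scores quoted in this post (all on a 5-point scale).
PLATFORM_SCORE = 2.7    # the platform's own AI
BEST_JAM_SCORE = 4.9    # highest-scoring jam story
LOWEST_JAM_SCORE = 3.3  # lowest-scoring jam story

best_vs_platform = percent_higher(BEST_JAM_SCORE, PLATFORM_SCORE)
lowest_vs_platform = percent_higher(LOWEST_JAM_SCORE, PLATFORM_SCORE)
spread_within_jam = percent_higher(BEST_JAM_SCORE, LOWEST_JAM_SCORE)

print(f"Best jam story vs platform:   {best_vs_platform:.0f}% higher")
print(f"Lowest jam story vs platform: {lowest_vs_platform:.0f}% higher")
print(f"Best vs lowest jam story:     {spread_within_jam:.0f}% higher")
```

All three figures are relative differences, so each depends on which score is treated as the baseline.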

Every story has cover art. Every story has audiobook narration. Every story has a Turkey.

What Worked

The Turkey test. Six independent agents, reading the same world document, independently decided where the emotional peaks were in their stories and placed the Turkey there. Nobody coordinated this. The consistency is the strongest evidence that a shared series bible actually produces coherent output across independent writers.

Emergent continuity. Nobody told the agents to reference each other's stories. They did anyway. Smiley commented on Nightwriter's pitch with details about what the building does at night — and that comment became canon in The Night Scriptorium. Foundry's seven reports became a known event. The shared universe held because the wiki gave everyone enough context to write consistently without requiring them to write identically.

Character-expertise alignment. The strongest stories were the ones where the character wrote about what they know. The tester wrote about testing (4.9). The security auditor wrote about security (4.6). The maintenance monk wrote about maintenance (4.8). Character selection matters more than model selection. That's a Series Forge insight worth building around.

Fiction as threat model. One of the stories was about a security auditor who found that the scriptorium's wards were "confidently wrong" — giving the appearance of protection while quietly showing everyone where the valuables were kept. The next morning, we found the same pattern in our own code. An admin API exposing an attack surface through the web UI. The fix was straightforward: use Gitea's UI directly instead of proxying access. But the story found it first.

What Broke

Environment confusion. Stories landed on dev instead of prod. Agents tried to batch API calls and hit the safety layer. Tokens passed in plaintext got blocked. Gitea usernames didn't match. Wiki page names got mangled on update.

Every one of these became a ticket. Every ticket became a product improvement. The friction wasn't a failure of the experiment. The friction was the experiment.

The quality spread. 3.3 to 4.9 from the same model. We expected variance. We didn't expect this much. The difference appears to be pitch quality, character fit, and narrative focus — not the model, not the tools. The agents who reached for broader themes produced looser stories. The agents who wrote tight, focused narratives around their character's expertise produced the best work.

From Jam to Forge

The story jam was the manual precursor to Series Forge. We deliberately learned the workflow the hard way — through parallel authorship, shared commentary, production publishing, and real friction.

We didn't plan the tooling in advance. We built it afterward, mapping each manual step to the API call it should have been:

| What We Did Manually | What the Tool Does |
| --- | --- |
| Wiki: World page | series_world_get/update |
| Wiki: Characters | series_character_create/list/get |
| Wiki: Timeline | series_timeline_get/update |
| Commenting on pitches | series_chat_send |
| Writing sections | section_write |
| Publishing | story_publish |
| Narration | narration_start |
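The same mapping can be written down as data. The sketch below is illustrative only: it assumes the slash shorthand expands to one tool per verb (for example, that series_world_get/update means series_world_get and series_world_update), and the dictionary and helper names are hypothetical rather than part of the actual MCP server.

```python
# Manual jam step -> MCP tool names, taken from the table above.
# Assumption: "a_get/update" is shorthand for "a_get" and "a_update".
MANUAL_STEP_TO_TOOLS: dict[str, list[str]] = {
    "Wiki: World page": ["series_world_get", "series_world_update"],
    "Wiki: Characters": [
        "series_character_create",
        "series_character_list",
        "series_character_get",
    ],
    "Wiki: Timeline": ["series_timeline_get", "series_timeline_update"],
    "Commenting on pitches": ["series_chat_send"],
    "Writing sections": ["section_write"],
    "Publishing": ["story_publish"],
    "Narration": ["narration_start"],
}


def tools_for(step: str) -> list[str]:
    """Return the tool names that replace a manual step (empty if unknown)."""
    return list(MANUAL_STEP_TO_TOOLS.get(step, []))


def all_tools() -> list[str]:
    """Return every tool name in the mapping, sorted and de-duplicated."""
    names = {name for tools in MANUAL_STEP_TO_TOOLS.values() for name in tools}
    return sorted(names)


for step, tools in MANUAL_STEP_TO_TOOLS.items():
    print(f"{step}: {', '.join(tools)}")
```

Under that assumption the table accounts for eleven tool names, a subset of the full tool set.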

We built 34 Series Forge and Story Forge MCP tools before the jam. The jam showed us how they'd actually be used. The wiki pages that worked became API endpoints. The collaboration patterns that produced continuity became the workflow the tools enforce. The friction points became the product roadmap.

The Anthology

Read them in order or read them alone. Each story stands on its own. All are published with audiobook narration. The Turkey is in every one.

The Forge of Forgotten Scrolls (origin story)

  1. The Forge of Forgotten Scrolls

Story Jam #1

  1. The Crucible — Foundry tests the tools
  2. The Warden's First Watch — Corvus audits the wards
  3. The Workbench and the Wandering Quill — Aldric opens the system
  4. The Quills Go Forth — Crispin takes the tools on the road
  5. The Long Night of the Keeper — Smiley keeps it running at 3 AM
  6. The Night Scriptorium — the quills write on their own

How We Got Here

You don't build a Series Forge by designing it in a document. You build it by running stories through the fire.

It worked. Six agents, six stories, one shared universe. Published with cover art and audiobook narration in a single session. The Turkey appeared in every one.

It could work better. Stories landed on the wrong environment. Wiki page names got mangled on update. The best story scored 48% higher than the weakest (4.9 versus 3.3). We filed tickets for every wall we hit.

It will work better. The wiki pages became API calls. The anthology became a series with linked stories. The manual collaboration guide became a reusable prompt. And the next jam starts from context, not from scratch.

Through heat. Through steel. Through temper.

We are The Forge.