Holonic BRAID: Lite Paper

How shared reasoning graphs improve on BRAID at scale

OASIS / NextGen Software · January 2026

Abstract

Holonic BRAID combines OpenSERV’s BRAID (Bounded Reasoning for Autonomous Inference and Decisions) [1] with OASIS’s holonic architecture so that many agents share one reasoning-graph library. BRAID replaces unbounded natural-language chain-of-thought with bounded, Mermaid-based instruction graphs. Amçalar & Cinar show that structured, machine-readable prompts substantially increase reasoning accuracy and cost efficiency on GSM-Hard, SCALE MultiChallenge, and AdvancedIF, and that encoding reasoning steps in symbolic form lets low-capacity models reach the same level of accuracy and consistency as larger models [1]. They report substantial gains in performance per dollar (PPD): 30× on procedural tasks and up to 74× on mathematical reasoning (PPD 74.06 vs the baseline for the gpt-4.1→gpt-5-nano-minimal configuration on GSM-Hard) [1]. At scale, BRAID without sharing loses most of that gain because each agent or run pays for graph generation. Holonic BRAID stores graphs as holons in a shared, persistent library: each graph is generated once per task type and reused by all agents, and holons are replicated across storage providers (e.g. MongoDB, Solana, IPFS) so the same graph is available regardless of which chain or backend an agent uses. This restores large PPD gains at scale and extends BRAID’s consistency and accuracy through reuse of validated graphs, accuracy-ranked selection, and versioning. This lite paper defines the model, gives cost and PPD equations, explains holons and how they work across providers and chains, states how Holonic BRAID improves consistency and accuracy, and provides technical diagrams.

1. Introduction

1.1 Motivation

BRAID [1] delivers large PPD gains over a GPT-5-medium baseline when the solver uses a pre-built reasoning graph; the paper reports efficiency gains of 30× on procedural tasks and up to 74× on mathematical reasoning for selected configurations. At scale (many agents, many tasks), BRAID without sharing pays for graph generation once per agent or per task, so its PPD advantage over the baseline collapses. Holonic BRAID adds a shared graph library whose graphs are stored as holons, replicated across storage providers so that the same graph is available to any agent on any chain or backend. The system then pays for each graph once per task type and preserves large PPD gains at scale.
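
The amortization argument above can be illustrated with a small cost sketch. This is not the cost model defined later in this paper or in [1]: the dollar figures are hypothetical, the names (Costs, unshared_braid_cost, holonic_braid_cost, cost_ratio) are invented for illustration, and the unshared case assumes the worst case in which every run regenerates its graph. It only shows how total spend changes when graph generation is paid once per task type instead of once per run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Costs:
    """Hypothetical per-call costs in dollars; illustrative, not taken from [1]."""
    graph_generation: float  # one generator call that builds a reasoning graph
    solve: float             # one solver call that follows a pre-built graph


def unshared_braid_cost(costs: Costs, task_types: int, runs_per_type: int) -> float:
    """Assumed worst case without sharing: every run regenerates its graph."""
    total_runs = task_types * runs_per_type
    return total_runs * (costs.graph_generation + costs.solve)


def holonic_braid_cost(costs: Costs, task_types: int, runs_per_type: int) -> float:
    """Shared library: one graph per task type, reused by every run."""
    total_runs = task_types * runs_per_type
    return task_types * costs.graph_generation + total_runs * costs.solve


def cost_ratio(costs: Costs, task_types: int, runs_per_type: int) -> float:
    """How many times more the unshared setup spends than the shared one."""
    return (unshared_braid_cost(costs, task_types, runs_per_type)
            / holonic_braid_cost(costs, task_types, runs_per_type))


if __name__ == "__main__":
    # Hypothetical prices: generation is assumed much more expensive than solving.
    example = Costs(graph_generation=0.05, solve=0.001)
    for runs in (1, 10, 1000):
        print(runs, round(cost_ratio(example, task_types=10, runs_per_type=runs), 2))
```

With a single run per task type the two setups cost the same, since each graph is generated exactly once either way. As runs per task type grow, the shared library's generation cost becomes negligible and the ratio approaches (graph_generation + solve) / solve. If accuracy is held equal between the two setups, any accuracy-per-dollar metric would improve by the same ratio.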

1.2 Contribution

We define the Holonic BRAID model and its cost structure, explain how holons work across storage providers and chains, and show how Holonic BRAID extends BRAID’s accuracy and consistency gains at scale. Specifically, we:

State the cost and PPD equations for the GPT-5-medium baseline, BRAID, and Holonic BRAID.
Explain holons in enough detail to show how the same graph is available to any agent on any chain or backend.
Explain how Holonic BRAID improves on BRAID (sharing, persistence, consistency/accuracy at scale).
Provide technical diagrams for the two-stage BRAID protocol, holon-backed graph library, and end-to-end Holonic BRAID flow.

2. BRAID: Bounded Reasoning

We summarize BRAID [1] only as needed for Holonic BRAID. BRAID separates prompt generation from prompt solving:

Stage 1 (Generator): A high-tier model produces a reasoning graph (Mermaid flowchart) that encodes the reasoning topology for a task type.