One day, around two years ago, I had an intuition: that geometry is more fundamental than logic.
As I researched this interest and posted about what others have written, I came across Professor McCarty (linked at the end of this post), who has been doing theoretical research in this area. Understanding his foundational approach is key to building AI models that can sustain a reliable and coherent world model, which is essential for an AI that would be far more powerful and dramatically more reliable than what we have so far.
The proposed model architecture consists of 3 primary layers, plus a 4th that I added for high-level, goal-driven reasoning.
It’s not all theory. Prof. McCarty’s 3 layers have been realized and demonstrated in code, albeit on simple cognitive tasks, showing that learning works within this architecture.
The work I’ve been doing is largely experimental, and my time on it has been constrained by competing commitments. I thought I should share my thoughts on this approach and point people to Prof. McCarty’s work, in the hope of inspiring others to follow this path.
The 4 layers of the proposed model architecture
Layer 1 — Statistics (priors & uncertainty)
- Cognition relies on structured priors and learned regularities: what tends to co-occur, what follows what, what’s typical or surprising.
- Priors aren’t just frequencies; they can encode structural assumptions (e.g., object permanence, causal locality).
- Evidence updates priors (Bayes/predictive processing); see the sketch after the example below. Habits/policies are learned action strategies, distinct from beliefs, though we can maintain uncertainty over which policy to use.
Example: You expect dropped objects to fall because your priors encode everyday physics (gravity, friction), not merely patterns of words.
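To make the updating story concrete, here is a minimal sketch in Python of a Layer 1 style belief revision: a Beta-Binomial conjugate update, where a strong prior (dropped objects fall) is nudged, not overwritten, by evidence. The specific distributions, prior strengths, and observation counts are illustrative assumptions of mine, not part of Prof. McCarty’s demonstrations.

```python
# Minimal sketch: a structured prior updated by evidence (Layer 1).
# Beta-Binomial conjugate update; all numbers are illustrative assumptions.

def update_belief(alpha: float, beta: float, successes: int, failures: int):
    """Bayesian update of a Beta(alpha, beta) prior from binary evidence."""
    return alpha + successes, beta + failures

def expected_probability(alpha: float, beta: float) -> float:
    """Posterior mean of the Beta distribution: alpha / (alpha + beta)."""
    return alpha / (alpha + beta)

# A strong structural prior: dropped objects almost always fall.
alpha, beta = 50.0, 1.0
print(f"prior     P(falls) ~ {expected_probability(alpha, beta):.3f}")

# Observe 10 drops: 9 fall, 1 does not (say, a helium balloon).
alpha, beta = update_belief(alpha, beta, successes=9, failures=1)
print(f"posterior P(falls) ~ {expected_probability(alpha, beta):.3f}")
```

The point of the conjugate form is that the prior survives contact with evidence: the balloon shifts the belief slightly rather than erasing the structural assumption.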
Layer 2 — Structure (frames, invariances, constraints)
- “Structure” covers the representational geometry—how concepts are arranged so meaning is preserved under transformation—and the invariances/symmetries that preserve meaning within a frame. By a frame, I mean the representational choices (state, variables, relations, and allowed transformations) that make some distinctions audible and others silent.
- Each domain has allowed transformations and forbidden ones (constraints); the sketch after this list shows how such a check can be coded.
- Physics: Approximate invariance to translation/rotation/scale; forbidden: energy non-conservation or faster-than-light signaling.
- Law: Paraphrases that preserve legal force are allowed; forbidden: edits that change intent or liability (e.g., shall → may; removing indemnification; swapping representations for warranties).
- Art: Perspective shifts can preserve a surrealist work’s identity; realism has tighter constraints.
This is the grammar of worlds—what can vary and what must stay fixed.
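As a minimal sketch of this grammar, assume a frame whose states are 2-D point configurations and whose invariant is the set of pairwise distances (both choices are mine, for illustration; this is not Prof. McCarty’s code). A transformation is allowed if it preserves the invariant and forbidden otherwise:

```python
# Minimal sketch: classifying transformations as allowed or forbidden
# by whether they preserve a frame's invariant (Layer 2).
import numpy as np

def pairwise_distances(points: np.ndarray) -> np.ndarray:
    """The chosen invariant: distances between every pair of points."""
    diffs = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diffs, axis=-1)

def is_allowed(points: np.ndarray, transform) -> bool:
    """Allowed transformations leave the invariant unchanged."""
    return np.allclose(pairwise_distances(points),
                       pairwise_distances(transform(points)))

pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])

theta = np.pi / 4
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

print("rotate:   ", is_allowed(pts, lambda p: p @ rotation.T))   # True
print("translate:", is_allowed(pts, lambda p: p + [3.0, -1.0]))  # True
print("scale:    ", is_allowed(pts, lambda p: 2.0 * p))          # False
```

Change the frame and the verdicts change: a frame that quotients out scale would admit the last transformation. That is the sense in which the frame decides which distinctions are audible and which are silent.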