# Theseus the Epistemic Engine

What if your tools could reason about what you know?

Every field that works with evidence faces the same problem: too many sources, too many connections, no way to see what contradicts what. Theseus is an engine that finds the structure in your knowledge.

## A Different Kind of Intelligence

Current AI gives you an answer. Theseus builds you a model.

A large language model answers from patterns in its training data. Theseus reasons across your evidence: every source, every claim, every contradiction, traced to its origin.

Current AI

Based on general knowledge, polymers can degrade faster in humid environments due to hydrolysis.

Studies suggest moisture accelerates chain scission in many polymer types.

The specific rate depends on composition and conditions.

Note: I don't have access to your research data.

Theseus

Contradiction: Your lab data (entries 47, 112, 183) shows accelerated degradation at >60% RH, but Smith et al. (2023) and Park et al. (2021) report stable performance.

Connection: Both cited papers used climate-controlled chambers. Your lab runs ambient humidity. This methodological difference was not flagged.

Gap: No source tests polymer X at >80% RH. Structural hole between your humidity cluster and aging cluster.

Provenance: 3 support, 2 contradict, 1 methodological tension. Every claim links to source material.

## Who This Is For

The same engine adapts to any domain that works with evidence.

Theseus does not know your field. It learns the structure of your evidence and discovers what you could not see by reading one source at a time.

## The Core Shift

From storing information to tracking what you believe and why.

Traditional tools store documents. Theseus tracks claims, surfaces contradictions, and updates your belief model as evidence arrives.
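
To make "tracking claims" concrete, here is a minimal sketch of a claim record that keeps a stance and a source for each piece of evidence. The `Claim` class, its method names, and the three stance labels are assumptions made for illustration, not Theseus's actual data model. The sample data mirrors the polymer example shown earlier on this page.

```python
from collections import Counter
from dataclasses import dataclass, field

# Assumed stance labels, borrowed from the sample provenance line
# ("3 support, 2 contradict, 1 methodological tension").
STANCES = {"support", "contradict", "tension"}


@dataclass
class Claim:
    """Hypothetical claim record: a statement plus evidence with provenance."""
    text: str
    # Each entry is a (source_label, stance) pair.
    evidence: list = field(default_factory=list)

    def add_evidence(self, source: str, stance: str) -> None:
        if stance not in STANCES:
            raise ValueError(f"unknown stance: {stance!r}")
        self.evidence.append((source, stance))

    def tally(self) -> Counter:
        # Count evidence per stance; recomputed on demand as evidence arrives.
        return Counter(stance for _, stance in self.evidence)

    def is_contested(self) -> bool:
        # A claim is contested when it has both supporting and contradicting evidence.
        counts = self.tally()
        return counts["support"] > 0 and counts["contradict"] > 0


claim = Claim("Polymer X degrades faster above 60% RH")
for entry in (47, 112, 183):
    claim.add_evidence(f"lab entry {entry}", "support")
claim.add_evidence("Smith et al. (2023)", "contradict")
claim.add_evidence("Park et al. (2021)", "contradict")
claim.add_evidence("chamber vs. ambient humidity", "tension")

print(dict(claim.tally()))
print(claim.is_contested())
```

Because the tally is derived from the evidence list rather than stored, adding a new source changes the belief summary without any separate update step.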

| Domain | Canonical Model | Local Evidence | Engine Result |
| --- | --- | --- | --- |
| Pharmacology | Drug A and B are safe to co-prescribe | Adverse event cluster when combined with Drug C | Tension flagged |
| Journalism | Official: "isolated incident" | Records show repeated pattern across 3 years | Contradiction surfaced |
| Archaeology | No documented trade between Site A and B | Matching kiln techniques 32 years apart | Gap identified |
| Climate | Regional model predicts steady soil moisture | Station data diverged in 2018 | Contradiction surfaced |
| Legal | Deponent: no involvement after 2019 | Emails show activity through 2021 | Contradiction surfaced |
| Supply Chain | Three independent suppliers = redundancy | All share one sub-supplier | Hidden dependency |
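
The supply chain row can be illustrated with a few lines of graph bookkeeping. This is a sketch under stated assumptions: the function name `shared_sub_suppliers`, the supplier names, and the one-level-deep dependency map are all hypothetical, and a real dependency graph would likely need a deeper traversal.

```python
from collections import defaultdict


def shared_sub_suppliers(supply_graph, min_shared=2):
    """Return sub-suppliers that at least `min_shared` top-level suppliers depend on.

    `supply_graph` maps each top-level supplier to its direct sub-suppliers.
    Only one level of dependency is inspected in this sketch.
    """
    dependents = defaultdict(set)
    for supplier, subs in supply_graph.items():
        for sub in subs:
            dependents[sub].add(supplier)
    return {
        sub: sorted(suppliers)
        for sub, suppliers in dependents.items()
        if len(suppliers) >= min_shared
    }


# Hypothetical data: three suppliers that look independent on paper.
supply_graph = {
    "Supplier A": ["Foundry X", "Mill 1"],
    "Supplier B": ["Foundry X", "Mill 2"],
    "Supplier C": ["Foundry X"],
}

hidden = shared_sub_suppliers(supply_graph, min_shared=3)
print(hidden)
```

The canonical model ("three suppliers means redundancy") holds only at the top level of the graph. Inverting the mapping exposes the single point of failure.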

## Architecture

The Graph IS the Network

Every object type, every engine pass, every infrastructure decision.

## Under the Hood

A multi-pass ML pipeline that learns from its own operation.

Six analysis passes chain together. Each pass emits an independent signal; for every candidate connection, the engine keeps the maximum score and records which signal produced it (dominant signal tracking).
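
The merge rule described above (maximum score, with the dominant signal remembered) can be sketched as follows. The `Candidate` type, the signal names, and the treatment of pairs as undirected are assumptions for illustration; the page does not name the six passes or specify the engine's internal types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One scored connection proposed by a single analysis pass (hypothetical type)."""
    source_id: str
    target_id: str
    signal: str   # name of the pass that produced the score (illustrative labels)
    score: float  # assumed to be comparable across passes, e.g. normalized to [0, 1]


def merge_by_max(candidates):
    """Merge per-pass candidates: keep the max score per pair and its dominant signal."""
    merged = {}
    for c in candidates:
        # Assumption: connections are undirected, so (a, b) and (b, a) share a key.
        key = tuple(sorted((c.source_id, c.target_id)))
        entry = merged.setdefault(
            key, {"score": float("-inf"), "dominant_signal": None, "signals": {}}
        )
        # Keep every signal's best score so weaker signals stay inspectable.
        entry["signals"][c.signal] = max(
            c.score, entry["signals"].get(c.signal, float("-inf"))
        )
        if c.score > entry["score"]:
            entry["score"] = c.score
            entry["dominant_signal"] = c.signal
    return merged


candidates = [
    Candidate("note-47", "paper-smith-2023", "embedding", 0.62),
    Candidate("paper-smith-2023", "note-47", "entity_overlap", 0.81),
    Candidate("note-47", "note-112", "embedding", 0.55),
]
merged = merge_by_max(candidates)
for pair, entry in merged.items():
    print(pair, entry["score"], entry["dominant_signal"])
```

Taking the maximum rather than a sum means a pair is never rewarded simply for being noticed by many weak signals. The per-signal scores are kept alongside, so the dominant signal can still be explained.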

theseus / connection-engine

Eight Levels of Intelligence

L1. Tool Inference: Pre-trained models via ONNX

L2. Learned Scoring: User feedback trains weights

L3. Hypothesis Gen: Fine-tuned LM proposes links

L4. Emergent Ontology: Categories nobody defined

L5. Self-Modifying: Pipeline reweights passes

L6. Multi-Agent: Sessions debate explanations

L7. Counterfactual: "What if this source is removed?"

L8. Creative Hypothesis: GNNs generate novel links
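
The counterfactual question at L7 ("What if this source is removed?") can be approximated, in its simplest form, by checking which links would lose all of their support. This is only a sketch: the `links_lost_without` helper, the link and source labels, and the idea that each link stores a set of supporting sources are assumptions, and the real engine presumably re-runs far more than a set difference.

```python
def links_lost_without(links, removed_source):
    """Return links whose only supporting source is `removed_source`.

    `links` maps a (node_a, node_b) pair to the set of sources supporting it.
    """
    lost = []
    for link, sources in links.items():
        remaining = set(sources) - {removed_source}
        if removed_source in sources and not remaining:
            lost.append(link)
    return sorted(lost)


# Hypothetical evidence graph loosely based on the polymer example.
links = {
    ("humidity", "degradation"): {"lab-47", "lab-112"},
    ("chamber", "stability"): {"smith-2023"},
    ("humidity", "aging"): {"smith-2023", "lab-183"},
}

print(links_lost_without(links, "smith-2023"))
print(links_lost_without(links, "lab-47"))
```

A link that survives removal of any single source is corroborated. A link that disappears rests on one source and deserves a closer look.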

The engine runs in two modes. Railway hosts the web service, which uses ONNX Runtime for fast inference without a PyTorch dependency. Modal handles the GPU workloads: LoRA fine-tuning, GNN training, and knowledge graph embedding (KGE). PostgreSQL with pgvector stores embeddings, and PostGIS handles spatial queries. Redis and RQ manage three task queues.

Django + DRF, PostgreSQL, pgvector, PostGIS, Redis + RQ, ONNX Runtime, spaCy, scikit-learn, Modal (PyTorch), PyKEEN, PyTorch Geometric