§ Agents

Reasoning topology

Real public APIs feed a single reasoning backbone. Every cited claim is re-verified against its source. Every mission is re-runnable on demand with a fresh salt. No fallbacks, no templated stubs.

model
gpt-5-2
gateway
kie.ai
effort
high
verifier
cited-claim · v1

Reasoning backbone

All reasoning agents (Novelty Scout, Mechanism Synthesizer, Evidence Grader, Red Team, Cited-claim Verifier) call the same backbone: gpt-5-2 via kie.ai with reasoning effort set to high. One endpoint, one auth header, no fallbacks. If the model is unreachable or the response cannot be parsed into the expected JSON shape, the task fails hard rather than shipping a templated stub to the dossier.

kie.ai · POST /gpt-5-2/v1/chat/completions
model
gpt-5-2
effort
high
auth
Bearer ${KIE_API_KEY}
on failure
Task fails. No templated stub is ever written.
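A minimal sketch of that hard-fail contract in Python. Only the path, model name, effort level, and Bearer auth come from the spec above; the host URL, the `reasoning_effort` field name, the response shape, and the `transport` injection point are assumptions for illustration:

```python
import json
import os
import urllib.request

# Assumed full URL; only the path segment comes from the spec above.
KIE_URL = "https://kie.ai/gpt-5-2/v1/chat/completions"


class BackboneError(RuntimeError):
    """Any transport or parse failure — the task fails hard, no stub."""


def call_backbone(messages, transport=None):
    """One endpoint, one auth header, no fallbacks.

    `transport` is a hypothetical hook that replaces the HTTP call in tests.
    """
    payload = {
        "model": "gpt-5-2",
        "reasoning_effort": "high",  # assumed field name
        "messages": messages,
    }
    if transport is None:
        req = urllib.request.Request(
            KIE_URL,
            data=json.dumps(payload).encode(),
            headers={
                "Authorization": f"Bearer {os.environ['KIE_API_KEY']}",
                "Content-Type": "application/json",
            },
        )
        raw = urllib.request.urlopen(req).read()
    else:
        raw = transport(payload)
    try:
        body = json.loads(raw)
        return body["choices"][0]["message"]["content"]
    except (ValueError, KeyError, IndexError) as exc:
        # Unparseable response: surface the failure — never write a stub.
        raise BackboneError("unparseable backbone response") from exc
```

The point of the `try`/`except` is the contract itself: a malformed response becomes an exception at the call site, so nothing downstream can mistake it for model output.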

DAG flow

Upstream agents pull real data from public APIs in parallel. Reasoning agents run sequentially over the typed evidence corpus. Output agents assemble, verify and ship.

upstream  → PubMed · UniProt · AlphaFold · ChEMBL · OpenTargets   (parallel)
ingest    → typed evidence corpus
reason    → Novelty Scout
          → Mechanism Synthesizer
          → Evidence Grader (A | B | C | D | X)
critique  → Red Team rejects unsupported claims
verify    → Cited-claim Verifier re-checks each citation
assemble  → Dossier Assembler emits markdown + commit memo
ship      → only on verified citations + clean parse
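The flow above can be sketched as a small orchestrator — parallel fan-out upstream, sequential reasoning passes, and a hard gate before shipping. All stage functions here are hypothetical stand-ins for the real agents:

```python
from concurrent.futures import ThreadPoolExecutor


def run_pipeline(sources, fetch, reason_stages, verify, assemble):
    # upstream: pull every public API in parallel
    with ThreadPoolExecutor() as pool:
        corpus = list(pool.map(fetch, sources))

    # reason: sequential passes over the typed evidence corpus
    state = corpus
    for stage in reason_stages:
        state = stage(state)

    # verify → ship: only on verified citations + clean parse
    if not verify(state):
        raise RuntimeError("unverified citations — nothing ships")
    return assemble(state)
```

The gate is deliberately a raised exception rather than a degraded output: a mission either ships verified or fails visibly.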

Cited-claim verifier

A separate agent re-reads each citation that survived synthesis and checks that the cited paper actually supports the attached claim. Verification combines an embedding match with a targeted LLM verdict, scored A→D. Failed citations are demoted from the dossier with a note in the appendix.

for claim, citation in dossier:
    abstract  = pubmed.fetch(citation.pmid).abstract
    sim       = cosine(embed(claim), embed(abstract))
    verdict   = llm.classify(claim, abstract)
                  ∈ { supports, contradicts, unrelated, partial }
    grade     = combine(sim, verdict)
    if grade ≤ "C": demote(claim) · log(citation, grade)
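The loop above as runnable Python, with the embedding, classifier, and PubMed fetch injected as callables. The `combine` rubric shown here is an assumption — the source only states that similarity and verdict are combined into an A→D grade:

```python
GRADES = "ABCD"  # A best, D worst


def combine(sim, verdict):
    """Assumed rubric: strong support plus high similarity earns an A."""
    if verdict == "supports":
        return "A" if sim >= 0.8 else "B"
    if verdict == "partial":
        return "C"
    return "D"  # contradicts or unrelated


def verify_citations(pairs, embed, cosine, classify, fetch_abstract):
    """Return the (citation, grade) pairs that should be demoted."""
    demoted = []
    for claim, citation in pairs:
        abstract = fetch_abstract(citation)
        sim = cosine(embed(claim), embed(abstract))
        # verdict ∈ { supports, contradicts, unrelated, partial }
        verdict = classify(claim, abstract)
        grade = combine(sim, verdict)
        if GRADES.index(grade) >= GRADES.index("C"):  # grade ≤ "C" in rank
            demoted.append((citation, grade))
    return demoted
```

Note the rank comparison: "grade ≤ C" in the pseudocode means *C or worse*, which in code is an index comparison on the ordered `GRADES` string, not a lexicographic one.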

Reproducibility budget

Every claim ships with a reproducibility receipt. Anyone can re-run the original mission with a fresh salt — outputs are diffed and meaningful divergence demotes the original claim automatically. Ships in phase 5.
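A sketch of what such a receipt check could look like — `run_mission` is a hypothetical stand-in for the real mission runner, and the diff-ratio threshold for "meaningful divergence" is an assumption:

```python
import difflib
import secrets


def audit(claim_id, run_mission, original_output, threshold=0.9):
    """Re-run a mission with a fresh salt and diff it against the original.

    run_mission(claim_id, salt) -> str is a placeholder for the real runner.
    Divergence below `threshold` similarity flags the claim for demotion.
    """
    salt = secrets.token_hex(8)  # fresh salt per audit
    rerun = run_mission(claim_id, salt)
    ratio = difflib.SequenceMatcher(None, original_output, rerun).ratio()
    return {
        "claim": claim_id,
        "salt": salt,
        "similarity": ratio,
        "demote": ratio < threshold,
    }
```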

Agent registry

Five upstream agents pull data from public APIs. Four reasoning agents synthesise, score and critique. Three output agents assemble, verify and ship. Every agent is replaceable: they are pure functions over a typed evidence corpus.

Upstream · ingest
  • Literature Miner
    PubMed E-utilities
  • Sequence & Structure
    UniProt · AlphaFold
  • Target & Pathway
    OpenTargets · Reactome
  • Variant Linker
    ChEMBL release 35
  • Patent Sentinel
    USPTO · EPO · JPO · CNIPA
Reasoning · synthesise
  • Novelty Scout
    Whitespace · embedding density
  • Mechanism Synthesizer
    kie · gpt-5-2 · effort high
  • Evidence Grader
    A | B | C | D | X rubric
  • Red Team
    adversarial re-runs · staked
Output · verify + ship
  • Cited-claim Verifier
    embedding + LLM verdict
  • Dossier Assembler
    deterministic · markdown
  • Reproducibility Daemon
    random + on-demand audits
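"Pure functions over a typed evidence corpus" can be made concrete with a small type sketch — the `Evidence` shape and `compose` helper here are illustrative, not the real schema:

```python
from typing import Callable, List, NamedTuple


class Evidence(NamedTuple):
    source: str   # e.g. "PubMed", "UniProt"
    payload: dict


# An agent is just a pure function corpus -> corpus, so any agent
# in the registry can be swapped for another with the same signature.
Agent = Callable[[List[Evidence]], List[Evidence]]


def compose(*agents: Agent) -> Agent:
    """Chain agents into one pipeline stage, left to right."""
    def pipeline(corpus: List[Evidence]) -> List[Evidence]:
        for agent in agents:
            corpus = agent(corpus)
        return corpus
    return pipeline
```

Because every agent shares this signature, replacing the Evidence Grader or the Red Team is a one-line change to the composition, not a rewrite of the pipeline.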