Reasoning backbone
All reasoning agents (Novelty Scout, Mechanism Synthesizer, Evidence Grader, Red Team, Cited-claim Verifier) call the same backbone: gpt-5-2 via kie.ai with reasoning effort set to high. One endpoint, one auth header, no fallbacks. If the model is unreachable or the response cannot be parsed into the expected JSON shape, the task fails hard rather than ship a templated stub to the dossier.
- model: gpt-5-2
- effort: high
- auth: Bearer ${KIE_API_KEY}
- on failure: Task fails. No templated stub is ever written.
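The fail-hard rule above can be sketched roughly as follows. This is a minimal illustration, not the project's actual client: the text does not give the kie.ai endpoint or request schema, so the transport is passed in as a hypothetical `send` callable (which would attach the Bearer header), and `BackboneError`, `parse_reply`, `call_backbone` and the `prompt` field are assumed names.

```python
import json
from typing import Any, Callable

# Model and effort come from the list above; everything else here is assumed.
BACKBONE = {"model": "gpt-5-2", "effort": "high"}


class BackboneError(RuntimeError):
    """Raised when the backbone is unreachable or its reply is unusable."""


def parse_reply(raw: str, required_keys: set[str]) -> dict[str, Any]:
    """Parse a raw reply into the expected JSON shape, or fail hard."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise BackboneError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise BackboneError("reply is not a JSON object")
    missing = required_keys - set(data)
    if missing:
        raise BackboneError(f"reply is missing keys: {sorted(missing)}")
    return data


def call_backbone(
    send: Callable[[dict[str, Any]], str],
    prompt: str,
    required_keys: set[str],
) -> dict[str, Any]:
    """Send one request through `send` (hypothetical transport), with no fallback."""
    request = {**BACKBONE, "prompt": prompt}
    try:
        raw = send(request)
    except OSError as exc:  # covers ConnectionError and similar network failures
        raise BackboneError(f"backbone unreachable: {exc}") from exc
    return parse_reply(raw, required_keys)
```

There is deliberately no `except` branch that returns a default value: every failure path raises, so the caller cannot write a stub to the dossier by accident.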
DAG flow
Upstream agents pull real data from public APIs in parallel. Reasoning agents run sequentially over the typed evidence corpus. Output agents assemble, verify and ship.
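A minimal sketch of this scheduling shape, assuming upstream fetchers are async callables and reasoning stages are plain functions over a shared state dict. The names `run_upstream`, `run_reasoning`, `make_stub` and the `Evidence` alias are hypothetical, and the stub fetcher stands in for real API calls.

```python
import asyncio
from typing import Any, Awaitable, Callable

Evidence = dict[str, str]  # hypothetical stand-in for a typed evidence record
Fetcher = Callable[[str], Awaitable[Evidence]]
Stage = Callable[[dict[str, Any]], dict[str, Any]]


def make_stub(source: str) -> Fetcher:
    """Build a placeholder fetcher; a real one would call the public API."""
    async def fetch(target: str) -> Evidence:
        await asyncio.sleep(0)  # yield control, as a network call would
        return {"source": source, "target": target}
    return fetch


async def run_upstream(fetchers: dict[str, Fetcher], target: str) -> dict[str, Evidence]:
    """Run every upstream fetcher concurrently; return a corpus keyed by source."""
    names = list(fetchers)
    results = await asyncio.gather(*(fetchers[name](target) for name in names))
    return dict(zip(names, results))


def run_reasoning(corpus: dict[str, Evidence], stages: list[tuple[str, Stage]]) -> dict[str, Any]:
    """Apply reasoning stages strictly in order; each sees the previous output."""
    state: dict[str, Any] = {"corpus": corpus, "trace": []}
    for name, stage in stages:
        out = stage(state)
        state = {**out, "trace": state["trace"] + [name]}
    return state
```

`asyncio.gather` preserves argument order, so zipping the results back onto `names` is safe. An exception in any fetcher propagates out of `run_upstream`, which matches a fail-hard policy.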
upstream → PubMed · UniProt · AlphaFold · ChEMBL · OpenTargets (parallel)
ingest → typed evidence corpus
reason → Novelty Scout
→ Mechanism Synthesizer
→ Evidence Grader (A | B | C | D | X)
critique → Red Team rejects unsupported claims
verify → Cited-claim Verifier re-checks each citation
assemble → Dossier Assembler emits markdown + commit memo
ship → only on verified citations + clean parse
Cited-claim verifier
A separate agent re-reads each citation that survived synthesis and checks that the cited paper actually supports the attached claim. Embedding match plus targeted LLM verification, scored A→D. Failed citations are demoted from the dossier with a note in the appendix.
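One way the embedding score and the LLM verdict could be folded into an A to D grade is sketched here. The text does not specify the combination rule, so the 0.6 similarity threshold, the mapping from verdict to grade, and the helper name `should_demote` are all assumptions made for illustration.

```python
import math

GRADES = "ABCD"  # A is the strongest grade, D the weakest


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def combine(sim: float, verdict: str, threshold: float = 0.6) -> str:
    """Assumed rule: the verdict sets the band, similarity picks within it."""
    if verdict in ("contradicts", "unrelated"):
        return "D"
    if verdict == "partial":
        return "B" if sim >= threshold else "C"
    if verdict == "supports":
        return "A" if sim >= threshold else "B"
    raise ValueError(f"unknown verdict: {verdict!r}")


def should_demote(grade: str, cutoff: str = "C") -> bool:
    """True when the grade is at the cutoff or worse (further from A)."""
    return GRADES.index(grade) >= GRADES.index(cutoff)
```

Comparing grades by their position in `GRADES` avoids relying on string ordering, where "D" sorts after "C" and a test like `grade <= "C"` would mean the opposite of "C or worse".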
for claim, citation in dossier:
    abstract = pubmed.fetch(citation.pmid).abstract
    sim = cosine(embed(claim), embed(abstract))
    verdict = llm.classify(claim, abstract)  # one of: supports, contradicts, unrelated, partial
    grade = combine(sim, verdict)  # A (best) to D (worst)
    if grade in ("C", "D"):  # grade C or worse
        demote(claim)
        log(citation, grade)
Reproducibility budget
Every claim ships with a reproducibility receipt. Anyone can re-run the original mission with a fresh salt. The outputs are diffed, and meaningful divergence automatically demotes the original claim. Ships in phase 5.
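A rough sketch of such an audit, under stated assumptions: a mission is modelled as a function from a salt to a set of claim strings, divergence is measured as Jaccard distance between claim sets, and "meaningful" is taken to mean a distance above 0.5. None of these choices come from the text, and `fresh_salt`, `divergence` and `audit` are hypothetical names.

```python
import secrets
from typing import Any, Callable


def fresh_salt() -> str:
    """Generate a new random salt for a re-run."""
    return secrets.token_hex(8)


def divergence(original: set[str], rerun: set[str]) -> float:
    """Jaccard distance between claim sets: 0.0 if identical, 1.0 if disjoint."""
    union = original | rerun
    if not union:
        return 0.0
    return 1.0 - len(original & rerun) / len(union)


def audit(
    mission: Callable[[str], set[str]],
    original: set[str],
    threshold: float = 0.5,  # assumed cutoff for "meaningful" divergence
) -> dict[str, Any]:
    """Re-run the mission with a fresh salt and list claims to demote."""
    salt = fresh_salt()
    rerun = mission(salt)
    score = divergence(original, rerun)
    # Only claims that failed to reappear are candidates for demotion.
    missing = sorted(original - rerun)
    return {
        "salt": salt,
        "divergence": score,
        "demote": missing if score > threshold else [],
    }
```

The returned dict doubles as a simple receipt: it records which salt was used, how far the re-run drifted, and which claims were demoted as a result.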
Agent registry
Five upstream agents pull data from public APIs. Four reasoning agents synthesise, score and critique. Three output agents assemble, verify and ship. Every agent is replaceable: they are pure functions over a typed evidence corpus.
- Literature Miner: PubMed E-utilities
- Sequence & Structure: UniProt · AlphaFold
- Target & Pathway: OpenTargets · Reactome
- Variant Linker: ChEMBL release 35
- Patent Sentinel: USPTO · EPO · JPO · CNIPA
- Novelty Scout: Whitespace · embedding density
- Mechanism Synthesizer: kie · gpt-5-2 · effort high
- Evidence Grader: A | B | C | D | X rubric
- Red Team: adversarial re-runs · staked
- Cited-claim Verifier: embedding + LLM verdict
- Dossier Assembler: deterministic · markdown
- Reproducibility Daemon: random + on-demand audits
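The "pure functions over a typed evidence corpus" idea might look something like the sketch below. The `Evidence` and `Corpus` types, their fields, and the two example agents are hypothetical, and the evidence values are placeholders rather than real records.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Evidence:
    source: str  # e.g. "PubMed"
    ref: str     # identifier within the source
    text: str    # extracted content


@dataclass(frozen=True)
class Corpus:
    items: tuple[Evidence, ...] = ()

    def add(self, *new: Evidence) -> "Corpus":
        """Return a new corpus; the original is never mutated."""
        return replace(self, items=self.items + new)


# An agent takes a corpus and returns a new one, with no side effects.
Agent = Callable[[Corpus], Corpus]


def literature_miner(corpus: Corpus) -> Corpus:
    return corpus.add(Evidence("PubMed", "placeholder-ref", "placeholder abstract"))


def alt_literature_miner(corpus: Corpus) -> Corpus:
    return corpus.add(Evidence("OtherIndex", "placeholder-ref", "placeholder abstract"))


def run(registry: dict[str, Agent], order: list[str], corpus: Corpus) -> Corpus:
    """Apply registered agents in the given order."""
    for name in order:
        corpus = registry[name](corpus)
    return corpus
```

Because every agent has the same signature and the corpus is immutable, swapping an implementation is a one-line change to the registry, for example `registry["Literature Miner"] = alt_literature_miner`, and no other agent needs to know.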