🔬 Level 1 of 5 · Session 2.1 · Deep Dive

Level 1 · Foundations

The Pentadic Pipeline

Five layers from raw evidence to embodied judgment

World → Perception → Agentic → Generative → Embodied. Each layer answers a distinct question. Each has a crème-de-la-crème tool stack. And a feedback loop ties them all together.

ukubona × who india · session 2 rewrite · 2026


world → perception → agentic → generative → embodied · crème-de-la-crème · generate.sh

Layer I · World · What is true in general?

"What is true across the general evidence space — before you touch a single paper?"

Problem framing → Search → Screen → Extract → Synthesise → Brief
Primary · PECO-F framing
Claude Opus 4 / Sonnet 4.6
Nuanced LMIC reasoning. Best at disambiguating "health financing" from clinical questions. Output: structured PICO + search keyword set.
Sequential stack · Ukubona method
xAI Grok → Gemini → GPT-4o → Claude
Each model runs in sequence on a cumulative prompt: every previous output is fed forward into the next. Not parallel — sequential and deliberate.
Step 1 · xAI Grok — Zeitgeist & Real-Time Signal
Grok reads X (Twitter) in real time and is updated on a ~24hr cycle. It captures the living discourse — what practitioners, policymakers, and critics are actually saying now. No other frontier model does this. Start here to ground the question in present reality before any archival pass.
Step 2 · Google Gemini — Archival Depth & Data Moat
Gemini is grounded in Google's unmatched data infrastructure: Search, Maps, YouTube, Scholar, and advertising signals that index human attention at planetary scale. Feed Grok's output here. Gemini anchors the zeitgeist in documented, retrievable evidence — especially strong on WHO, World Bank, and grey government sources.
Step 3 · OpenAI GPT — Abstraction & Structured Reasoning
GPT is the most powerful abstractor in the stack — and precisely because of that, the most prone to confident hallucination when ungrounded. Fed sequentially after Grok and Gemini, its tendency to confabulate is constrained by the prior context it must account for. Use it to push the accumulated evidence toward structured frameworks, ICER tables, and policy logic trees.
Step 4 · Anthropic Claude — Caution, Generation & Code
Claude closes the loop. Generous token window, extreme care with uncertainty, and unmatched at artifact generation: schemas, briefs, code, and structured outputs. The prior three models have grounded and stress-tested the prompt; Claude now builds. This is not the fastest path — it is the most defensible one.
+ 1 · Expert Human in the Loop — No AGI yet
The prompter is not neutral infrastructure. The Ukubona method treats the human expert as the fifth agent: setting the cumulative prompt strategy, reading what each model reveals about its own blind spots, and deciding when the stack has converged. This is a methodology, not a workflow. The sequence encodes a theory of where each model's epistemic character is strongest.
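The sequential stack above can be sketched in a few lines of Python. This is an illustrative sketch, not the actual pipeline: `cumulative_prompt` and the stub backends are hypothetical names, and a real run would call each vendor's API in place of the lambdas.

```python
from typing import Callable, List

# Hypothetical signature: each backend takes a prompt string, returns text.
ModelFn = Callable[[str], str]

def cumulative_prompt(question: str, stack: List[ModelFn]) -> str:
    """Run the sequential stack: each model sees the original question
    plus every previous model's output, fed forward cumulatively."""
    context = question
    for model in stack:
        answer = model(context)
        # Feed forward: the next model must account for this answer.
        context = f"{context}\n\n--- previous model output ---\n{answer}"
    return context

# Stub backends standing in for Grok, Gemini, GPT, and Claude.
grok = lambda p: "zeitgeist: PM-JAY enrolment debate trending"
gemini = lambda p: "archival: WHO/World Bank financing reports located"
gpt = lambda p: "structure: draft ICER table and policy logic tree"
claude = lambda p: "artifact: policy brief schema generated"

transcript = cumulative_prompt("UHC insurance & OOP spending India",
                               [grok, gemini, gpt, claude])
```

The expert human sits outside this loop: reading the transcript between steps and deciding when the stack has converged.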
app/schemas/world.py
from typing import List
from pydantic import BaseModel

class WorldInput(BaseModel):
    question: str        # "UHC insurance & OOP spending India"
    population: str      # "adults, BPL households"
    intervention: str    # "PM-JAY"
    comparator: str      # "no insurance"
    outcomes: List[str]  # ["ICER", "OOP reduction", "equity"]
    context: str         # "WHO India financing brief"

class WorldOutput(BaseModel):
    search_queries: List[str]
    grey_sources: List[str]   # HTAIn, NSSO, state NHAs
    inclusion_criteria: dict
    equity_flags: List[str]   # bias checks to carry forward
Layer II · Perception · What is true in these documents?

"Which real papers and reports actually contain the evidence your question needs?"

Primary · indexed literature
Dimensions.ai + Semantic Scholar
LMIC filters, grant/dataset cross-linking. Semantic Scholar TL;DRs pre-screen relevance. Together they surface what PubMed alone misses.
Primary · grey literature
Claude (PDF upload) + Humata
Unmatched on government PDFs: NSSO, HTAIn assessments, state NHAs, PM-JAY evaluation reports. Claude handles 200k-token documents; Humata for rapid single-doc Q&A.
Secondary · visual network
Connected Papers + Litmaps
Catch seminal works a keyword search misses. Essential for health financing: the citation graph reveals the 3 papers every other paper cites.
Secondary · citation quality
Scite.ai
Distinguishes supporting vs contrasting citations. Surfaces papers that refute PM-JAY findings — critical for equity briefing honesty.
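The citation-graph idea behind Connected Papers and Litmaps can be illustrated with a toy in-degree count (a hypothetical sketch, not either tool's API): the papers with the highest in-degree are the seminal works a keyword search misses.

```python
from collections import Counter

# Toy citation graph: paper -> papers it cites (hypothetical IDs).
cites = {
    "A": ["S1", "S2"],
    "B": ["S1", "S2", "S3"],
    "C": ["S1"],
    "D": ["S1"],
}

def most_cited(graph, k=2):
    """In-degree over the citation graph: the papers everything else cites."""
    counts = Counter(ref for refs in graph.values() for ref in refs)
    return [paper for paper, _ in counts.most_common(k)]

print(most_cited(cites))  # S1 is cited by all four papers
```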
app/services/pubmed.py + pdf.py
from Bio import Entrez
import fitz  # PyMuPDF

Entrez.email = "who-india@example.org"

def search_pubmed(query, n=10):
    handle = Entrez.esearch(db="pubmed", term=query, retmax=n)
    return Entrez.read(handle)["IdList"]

def extract_pdf(path):
    doc = fitz.open(path)
    return "\n".join(page.get_text() for page in doc)

# Grey sources: HTAIn, NSSO, state NHAs
GREY_URLS = [
    "https://htain.icmr.org.in/...",
    "https://mospi.gov.in/nsso...",
]
Layer III · Agentic · What can be extracted and structured?

"Which findings, numbers, and metrics survive a rigorous extraction pass?"

Primary · structured extraction
Elicit
Still the strongest single agent for health economics extraction. Pulls ICERs, equity metrics, population subgroups, cost-per-DALY tables across dozens of papers simultaneously. Outperforms GPT-4o on column consistency.
Secondary · complex documents
Claude Projects + Humata
For multi-PDF corpus extraction where Elicit doesn't ingest the document type. Claude Projects maintains extraction schema across the entire corpus context window.
PRISMA automation
Rayyan + Nested Knowledge
Rayyan for collaborative title/abstract screening with AI pre-label; Nested Knowledge for life-sciences PRISMA audit trails and meta-analysis setups.
Systematic review infra
DistillerSR
Enterprise-grade PRISMA compliance, HTA context. Use when the output must be defensible to a regulatory or HTA board (HTAIn submissions).
app/schemas/agentic.py
from enum import Enum
from typing import Optional
from pydantic import BaseModel

class PrismaStage(str, Enum):
    identified = "identified"
    screened = "screened"
    eligible = "eligible"
    included = "included"
    excluded = "excluded"

class Extraction(BaseModel):
    paper_id: str
    icer: Optional[float]        # USD per DALY
    daly: Optional[float]
    oop_change: Optional[float]  # % change in OOP
    population: str              # "rural BPL adults"
    equity_note: Optional[str]   # "urban bias detected"
    prisma_stage: PrismaStage    # identified | screened | ... | excluded
    exclusion_reason: Optional[str]
Identified
Screened
Eligible
Included
Excluded
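Tallying extraction records by stage gives the counts that populate the PRISMA flow above. A minimal sketch, using a stdlib dataclass stand-in for the pydantic `Extraction` schema:

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Stdlib stand-in for the pydantic Extraction schema (illustrative).
@dataclass
class Extraction:
    paper_id: str
    prisma_stage: str            # identified | screened | eligible | included | excluded
    exclusion_reason: Optional[str] = None

def prisma_counts(records):
    """Tally records per PRISMA stage: the numbers for the flow diagram."""
    return Counter(r.prisma_stage for r in records)

records = [
    Extraction("p1", "included"),
    Extraction("p2", "excluded", "no ICER reported"),
    Extraction("p3", "included"),
    Extraction("p4", "screened"),
]
print(prisma_counts(records))
```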
Layer IV · Generative · What does it mean?

"How do structured findings become a decision-ready narrative?"

Primary · synthesis + brief
Claude Opus 4 / Sonnet 4.6
Unmatched long-context synthesis. Maintains argument coherence across 50+ extracted rows. Calibrates tone for MoHFW Joint Secretary audience. Produces assumption statements, sensitivity narratives, and executive summaries in one pass.
Secondary · multi-doc corpus
NotebookLM (Google)
Excels when source PDFs must remain cited and attributable in the brief. Audio overview feature useful for rapid team orientation on a new corpus.
Summary layer · plain language
SciSpace + Scholarcy
For explaining complex clinical methods to non-specialist policy audiences. SciSpace annotates PDFs in real time; Scholarcy generates flashcard-style triage.
Medical synthesis
OpenEvidence + Evidence Hunt
For clinical evidence threads embedded in health financing questions. OpenEvidence is fastest for evidence-based Q&A from clinical literature.
app/schemas/generative.py
from typing import List, Optional
from pydantic import BaseModel

class Synthesis(BaseModel):
    summary: str                # 2–3 sentence executive lead
    key_findings: List[str]     # ordered by policy weight
    uncertainties: List[str]    # "robust if X, fails if Y"
    equity_summary: str         # who benefits, who is excluded
    budget_note: Optional[str]  # fiscal space context
    brief_draft: str            # the policy brief itself
Layer V · Embodied · Should we act on it?

"Does this evidence actually warrant the decision the brief recommends — given this specific context?"

Problem framing → Search → Screen → Extract → Synthesise → Judgment
Primary audit stack · 2026
Human + Claude Projects / Grok-3 / GPT-4o
The "Embodied" layer is not a tool. It is the WHO India health economist applying four checks that no current LLM passes reliably: equity auditing (who is excluded), transferability (Tamil Nadu evidence does not automatically hold in Bihar), budget reality (fiscal space), and political feasibility (what MoHFW will actually act on).
Check 1 · Equity
Who is excluded from the evidence base? Are the included populations representative of the BPL households PM-JAY targets, or do they proxy urban, insured, or literate populations?
Check 2 · Transferability
A finding from Kerala does not transfer to Bihar without a transferability statement. What infrastructure, literacy, and provider density assumptions does the evidence embed?
Check 3 · Budget reality
Is the recommended intervention within fiscal space? An ICER below the threshold is irrelevant if the Ministry cannot allocate the implementation budget in the current cycle.
Check 4 · Strategic distortion
Who commissioned this evidence? Who benefits from the recommendation? Embodied judgment is the layer that reads the political economy of the evidence — the one place AI cannot yet go.
app/schemas/embodied.py
from typing import List, Optional
from pydantic import BaseModel

class Decision(BaseModel):
    decision: str                 # adopt | reject | conditional
    modifications: List[str]      # required caveats
    equity_flags: List[str]       # e.g. "urban bias"
    transferable: bool            # to target state/district
    political_flag: bool          # MoHFW action feasibility
    rerun_trigger: Optional[str]  # → refeed to World layer
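The `rerun_trigger` field is what closes the pentadic loop: a flagged decision re-enters the World layer instead of finalising the brief. A minimal sketch of that dispatch, using a stdlib dataclass stand-in for the pydantic `Decision` schema (the `next_action` helper is a hypothetical name):

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Stdlib stand-in for the pydantic Decision schema (illustrative).
@dataclass
class Decision:
    decision: str                      # adopt | reject | conditional
    equity_flags: List[str] = field(default_factory=list)
    rerun_trigger: Optional[str] = None

def next_action(d: Decision) -> str:
    """Close the loop: a flagged decision re-enters the World layer,
    with the trigger carried forward into the new question framing."""
    if d.rerun_trigger:
        return f"rerun world layer: {d.rerun_trigger}"
    return f"finalise brief: {d.decision}"

flagged = Decision("conditional", ["urban bias"],
                   "add rural BPL subgroup search")
print(next_action(flagged))  # → rerun world layer: add rural BPL subgroup search
```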
W → P → A → G → E → W

world · perception · agentic · generative · embodied → rerun if flagged