Perception AI · Ukubona Evidence Layer

V4 Phase II · Matrix L₀+Σwᵢ·Lᵢ via PDF upload · web scraping · API integrations

The sensory apparatus of the pipeline — ingesting literature at scale across journals, languages, and grey sources, transforming scattered documents into a structured, queryable corpus.

What it is

Targeted ingestion of specific literature

Perception AI handles the acquisition and structuring of documents relevant to a specific research question. Unlike World AI, which understands everything broadly, Perception AI focuses on bringing specific literature into the pipeline — via PDF upload, web scraping, API integrations with journal databases, and connections to grey literature repositories.

The Matrix algebra (L₀+Σwᵢ·Lᵢ) reflects what this layer does: it starts from a base corpus (L₀) and adds each literature source (Lᵢ) scaled by its relevance weight (wᵢ), constructing the weighted document set that all subsequent phases operate on.
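One way to read L₀+Σwᵢ·Lᵢ concretely is as a per-document score. The sketch below is an illustration under stated assumptions, not the pipeline's actual implementation: each corpus is assumed to be a mapping from a document ID to a score, and the function name `combine_sources`, the document IDs, and the weights are all hypothetical.

```python
from collections import defaultdict


def combine_sources(base, sources, weights):
    """Return score(d) = L0[d] + sum_i w_i * L_i[d] for every document d.

    Assumption: a document missing from a corpus contributes 0 for that corpus.
    """
    if len(sources) != len(weights):
        raise ValueError("each source needs exactly one weight")
    combined = defaultdict(float)
    # L0: the base corpus enters unweighted.
    for doc_id, score in base.items():
        combined[doc_id] += score
    # Each source L_i is scaled by its relevance weight w_i.
    for weight, source in zip(weights, sources):
        for doc_id, score in source.items():
            combined[doc_id] += weight * score
    return dict(combined)


# Hypothetical corpora: 1.0 simply marks "present in this source".
base = {"doc-a": 1.0, "doc-b": 1.0}
pubmed = {"doc-a": 1.0, "doc-c": 1.0}
who_iris = {"doc-c": 1.0}

scores = combine_sources(base, [pubmed, who_iris], [0.5, 0.25])
```

Under this reading, a document found only in low-weight sources still enters the set, but with a lower score than one already in the base corpus.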

The algebra made human · reference

The bias term b (L₀ in the notation above) is not a nuisance parameter: it is the institutional prior your director brings before screening begins. The weights wᵢ are the relevance judgments that determine which sources count. "One neuron, five layers" unpacks exactly this, without assuming any machine learning background.

Sources it can reach

Journals, languages, and grey sources — all three matter

PubMed / MEDLINE: NCBI API integration; MeSH-structured retrieval

Embase: Drug and clinical trial indexing

EconLit: Health economics, cost-effectiveness literature

WHO IRIS: WHO grey literature, policy documents

NHSRC India: National health systems reports

Cochrane: Systematic review database

Non-English sources: Hindi, French, Portuguese repositories

PDF uploads: Direct document ingestion from local files
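
For the PubMed / MEDLINE entry, MeSH-structured retrieval through the NCBI API could look roughly like the sketch below. It only builds a request URL for the public NCBI E-utilities `esearch` endpoint and does not send it. The helper name `build_esearch_url` and the example MeSH terms are assumptions for illustration; how Perception AI actually queries NCBI is not described here.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(mesh_terms, retmax=100):
    """Build a PubMed search URL that ANDs together the given MeSH terms."""
    # The [MeSH Terms] field tag restricts each term to MeSH indexing.
    term = " AND ".join(f'"{t}"[MeSH Terms]' for t in mesh_terms)
    params = {
        "db": "pubmed",      # search the PubMed database
        "term": term,        # the query string
        "retmax": retmax,    # maximum number of IDs to return
        "retmode": "json",   # ask for a JSON response
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Illustrative terms only; a real search strategy comes from Phase I.
url = build_esearch_url(["Health Expenditures", "India"], retmax=50)
```

Fetching the URL returns a list of PubMed IDs, which a separate request would then resolve to titles and abstracts.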

Role in the pipeline

Phase II — where the literature becomes data

At Phase II (Matrix), Perception AI performs Title/Abstract Screening — working through the retrieved corpus to identify documents that survive the PICOS filter. This phase accounts for a significant share of the 60–70% of total review time that falls in the red steps.

The transformation here is from unstructured documents to a weighted relevance matrix — a set of documents with confidence scores, ready for Agentic AI to extract data from in Phase III.

WHO India · Practical example

Reviewing PM-JAY's impact on catastrophic health expenditure: Perception AI ingests 2,400 retrieved records from four databases, screens titles and abstracts against inclusion criteria (India-specific, post-2018, quantitative CHE outcomes), and reduces the corpus to 140 full-text candidates — in the time a human team would process roughly 300 abstracts.
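
The screening step in this example can be sketched as a set of per-record checks against the three inclusion criteria. This is a deliberately crude keyword version, assuming "post-2018" means a publication year of 2018 or later and that a mention of catastrophic health expenditure stands in for a quantitative CHE outcome. The `Record` type and `screen` function are hypothetical, and the actual screening model is not specified here.

```python
from dataclasses import dataclass


@dataclass
class Record:
    title: str
    abstract: str
    year: int


def screen(record):
    """Return (included, checks) for one title/abstract record."""
    text = f"{record.title} {record.abstract}".lower()
    checks = {
        # Assumption: a mention of India is a proxy for "India-specific".
        "india_specific": "india" in text,
        # Assumption: "post-2018" is read as year >= 2018.
        "post_2018": record.year >= 2018,
        # Assumption: keyword match as a proxy for quantitative CHE outcomes.
        "che_outcome": "catastrophic health expenditure" in text,
    }
    return all(checks.values()), checks


# Made-up records for illustration only.
records = [
    Record("PM-JAY and household spending in India",
           "We estimate catastrophic health expenditure after enrolment.", 2021),
    Record("Insurance coverage in Ghana",
           "Effects on catastrophic health expenditure.", 2020),
    Record("Health financing in India",
           "Catastrophic health expenditure before national insurance.", 2015),
]

included = [r for r in records if screen(r)[0]]
```

Keeping the individual checks alongside the decision makes each exclusion traceable to a criterion. For scale, the reduction described above, from 2,400 records to 140 full-text candidates, retains about 5.8% of the corpus.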

Key capability

Language is not a barrier — it is a filter

One of Perception AI's most underused capabilities in global health contexts is multilingual ingestion. Evidence on community health worker programmes in South Asia exists in Hindi, Tamil, and Bengali publications and in regional government reports that never enter English-language indices. Perception AI can ingest, translate, and structure this material alongside English sources.

For WHO India evidence briefs, this means moving closer to complete evidence maps — not just what is indexed in the West, but what is documented in the system itself.

Limitations

Garbage in — the quality of ingestion determines everything downstream

Perception AI does not evaluate quality — it ingests and screens by surface features (title, abstract, metadata). A poorly constructed search strategy at Phase I produces a biased corpus that Perception AI will faithfully structure. The garbage-in problem is real: ingestion fidelity is necessary but not sufficient for a valid review.
