Perception AI

V4 Phase II · Matrix L₀+Σwᵢ·Lᵢ via PDF upload · web scraping · API integrations

The sensory apparatus of the pipeline — ingesting literature at scale across journals, languages, and grey sources, transforming scattered documents into a structured, queryable corpus.

Targeted ingestion of specific literature

Perception AI handles the acquisition and structuring of documents relevant to a specific research question. Unlike World AI, which understands everything broadly, Perception AI focuses on bringing specific literature into the pipeline — via PDF upload, web scraping, API integrations with journal databases, and connections to grey literature repositories.

The Matrix algebra (L₀+Σwᵢ·Lᵢ) reflects what this layer does: it starts from a base corpus (L₀) and adds each literature source (Lᵢ) scaled by its relevance weight (wᵢ), constructing the weighted document set that all subsequent phases operate on.
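
One way to make the expression concrete is to read it as a per-document score. The sketch below is an illustrative assumption, not the pipeline's actual implementation: it treats L₀ and each Lᵢ as sets of document identifiers, gives every base-corpus document a score of 1.0, and adds wᵢ for each source that contains the document. The function name, source names, and weights are hypothetical.

```python
from collections import defaultdict


def weighted_document_set(base, sources, weights):
    """One possible reading of L0 + sum(w_i * L_i) as per-document scores.

    base    : set of document ids in the base corpus L0 (each contributes 1.0)
    sources : dict mapping source name -> set of document ids (the L_i)
    weights : dict mapping source name -> relevance weight w_i
    """
    scores = defaultdict(float)
    for doc_id in base:
        scores[doc_id] += 1.0
    for name, docs in sources.items():
        w = weights[name]
        for doc_id in docs:
            scores[doc_id] += w
    return dict(scores)


# Hypothetical inputs, chosen only to show the arithmetic.
base = {"doc-a", "doc-b"}
sources = {"pubmed": {"doc-b", "doc-c"}, "who_iris": {"doc-c", "doc-d"}}
weights = {"pubmed": 0.5, "who_iris": 0.25}

result = weighted_document_set(base, sources, weights)
# doc-b sits in the base corpus and in one source, so it scores 1.0 + 0.5.
```

Under this reading, a document found only in a low-weight source still enters the set, but with a small score that later phases can threshold on.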

The algebra made human · reference

The bias term b (the role played by L₀ in the expression above) is not a nuisance parameter: it is the institutional prior your director brings before screening begins. The weights wᵢ are the relevance judgments that determine which sources count. The reference "One neuron, five layers" unpacks exactly this, without assuming any machine learning background.

Journals, languages, and grey sources — all three matter

PubMed / MEDLINE: NCBI API integration; MeSH-structured retrieval
Embase: Drug and clinical trial indexing
EconLit: Health economics, cost-effectiveness literature
WHO IRIS: WHO grey literature, policy documents
NHSRC India: National health systems reports
Cochrane: Systematic review database
Non-English sources: Hindi, French, Portuguese repositories
PDF uploads: Direct document ingestion from local files
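
For the PubMed entry, MeSH-structured retrieval through the NCBI API can be sketched as a query against the public E-utilities ESearch endpoint. This is a minimal sketch under stated assumptions: the helper name `build_pubmed_query` and the example MeSH headings are illustrative, and the block only builds the request URL, leaving the HTTP call, API keys, and rate limiting out.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(mesh_terms, retmax=100):
    """Build an ESearch URL that ANDs together the given MeSH headings."""
    # The [MeSH Terms] field tag restricts each phrase to MeSH indexing.
    term = " AND ".join(f'"{t}"[MeSH Terms]' for t in mesh_terms)
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Illustrative headings only; a real search strategy comes from Phase I.
url = build_pubmed_query(["Health Expenditures", "India"], retmax=20)
print(url)
```

Fetching that URL returns a JSON document whose ID list can then be passed to a follow-up fetch step; how Perception AI does this internally is not specified here.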

Phase II — where the literature becomes data

At Phase II (Matrix), Perception AI performs Title/Abstract Screening — working through the retrieved corpus to identify documents that survive the PICOS filter. This phase accounts for a significant share of the 60–70% of total review time that falls in the red steps.

The transformation here is from unstructured documents to a weighted relevance matrix — a set of documents with confidence scores, ready for Agentic AI to extract data from in Phase III.
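
A weighted relevance matrix of this kind can be pictured as one row per document and one column per PICOS element. The sketch below assumes the per-element scores already exist (how they are produced is out of scope), averages them into a single confidence value, and keeps records above a hypothetical threshold of 0.5. The class, field names, and threshold are illustrative assumptions.

```python
from dataclasses import dataclass

# Column order of the relevance matrix: one score per PICOS element.
PICOS = ("population", "intervention", "comparator", "outcome", "study_design")


@dataclass
class ScreenedRecord:
    doc_id: str
    scores: dict  # PICOS element -> score in [0, 1]

    @property
    def confidence(self):
        # Unweighted mean across the five PICOS columns (an assumption).
        return sum(self.scores[k] for k in PICOS) / len(PICOS)


def screen(records, threshold=0.5):
    """Keep records at or above the threshold, highest confidence first."""
    kept = [r for r in records if r.confidence >= threshold]
    return sorted(kept, key=lambda r: r.confidence, reverse=True)


# Hypothetical rows of the matrix.
records = [
    ScreenedRecord("rec-1", dict(zip(PICOS, (1.0, 1.0, 0.5, 1.0, 0.5)))),
    ScreenedRecord("rec-2", dict(zip(PICOS, (0.0, 0.5, 0.0, 0.5, 0.0)))),
    ScreenedRecord("rec-3", dict(zip(PICOS, (1.0, 0.5, 0.5, 0.5, 0.5)))),
]

passed = screen(records)
for r in passed:
    print(r.doc_id, round(r.confidence, 2))
```

A mean is only one aggregation choice; a stricter screen could take the minimum across columns so that a single failed PICOS element excludes the record.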

WHO India · Practical example

Reviewing PM-JAY's impact on catastrophic health expenditure: Perception AI ingests 2,400 retrieved records from four databases, screens titles and abstracts against inclusion criteria (India-specific, post-2018, quantitative CHE outcomes), and reduces the corpus to 140 full-text candidates — in the time a human team would process roughly 300 abstracts.
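
The figures in this example imply a screening funnel that is easy to check by hand. The short calculation below uses only the numbers quoted above; the variable names are illustrative.

```python
retrieved = 2400                  # records ingested from four databases
full_text_candidates = 140        # records surviving title/abstract screening
human_abstracts_same_time = 300   # abstracts a human team covers in that time

# Share of the retrieved corpus that goes forward to full-text review.
retention_rate = full_text_candidates / retrieved

# Records excluded at the title/abstract stage.
exclusion_count = retrieved - full_text_candidates

# Records screened by Perception AI per abstract a human team would process.
throughput_ratio = retrieved / human_abstracts_same_time

print(f"{retention_rate:.1%} retained, {exclusion_count} excluded, "
      f"{throughput_ratio:.0f}x throughput")
```

In other words, roughly 6% of records survive screening, and the stated comparison corresponds to about an eightfold difference in abstracts processed per unit of time.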

Language is not a barrier — it is a filter

One of Perception AI's most underused capabilities in global health contexts is multilingual ingestion. Evidence on community health worker programmes in South Asia exists in Hindi, Tamil, Bengali, and regional government reports that never enter English-language indices. Perception AI can ingest, translate, and structure this material alongside English sources.
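
A first step in multilingual ingestion is deciding which translation path a record should take. The sketch below is a deliberately simple assumption, not a description of Perception AI's method: it guesses the dominant writing system of a text from Unicode character names, using only the Python standard library. Script is not the same as language (Devanagari covers Hindi, Marathi, and others), so a real pipeline would need proper language identification after this step.

```python
import unicodedata
from collections import Counter

# Scripts this sketch distinguishes; anything else is ignored.
SCRIPTS = ("DEVANAGARI", "TAMIL", "BENGALI", "LATIN")


def dominant_script(text):
    """Return the most frequent script among the letters in text."""
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation, spaces, combining marks
        name = unicodedata.name(ch, "")
        for script in SCRIPTS:
            if name.startswith(script):
                counts[script] += 1
                break
    return counts.most_common(1)[0][0] if counts else "UNKNOWN"


# Sample strings in four scripts.
samples = ["स्वास्थ्य", "சுகாதாரம்", "স্বাস্থ্য", "health"]
for s in samples:
    print(s, dominant_script(s))
```

Records tagged with a non-Latin script could then be routed to translation before title/abstract screening, so that they are scored against the same inclusion criteria as English sources.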

For WHO India evidence briefs, this means moving closer to complete evidence maps — not just what is indexed in the West, but what is documented in the system itself.

Garbage in — the quality of ingestion determines everything downstream

Perception AI does not evaluate quality — it ingests and screens by surface features (title, abstract, metadata). A poorly constructed search strategy at Phase I produces a biased corpus that Perception AI will faithfully structure. The garbage-in problem is real: ingestion fidelity is necessary but not sufficient for a valid review.