Level 2 Β· Exploration
When Numbers Can Be Pooled
When is statistical pooling of health economics evidence appropriate β and when does a pooled estimate create false precision that misleads policy?
A pooled estimate from a meta-analysis has the appearance of authority: a single number, a confidence interval, a forest plot. It feels more definitive than five separate study findings pointing in different directions. This authority is often warranted. Sometimes it is not β and the damage from pooling studies that should not be pooled is worse than the messiness of acknowledging heterogeneity, because a false pooled estimate enters a brief with a precision it has not earned.
The classic objection to inappropriate pooling is the fruit salad problem: combining apples and oranges produces neither a better apple nor a better orange. It produces a meaningless average that describes nothing that actually exists. For health economics evidence on financial risk protection in India, the fruit salad risk is acute: a meta-analysis pooling community-based health insurance studies from Ghana, Rwanda, Kenya, and two Indian states is not evidence about India β it is evidence about the average of four different health systems that share almost nothing except the label "LMIC."
Before any set of studies can legitimately be pooled, they must pass three gates. Failing any one of them means pooling should not proceed β the appropriate output is a narrative synthesis with explicit statement of why statistical combination is not justified.
When you ask an AI tool to summarise a meta-analysis, it will typically report the pooled estimate and confidence interval β the bottom diamond on the forest plot. It will usually not flag a high IΒ², will not note whether individual study estimates cross the line of no effect while the pooled estimate does not, and will not raise the question of whether the studies should have been pooled at all. These are your jobs.
A finding of high heterogeneity or incompatible study designs is not a dead end β it is a result. The appropriate response is a narrative synthesis that does three things the pooled estimate cannot:
| Situation | Appropriate synthesis | What to report |
|---|---|---|
| Studies comparable, IΒ² < 50% | Pool Fixed or random effects meta-analysis | Pooled estimate, CI, IΒ², sensitivity analysis removing outlier studies |
| Studies comparable, 50% β€ IΒ² β€ 75% | Caution Pool with subgroup analysis | Pooled estimate with explicit IΒ² caveat; subgroup results by region or intervention type; investigation of heterogeneity sources |
| High heterogeneity IΒ² > 75% | Don't pool Narrative synthesis | Direction and range of effects across studies; explicit statement of why pooling is not appropriate; subgroup by India/South Asia if data allow |
| Methodologically incompatible outcomes | Don't pool Narrative synthesis | Tabular presentation of each study's effect estimate with outcome definition; note incompatibility explicitly in brief methodology note |
| India estimate crosses null, LMIC pool significant | Disaggregate Report India separately | "The pooled LMIC estimate suggests benefit; the single available India study (PM-JAY) shows no significant effect on financial protection [CI crosses null]. India-specific evidence is insufficient to draw a conclusion." This is the most honest and policy-useful summary. |
Describe the studies you are considering pooling β their countries, intervention types, outcome definitions, and any IΒ² or heterogeneity statistics you have. The tool will assess whether pooling is legitimate and recommend the appropriate synthesis approach.
A pooled estimate is not automatically more reliable than individual study findings β it is only more reliable if the studies were sufficiently similar to pool. The three gates (clinical, methodological, and statistical homogeneity) are the tests. An IΒ² above 75% is the evidence that the studies are telling different stories, not the same one. For WHO India work, the most common and most consequential error is citing a pooled LMIC estimate as India evidence when the India-specific study in the pool shows no significant effect. Report India separately. The heterogeneity is the finding. Session 5 closes Level 2 with surveillance β how to keep your evidence base current once you have built it.