# Correlation vs Causality

Decision-grade thinking: correlation predicts, causality intervenes.

Correlation answers “what moves together?” Causality answers “what changes what, under intervention?” If your system will change the world, you need causal structure.
## The operational difference

- **Correlation** is often enough for prediction when environments are stable and you don’t change the system.
- **Causality** is required for decisions when you will change the system (pricing, policy, treatment, automation).
## Two counterfactual statements

- If we do \(X=x_1\) instead of \(X=x_0\), the outcome \(Y\) would change (sketched in code below).
- If we remove a confounder \(C\), the observed relationship between \(X\) and \(Y\) may weaken or disappear.
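
A minimal sketch of the first statement, assuming a made-up linear structural model with a confounder; the coefficients, noise terms, and the `simulate` helper below are illustrative, not part of the methodology:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, do_x=None):
    """Toy structural model C -> X, C -> Y, X -> Y (illustrative coefficients).

    If do_x is given, X is set by intervention and its usual dependence
    on the confounder C is cut.
    """
    c = rng.normal(size=n)                      # confounder
    if do_x is None:
        x = 0.8 * c + rng.normal(size=n)        # observational X
    else:
        x = np.full(n, float(do_x))             # do(X = do_x)
    y = 1.5 * x + 2.0 * c + rng.normal(size=n)  # outcome
    return x, y

# Statement 1: setting X to different values changes the outcome distribution.
_, y0 = simulate(100_000, do_x=0.0)
_, y1 = simulate(100_000, do_x=1.0)
print(f"E[Y | do(X=0)] ≈ {y0.mean():.2f}")  # ≈ 0.0
print(f"E[Y | do(X=1)] ≈ {y1.mean():.2f}")  # ≈ 1.5, the shift caused by the intervention
```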
## Diagram: confounding vs causal effect

```mermaid
graph LR;
  C["Confounder C"] --> X["X"];
  C --> Y["Y"];
  X --> Y;
```
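
Simulating this diagram with the same illustrative coefficients shows the second statement above: the naive regression of Y on X picks up the backdoor path through C, while adjusting for C (the practical stand-in for removing its influence) recovers the direct effect. This is a sketch, not an identification recipe:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

c = rng.normal(size=n)                      # confounder C
x = 0.8 * c + rng.normal(size=n)            # C --> X
y = 1.5 * x + 2.0 * c + rng.normal(size=n)  # C --> Y and X --> Y

# Naive slope of Y on X: picks up the backdoor path X <-- C --> Y.
naive_slope = np.polyfit(x, y, 1)[0]

# Regressing Y on X and C together isolates the direct X --> Y coefficient.
design = np.column_stack([x, c, np.ones(n)])
adjusted_slope = np.linalg.lstsq(design, y, rcond=None)[0][0]

print(f"naive slope:    {naive_slope:.2f}")     # ≈ 2.5, association only
print(f"adjusted slope: {adjusted_slope:.2f}")  # ≈ 1.5, the coefficient on X in the model
```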
## Diagram: interventions change the object

```mermaid
flowchart LR;
  Obs["Observational data learns P(Y|X)"] --> Pred["Good prediction (sometimes)"];
  Int["Intervention needs P(Y|do(X))"] --> Dec["Good decisions"];
  Obs -. "not equal" .-> Int;
```
## Common failure mode

A predictive model learns \(P(Y|X)\). When you intervene on \(X\), you need \(P(Y|do(X))\). Those are not the same object.
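
A small numeric sketch of that gap, using made-up probabilities for a binary version of the confounded graph above: conditioning on \(X=1\) also shifts what you know about \(C\), while intervening leaves \(C\) at its marginal (the backdoor adjustment formula is used here without derivation).

```python
# Binary version of the graph above; all probabilities are made up for illustration.
p_c1 = 0.5                                   # P(C = 1)
p_x1_given_c = {0: 0.2, 1: 0.8}              # P(X = 1 | C = c)
p_y1_given_xc = {(0, 0): 0.1, (0, 1): 0.5,   # P(Y = 1 | X = x, C = c)
                 (1, 0): 0.3, (1, 1): 0.7}

def p_c(c):
    return p_c1 if c == 1 else 1 - p_c1

# Observational: P(Y=1 | X=1) reweights C by Bayes because X=1 is evidence about C.
p_x1 = sum(p_x1_given_c[c] * p_c(c) for c in (0, 1))
p_y1_obs = sum(p_y1_given_xc[(1, c)] * p_x1_given_c[c] * p_c(c) for c in (0, 1)) / p_x1

# Interventional: P(Y=1 | do(X=1)) keeps C at its marginal (backdoor adjustment).
p_y1_do = sum(p_y1_given_xc[(1, c)] * p_c(c) for c in (0, 1))

print(f"P(Y=1 | X=1)     = {p_y1_obs:.2f}")  # 0.62
print(f"P(Y=1 | do(X=1)) = {p_y1_do:.2f}")   # 0.50
```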
## Common traps (and what to do instead)

- **Confounding**: a third variable drives both X and Y. Fix: model confounders explicitly or design an identification strategy.
- **Selection bias**: your data is a filtered subset of reality. Fix: track selection mechanisms and test robustness.
- **Distribution shift**: the world changes after deployment. Fix: monitor drift and revalidate assumptions continuously (a minimal drift check is sketched after this list).
- **Policy feedback**: interventions change incentives and behavior. Fix: explicitly model feedback loops and second-order effects.
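
For the distribution-shift trap, a minimal monitoring sketch. It assumes SciPy is available and that you keep a reference window of feature values from training time; the feature names, window sizes, and threshold are illustrative choices, not a prescription:

```python
import numpy as np
from scipy.stats import ks_2samp

def drift_alerts(reference, live, alpha=0.01):
    """Flag features whose live distribution no longer matches the reference window.

    reference, live: dicts mapping feature name -> 1-D array of values.
    alpha: significance threshold for the two-sample Kolmogorov-Smirnov test.
    """
    alerts = {}
    for name, ref_values in reference.items():
        result = ks_2samp(ref_values, live[name])
        if result.pvalue < alpha:
            alerts[name] = float(result.pvalue)
    return alerts

# Hypothetical check: "price" drifted upward after deployment, "age" did not.
rng = np.random.default_rng(2)
reference = {"price": rng.normal(10, 2, 5_000), "age": rng.normal(40, 12, 5_000)}
live = {"price": rng.normal(12, 2, 5_000), "age": rng.normal(40, 12, 5_000)}
print(drift_alerts(reference, live))  # expect an alert for "price" only
```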
## Where this connects in our stack

- Philosophy: don’t confuse predictive accuracy with reliable intervention.
- Methodology: encode causal structure in memory (graphs), then constrain what paths are allowed (a minimal sketch follows).
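
A minimal sketch of the methodology point, assuming networkx and a hypothetical three-variable memory: edges carry a typed relation, and queries may only walk edges explicitly marked as causal, so merely correlational links are never treated as effects.

```python
import networkx as nx

# Hypothetical causal memory: nodes are variables, edges carry a typed relation.
g = nx.DiGraph()
g.add_edge("ad_spend", "traffic", relation="causes")
g.add_edge("traffic", "sales", relation="causes")
g.add_edge("ad_spend", "sales", relation="correlates_with")  # observed association only
g.add_edge("seasonality", "ad_spend", relation="causes")     # confounder behind it
g.add_edge("seasonality", "sales", relation="causes")

def causal_paths(graph, source, target):
    """Traverse only edges explicitly marked as causal; ignore correlational ones."""
    allowed = nx.DiGraph(
        [(u, v) for u, v, d in graph.edges(data=True) if d.get("relation") == "causes"]
    )
    return list(nx.all_simple_paths(allowed, source, target))

print(list(nx.all_simple_paths(g, "ad_spend", "sales")))
# includes the purely correlational shortcut ad_spend -> sales
print(causal_paths(g, "ad_spend", "sales"))
# [['ad_spend', 'traffic', 'sales']] -- the only path a causal query is allowed to use
```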
Next: CausalGraphRAG.