
Factor investing promised to bring scientific precision to markets by explaining why some stocks outperform. But after years of disappointing results, researchers are realizing that the problem may not lie in the data at all; it lies in the way the models are built. A new study suggests that many factor models confuse correlation with causation, creating a "factor mirage."
Factor investing grew from an elegant idea: markets reward exposure to certain non-diversifiable risks (value, momentum, quality, size) that explain why some assets outperform others. Since then, trillions of dollars have been allocated to products based on this premise.
The data tell a sobering story. The Bloomberg-Goldman Sachs US Equity Multi-Factor Index, which tracks the long-short performance of classic style premia, has delivered a Sharpe ratio of just 0.17 since 2007 (t-stat=0.69, p-value=0.25), statistically indistinguishable from zero before costs. In plain language: factor investing has not added value for investors. For fund managers who have built products on these models, this shortfall translates into years of underperformance and lost trust.
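The reported t-statistic can be roughly reproduced with a back-of-the-envelope calculation. The sketch below assumes independent, identically distributed returns, in which case the t-statistic of an annualized Sharpe ratio is approximately the ratio multiplied by the square root of the number of years. The sample length of about 16.5 years is an assumption (the text only says "since 2007"), and the function names are illustrative.

```python
import math

def sharpe_t_stat(annual_sharpe: float, years: float) -> float:
    """Approximate t-statistic of an annualized Sharpe ratio under i.i.d. returns."""
    return annual_sharpe * math.sqrt(years)

def one_sided_p_value(t_stat: float) -> float:
    """One-sided p-value under a standard normal approximation."""
    return 0.5 * (1.0 - math.erf(t_stat / math.sqrt(2.0)))

# Assumption: roughly 16.5 years of data since 2007 (not stated in the text).
t_stat = sharpe_t_stat(0.17, 16.5)
p_value = one_sided_p_value(t_stat)
print(f"t-stat = {t_stat:.2f}, one-sided p-value = {p_value:.3f}")
```

Under these assumptions the result lands close to the reported figures, which is what "statistically indistinguishable from zero" means here: a Sharpe ratio this small would arise by chance about one time in four even if the true premium were zero.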
Why the backtests are misleading
Most factor models are developed using an econometric canon (linear regressions, significance tests, two-pass estimators) that conflates association with causation. Econometrics textbooks teach students that regressions should include every variable associated with returns, regardless of the role the variable plays in the causal mechanism.
This is a methodological error. Including a collider (a variable affected by both the factor and returns) and/or excluding a confounder (a variable that affects both the factor and returns) biases the estimates of the coefficients.
This bias can reverse the sign of a factor's coefficient. Investors then buy securities that they should have sold, and vice versa. Even if all risk premiums are stable and estimated correctly, a misspecified model can produce systematic losses.
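A small simulation makes the sign reversal concrete. This is a minimal sketch on synthetic data, not the study's method: the data-generating process, the coefficient values, and the helper `ols` are all illustrative assumptions. The true effect of the factor on returns is set to +0.1, and the collider is caused by both the factor and returns.

```python
import numpy as np

def ols(y, *regressors):
    """Return OLS coefficients (intercept first) and R-squared."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - resid.var() / y.var()
    return beta, r2

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process (all numbers are illustrative).
factor = rng.normal(size=n)                       # factor exposure
returns = 0.1 * factor + rng.normal(size=n)       # true effect is +0.1
collider = factor + returns + rng.normal(size=n)  # caused by both

beta_ok, r2_ok = ols(returns, factor)             # correctly specified
beta_bad, r2_bad = ols(returns, factor, collider) # collider included

print(f"correct model : factor coef = {beta_ok[1]:+.2f}, R2 = {r2_ok:.2f}")
print(f"with collider : factor coef = {beta_bad[1]:+.2f}, R2 = {r2_bad:.2f}")
```

In this setup the correctly specified regression recovers a coefficient near +0.1 with a very low R², while the regression that includes the collider fits far better yet assigns the factor a negative coefficient (about -0.45 for these parameter choices). A researcher who selects models on fit would pick the wrong one and trade in the wrong direction.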
The factor mirage
The "factor zoo" is a well-known phenomenon: hundreds of published anomalies that fail out of sample. ADIA Lab researchers point to a more subtle and dangerous problem: the "factor mirage." It arises not from data mining, but from models that are misspecified despite being developed according to the econometric canon taught in textbooks.
Models with colliders are of particular concern because they often fit the data better than correctly specified models. The econometric canon favors such misspecified models, mistaking better fit for correctness.
In a factor model with a collider, the value of the return is set before the value of the collider. As a result, the stronger association created by the collider cannot be monetized. The profits promised in these academic papers are a mirage. In practice, this methodological error has consequences worth billions of dollars.
For example, consider two researchers estimating a quality factor. One researcher controls for profitability, leverage, and size; the other adds return on equity, a variable influenced by both profitability (the factor) and stock performance (the outcome).
By including a collider, the second researcher creates a spurious association: high quality now correlates with high past returns. In the backtest, the second model appears superior. In live trading the tables turn: the backtest is a statistical illusion that quietly drains capital. For individual managers, these mistakes can silently erode returns; for markets as a whole, they distort capital allocation and create inefficiencies on a global scale.
When misspecification becomes a systemic risk
Model misspecification has several consequences.
- Capital misallocation: Trillions of dollars are driven by models that confuse association with causation, a statistical error with enormous financial consequences.
- Hidden correlation: Portfolios built on similar, misspecified factors share common exposures, increasing systemic fragility.
- Loss of trust: Every backtest that fails in live trading undermines investor confidence in quantitative methods as a whole.
The latest work from ADIA Lab goes even further: it shows that without causal factor models, no portfolio can be efficient. If the underlying factors are misspecified, even perfect estimates of means and covariances will lead to suboptimal portfolios. This implies that investing is not merely a prediction problem and that adding complexity does not improve the model.
What can investors do differently?
The predicament of factor investing cannot be solved with more data or more complex methods. What is needed most is causal reasoning. Causal inference offers practical steps that every allocator can apply now:
- Demand a causal justification. Before accepting a model, ask: Have the authors explained the causal mechanism? Does the causal graph agree with our understanding of the world? Does the causal graph agree with the empirical evidence? Are the chosen controls sufficient to eliminate confounder bias?
- Identify confounders and avoid colliders. Confounders should be controlled for; colliders should not. Without a causal diagram, researchers cannot tell the difference. Causal discovery tools can help narrow the set of causal diagrams that are consistent with the data.
- Do not be misled by statistical strength. A model that explains less variance but conforms to a plausible causal structure is more reliable than one with an impressive R². In practice, a stronger association does not necessarily mean greater profitability.
- Test for causal stability. A causal factor should remain relevant across all regimes. If a "premium" changes sign after every crisis, the likely cause is misspecification rather than shifting risk compensation.
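The second step in the list above (control for confounders, not colliders) can be illustrated with synthetic data. This is a minimal sketch under assumed numbers, with a hypothetical helper `factor_coef`: a confounder drives both the factor and returns, and the true effect of the factor on returns is +0.1.

```python
import numpy as np

def factor_coef(y, factor, *controls):
    """OLS coefficient on `factor`, with an intercept and optional controls."""
    X = np.column_stack([np.ones(len(y)), factor, *controls])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical data-generating process (all numbers are illustrative).
confounder = rng.normal(size=n)                    # drives factor and returns
factor = confounder + rng.normal(size=n)
returns = 0.1 * factor - 0.5 * confounder + rng.normal(size=n)

naive = factor_coef(returns, factor)                 # confounder omitted
adjusted = factor_coef(returns, factor, confounder)  # confounder controlled

print(f"confounder omitted   : factor coef = {naive:+.2f}")
print(f"confounder controlled: factor coef = {adjusted:+.2f}")
```

With these parameter choices, omitting the confounder yields a negative estimate even though the true effect is positive, while controlling for it recovers a value near +0.1. The regression mechanics are identical to the collider case; only the causal diagram tells the researcher which variable belongs in the model.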
From association to understanding
Finance is not alone in this transition. Medicine moved from correlation to causation decades ago, turning guesswork into evidence-based treatment. Epidemiology, policy evaluation, and machine learning have all embraced causal reasoning. Now it is finance's turn.
The goal is not scientific purity; it is practical reliability. A causal model identifies the true sources of risk and return, allowing investors to allocate capital efficiently and explain performance credibly.
The way forward
For investors, this shift is more than academic. It is about developing strategies that hold up in the real world: models that explain why they work, not just that they work. In an age of data abundance, understanding cause and effect may be the only real edge left.
Factor investing can still fulfill its original scientific promise, but only if it leaves behind the habits that gave rise to the factor mirage. The next generation of investment research should be rebuilt on causal foundations:
- Declare causal graphs based on a combination of domain expertise and causal discovery methods.
- Justify each variable inclusion with economic logic, consistent with the causal graph and the application of do-calculus rules.
- Evaluate strategies through counterfactuals: What returns would have been achieved under other exposures?
- Monitor structural breaks in the causal relationship: by the time a break becomes apparent in performance, it is already too late.
Today, markets are flooded with data but short on understanding. Machine learning can map relationships across millions of variables, but without causality it leads to false discoveries. The real edge in the age of AI will lie not in larger data sets or more complex algorithms, but in better causal models that accurately attribute returns to their true causes.
If factor investing is to regain investor confidence, it must evolve from the phenomenological description of patterns to their causal explanation, shifting the focus from correlation to causation. This shift will mark the moment when quantitative investing becomes not only systematic but truly scientific.
