Machine learning in stock selection: Beyond linear thinking
Stock markets are complex, and relationships are not always linear. Nevertheless, traditional models often rely on linear assumptions. So how can this gap be closed? In this interview, Carsten Rother, Co-Head of Research Forecasts, explains how Quoniam has been using machine learning to enhance traditional models.
Key takeaways
- Not everything takes a linear path: Many relationships on the stock market depend on context.
- Machine learning complements the basic model: It systematically corrects the blind spots of traditional multi-factor models.
- Discipline beats complexity: Economic logic, rigorous testing and transparency prevent model over-optimisation.
Carsten, why are traditional linear models no longer sufficient for equity investments today?
A classic linear model – as used in multi-factor approaches – assumes that effects are linear and additive. Put simply, if a factor is twice as strong, it also has twice the effect. And if several factors are positive, their effects simply add up.
This assumption is robust and easy to interpret – which is precisely why it has been the basis of quantitative equity models for decades. The only problem is that reality is often not so straightforward.
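The two properties described above, proportionality and additivity, can be shown in a minimal Python sketch. The factor names follow the interview, but the weights and scores are hypothetical illustration values, not figures from any real model.

```python
# Hypothetical factor weights, chosen only for illustration.
WEIGHTS = {"value": 0.5, "quality": 0.3, "sentiment": 0.2}


def linear_forecast(scores):
    """Weighted sum of factor scores: each effect is proportional and additive."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# A factor score that is twice as strong has twice the effect.
base = linear_forecast({"value": 1.0, "quality": 0.0, "sentiment": 0.0})
doubled = linear_forecast({"value": 2.0, "quality": 0.0, "sentiment": 0.0})

# Several positive factors simply add up.
quality_only = linear_forecast({"value": 0.0, "quality": 1.0, "sentiment": 0.0})
combined = linear_forecast({"value": 1.0, "quality": 1.0, "sentiment": 0.0})
```

In this sketch `doubled` is twice `base`, and `combined` equals `base + quality_only`. A purely linear model cannot express anything beyond this.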
Not everything on the stock market is straightforward. That’s exactly where we use machine learning.
Carsten Rother,
Co-Head of Research Forecasts
Non-linearity essentially means that a relationship is not proportional or constant. The effect depends on the context or threshold values.
Let’s take value as an example: a linear model assumes that the cheapest stock has the highest expected return – and that expected returns fall off evenly as valuations rise. In practice, however, we know that some stocks are cheap because their business model has structural problems – “cheap for a reason”.
This means that very low valuations can have completely different effects depending on the quality of the company. This context dependency is precisely what makes the relationship non-linear.
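One way to picture this context dependency is an interaction term, where the payoff to cheapness changes sign with quality. The sketch below is a toy function with made-up coefficients and hypothetical standardised scores, intended only to show the shape of the effect.

```python
def value_effect(cheapness, quality):
    """Toy return contribution of cheapness, conditioned on quality.

    Both inputs are hypothetical standardised scores in [-1, 1].
    The coefficients are illustrative assumptions, not estimates.
    """
    main_effect = 0.01 * cheapness            # what a linear model would see
    interaction = 0.02 * cheapness * quality  # the context-dependent part
    return main_effect + interaction


cheap_and_solid = value_effect(cheapness=1.0, quality=1.0)
cheap_for_a_reason = value_effect(cheapness=1.0, quality=-1.0)
```

Both stocks are equally cheap, yet the contribution is positive for the high-quality company and negative for the low-quality one.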
Can you make that even more tangible?
A very clear example is what are known as thresholds.
Think about debt. The difference between low and moderate debt is often hardly relevant. However, if a company exceeds a critical mark – a threshold – the risk increases disproportionately.
A linear model would say that more debt is uniformly worse. In reality, it’s more like this: for a long time, little happens – and then it tips over.
Such tipping points or interactions between factors are typically non-linear. Machine learning helps us to systematically identify these patterns – not only in individual cases, but across the entire investment universe.
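The debt example can be sketched as a piecewise function next to its linear counterpart. The threshold level and slopes below are hypothetical numbers chosen to make the tipping point visible.

```python
# Hypothetical critical leverage level (for example net debt / EBITDA).
DEBT_THRESHOLD = 3.0


def linear_debt_penalty(leverage):
    """Linear view: every extra unit of debt is equally bad."""
    return 0.5 * leverage


def threshold_debt_penalty(leverage):
    """Threshold view: little happens below the critical mark, then risk jumps."""
    mild = 0.1 * leverage                           # small effect everywhere
    excess = max(0.0, leverage - DEBT_THRESHOLD)    # only counts above the mark
    return mild + 2.0 * excess


# Moving from 1 to 2 barely matters; moving from 3 to 4 matters a lot.
step_below = threshold_debt_penalty(2.0) - threshold_debt_penalty(1.0)
step_above = threshold_debt_penalty(4.0) - threshold_debt_penalty(3.0)
```

Under these assumed numbers, the same one-unit increase in leverage costs roughly twenty times more above the threshold than below it, whereas the linear penalty would charge the same amount in both cases.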
Are you replacing your classic multi-factor model with machine learning?
No. The linear model remains our foundation.
Our approach is deliberately two-stage: first, we create a structured, economically sound linear forecast – specifically, we apply our established multi-factor model based on value, quality and sentiment. These factors are combined to provide a transparent, robust baseline estimate for each stock in the universe.
We then systematically check whether there are patterns in where the model overestimates or underestimates returns. Technically speaking, we apply machine learning to the residuals of the linear model – i.e. to the unexplained part of the return (see Figure 1). Machine learning thus acts as a corrective, not a replacement. Depending on the interaction or threshold effect identified, the forecast can be adjusted upwards or downwards.
Figure 1: Comparison of machine learning and linear model
In portfolio management, both components remain transparent: the contributions from value, quality and sentiment are visible separately, and the machine learning component appears as an independent building block. The overall forecast is the result of the controlled interaction of both elements.
This approach has been an integral part of our forecasting systems since 2018 and is fully embedded in our research and production process.
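The two-stage idea, a linear forecast first and a correction learned from its residuals second, can be illustrated with a deliberately small Python sketch. It is not Quoniam's implementation: it assumes a single made-up feature (leverage) instead of a multi-factor model, invented return data, and a depth-one regression tree ("stump") standing in for the machine learning component.

```python
def fit_linear(x, y):
    """Ordinary least squares with one feature; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_stump(x, residuals):
    """Depth-one regression tree on the residuals.

    Tries every split between neighbouring feature values and keeps the one
    with the lowest squared error. Returns (threshold, left_mean, right_mean).
    """
    pairs = sorted(zip(x, residuals))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no split between identical feature values
        left = [r for _, r in pairs[:i]]
        right = [r for _, r in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (error, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]


def sum_squared_error(predictions, actuals):
    return sum((a - p) ** 2 for p, a in zip(predictions, actuals))


# Made-up data: returns are flat until leverage passes 3, then drop sharply.
leverage = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
returns = [0.02, 0.02, 0.02, 0.02, 0.02, 0.02, -0.06, -0.06]

# Stage 1: linear baseline forecast.
intercept, slope = fit_linear(leverage, returns)
linear_pred = [intercept + slope * x for x in leverage]

# Stage 2: learn a correction from the unexplained part of the return.
residuals = [y - p for y, p in zip(returns, linear_pred)]
threshold, left_corr, right_corr = fit_stump(leverage, residuals)

# Final forecast = linear baseline + correction, kept as separate components.
correction = [left_corr if x <= threshold else right_corr for x in leverage]
combined_pred = [p + c for p, c in zip(linear_pred, correction)]

sse_linear = sum_squared_error(linear_pred, returns)
sse_combined = sum_squared_error(combined_pred, returns)
```

On this toy data the stump places its split between leverage 3.0 and 3.5, adjusts forecasts upwards below the split and downwards above it, and reduces the in-sample squared error relative to the linear baseline alone. Because the baseline and the correction are stored separately, each contribution to the final forecast remains visible.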
Greater model complexity sounds attractive at first. But when does it become a problem?
That’s a valid question. In financial markets, we work with a relatively low signal-to-noise ratio, meaning that genuine return signals are often weak and difficult to distinguish from random market fluctuations. It’s easy to find patterns in historical data that exist purely by chance.
That is the real risk: a model may look excellent on historical data – but only because it has ‘learned’ coincidences that will not be repeated. This would be dangerous for investors because the apparent predictive power would then disappear once the model is applied to new data.
That is why we discipline our models very strictly:
- Clear economic motivation: We start with a substantive hypothesis – not with an algorithm.
- Strict out-of-sample tests: A model must prove itself across different market phases.
- Transparency and monitoring: The machine learning signal is fully attributable and continuously monitored.
For us, complexity is only justified if it delivers stable added value.
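One common way to organise strict out-of-sample tests is a walk-forward scheme, in which a model is always fitted on earlier periods and evaluated on later ones. The interview does not specify the test design, so the sketch below is an assumption: a hypothetical helper `walk_forward_splits` with an expanding training window.

```python
def walk_forward_splits(n_periods, min_train, test_size):
    """Yield (train_indices, test_indices) with an expanding training window.

    Training periods always lie strictly before the test periods, so the
    model never sees the data it is evaluated on.
    """
    start = min_train
    while start + test_size <= n_periods:
        train = list(range(0, start))
        test = list(range(start, start + test_size))
        yield train, test
        start += test_size


# Example: 10 periods, at least 4 for training, 2 per test window.
splits = list(walk_forward_splits(n_periods=10, min_train=4, test_size=2))
```

Each test window covers a different stretch of history, so a model that only works in one market phase shows up as weak in the other windows.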
What impact does this have on portfolio construction?
The most important change is that alpha becomes state-dependent, meaning that expected returns vary depending on market conditions and company characteristics, rather than being constant across all environments. When relationships are non-linear, the reliability of forecasts also varies. Accordingly, we differentiate position sizes more strongly. Exposure should be scaled more heavily where the corrected alpha is robust and stable, and more cautiously where uncertainty is higher.
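As a rough illustration of sizing positions by forecast reliability, the sketch below scales each alpha by an uncertainty estimate before normalising. The rule, the function name `scaled_weights` and the numbers are hypothetical. A real portfolio construction process would add risk, turnover and implementation constraints.

```python
def scaled_weights(alphas, uncertainties):
    """Toy sizing rule: weight proportional to alpha divided by uncertainty.

    Weights are normalised so that their absolute values sum to one.
    """
    raw = [alpha / unc for alpha, unc in zip(alphas, uncertainties)]
    gross = sum(abs(r) for r in raw)
    return [r / gross for r in raw]


# Two stocks with the same corrected alpha but different forecast uncertainty.
weights = scaled_weights(alphas=[0.02, 0.02], uncertainties=[0.5, 1.0])
```

With identical alphas, the stock with the more reliable forecast receives the larger position, here two thirds of the total against one third.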
In addition, factor allocations can be made more flexible if it becomes apparent that certain premiums only work under certain conditions. If factor premiums depend on market regimes or interactions, static factor weights are suboptimal. Portfolio construction must therefore allow factor exposures and risk budgets to adjust over time.
Importantly, this does not lead to black-box optimisation. Risk, turnover and implementability remain clearly controlled.
Machine learning is not a replacement for the linear model – but a precise corrective.
Carsten Rother,
Co-Head of Research Forecasts
What misconception about machine learning do you encounter most often?
Many either expect a revolution or fear a black box. In our view, neither is true.
Stock markets are not a classic AI environment with huge amounts of data and clear patterns. Machine learning does not deliver miracles here, but rather gradual improvements.
The added value comes from a combination of economic structure, a disciplined research process and controlled integration – not from maximum complexity.
Summary
- Linear multi-factor models are robust and remain the foundation of quantitative equity strategies.
- Machine learning complements this approach in specific areas where relationships are not proportional: thresholds, interactions and state-dependent effects.
- Since 2018, Quoniam has been using this two-stage approach to systematically refine forecasts – with transparency, discipline and a clear focus on stability.