AI × Quant Trader Series — Day 7¶
The Swiss Army Knife of Linear Models: Lasso Regression¶
Reading time: ~15 minutes
Prerequisites: basic linear algebra, Python, NumPy
Focus: engineering intuition, quant usage (not ML hype)
Part 1: Introduction to Regularized Linear Models¶
We now move from data processing to one of the most important modeling tools in quantitative trading and applied machine learning: regularized linear models.
In real-world financial modeling, the main difficulty is rarely computation. Instead, it is almost always structure:
- Too many features
- Strong multicollinearity
- Limited samples
- High noise-to-signal ratio
A plain linear regression model can fit the data extremely well in-sample, yet fail catastrophically out-of-sample.
This is where Lasso regression becomes indispensable.
Part 2: From Linear Regression to Lasso¶
2.1 Ordinary Least Squares (OLS)¶
The objective function of ordinary least squares is:
\[
\hat{\beta}^{\text{OLS}} \;=\; \arg\min_{\beta}\; \|y - X\beta\|_2^2
\]
OLS attempts to minimize prediction error only.
It places no constraint on model complexity.
As a result:
- Coefficients become unstable when features are correlated (see the sketch after this list)
- Noise features receive non-zero weights
- Overfitting is almost guaranteed in high-dimensional settings
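The first point is easy to demonstrate. Below is a minimal sketch with two nearly identical features and one pure-noise feature; the variable names, setup, and noise levels are illustrative assumptions, not part of the series.
import numpy as np
np.random.seed(0)
x1 = np.random.randn(100)
x2 = x1 + 0.01 * np.random.randn(100)             # near-duplicate of x1
noise = np.random.randn(100)                      # carries no signal
X_demo = np.column_stack([x1, x2, noise])
y_demo = 2.0 * x1 + 0.1 * np.random.randn(100)    # only x1 actually matters
beta_ols, *_ = np.linalg.lstsq(X_demo, y_demo, rcond=None)
print(beta_ols)   # the weight splits between x1 and x2 erratically; a different seed splits it very differently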
2.2 Why Regularization Is Necessary¶
In quantitative finance, feature sets often include:
- Dozens of technical indicators
- Overlapping factors
- Lagged signals
Many of these features carry redundant or spurious information.
Regularization explicitly penalizes complexity, forcing the model to prefer simpler and more stable solutions.
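Written generically, every regularized estimator trades fit against a complexity penalty:
\[
\hat{\beta} \;=\; \arg\min_{\beta}\; \text{Loss}(y, X\beta) \;+\; \lambda\,\Omega(\beta)
\]
where \(\Omega\) measures model complexity and \(\lambda\) sets the exchange rate between fit and simplicity. Ridge uses \(\Omega(\beta) = \|\beta\|_2^2\); Lasso uses the L1 norm introduced next.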
Part 3: Lasso Regression — Core Idea¶
3.1 Objective Function¶
Lasso (Least Absolute Shrinkage and Selection Operator) modifies OLS by adding an L1 penalty:
\[
\hat{\beta}^{\text{lasso}} \;=\; \arg\min_{\beta}\; \|y - X\beta\|_2^2 \;+\; \lambda \sum_{j=1}^{p} |\beta_j|
\]
where:
- The first term measures fit quality
- The second term penalizes coefficient magnitude
- \(\lambda\) controls the strength of regularization
3.2 What Makes Lasso Different¶
Unlike Ridge regression (L2 regularization), Lasso drives some coefficients exactly to zero.
This leads to:
- Automatic feature selection
- Sparse models
- Improved interpretability
From an engineering perspective:
Lasso is not just a regression model — it is a structured filter.
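To make the contrast concrete, here is a minimal sketch comparing Ridge and Lasso on the same synthetic data; the data setup and the alpha values are illustrative assumptions, not tuned choices.
import numpy as np
from sklearn.linear_model import Lasso, Ridge
np.random.seed(7)
X_cmp = np.random.randn(200, 8)
y_cmp = 2.0 * X_cmp[:, 0] + 1.0 * X_cmp[:, 3] + 0.1 * np.random.randn(200)   # only features 0 and 3 matter
ridge = Ridge(alpha=1.0).fit(X_cmp, y_cmp)
lasso_demo = Lasso(alpha=0.1).fit(X_cmp, y_cmp)
print("ridge non-zero:", (ridge.coef_ != 0).sum())        # typically all 8: Ridge shrinks but never zeroes
print("lasso non-zero:", (lasso_demo.coef_ != 0).sum())   # typically 2: the irrelevant features are dropped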
Part 4: Intuition — Why Lasso Produces Sparsity¶
The L1 penalty defines a constraint region with sharp corners (in two dimensions, a diamond) that sit exactly on the coordinate axes.
When the loss is minimized over this region, the optimum frequently lands on one of those corners, where one or more coefficients are exactly zero.
The practical consequence is simple:
Unimportant features are dropped entirely.
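Under an orthonormal design (a simplifying assumption), this can be made exact: each Lasso coefficient solves \(\min_{\beta}\tfrac{1}{2}(z-\beta)^2 + \lambda|\beta|\), where \(z\) is the unpenalized estimate, and the solution is the soft-thresholding rule
\[
\hat{\beta} \;=\; \operatorname{sign}(z)\,\max(|z| - \lambda,\; 0).
\]
Any coefficient whose unpenalized estimate is smaller than \(\lambda\) in magnitude is therefore set exactly to zero, whereas the corresponding Ridge solution, \(z/(1+2\lambda)\), only shrinks and never reaches zero.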
This behavior is extremely valuable in quant trading, where fewer signals often outperform noisy combinations.
Part 5: Implementing Lasso in Python¶
We now implement Lasso using scikit-learn.
Imports¶
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler
5.1 Generate Example Data¶
np.random.seed(42)
X = np.random.randn(100, 10)                               # 100 samples, 10 candidate features
true_beta = np.array([3, 0, 0, 1.5, 0, 0, 0, 2, 0, 0])     # only features 0, 3 and 7 carry signal
y = X @ true_beta + np.random.randn(100) * 0.5             # linear signal plus Gaussian noise
5.2 Standardize Features¶
The L1 penalty acts on raw coefficient magnitudes, so features on different scales would be penalized unevenly; standardizing puts every feature on the same footing.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)   # zero mean, unit variance for each column
5.3 Fit the Lasso Model¶
lasso = Lasso(alpha=0.1)        # alpha is scikit-learn's name for the regularization strength λ
lasso.fit(X_scaled, y)
pd.Series(lasso.coef_)          # coefficients of the noise features are driven to exactly 0
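A quick way to read the result (the feature labels below are made up for illustration):
coefs = pd.Series(lasso.coef_, index=[f"feat_{i}" for i in range(X.shape[1])])
print(coefs[coefs != 0])   # typically recovers feat_0, feat_3 and feat_7, matching true_beta; any stray noise feature carries a near-zero weight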
Part 6: The Role of Alpha (λ)¶
6.1 Effect of Regularization Strength¶
Small α → weak regularization → overfitting
Large α → aggressive shrinkage → underfitting
for a in [0.01, 0.1, 1.0]:
    model = Lasso(alpha=a)
    model.fit(X_scaled, y)
    print(a, (model.coef_ != 0).sum())   # the count of surviving features drops as alpha grows
6.2 Cross-Validation (Recommended)¶
from sklearn.linear_model import LassoCV
lasso_cv = LassoCV(cv=5)
lasso_cv.fit(X_scaled, y)
print(lasso_cv.alpha_)   # alpha selected by 5-fold cross-validation
print(lasso_cv.coef_)    # coefficients refit at the selected alpha
Part 7: Limitations of Lasso¶
Lasso is not universally optimal:
- It behaves erratically when features are highly correlated, typically keeping one feature from a correlated group and dropping the rest arbitrarily
- It cannot capture non-linear relationships or interactions on its own
- It is sensitive to outliers, because the fit term is still squared error
Common remedies include:
- Elastic Net (a blend of the L1 and L2 penalties), sketched below
- PCA + Lasso (decorrelate first, then select)
- Lasso for feature selection, followed by non-linear models
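As a pointer, here is a minimal Elastic Net sketch that reuses X_scaled and y from Part 5; l1_ratio controls the L1/L2 mix, and the specific values are illustrative assumptions rather than recommendations.
from sklearn.linear_model import ElasticNet
enet = ElasticNet(alpha=0.1, l1_ratio=0.5)   # l1_ratio=1.0 recovers pure Lasso, values near 0 approach Ridge
enet.fit(X_scaled, y)
print((enet.coef_ != 0).sum())               # still sparse, but more stable when features are correlated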