The Portfolio That Looked Great Until It Didn’t
A backtest returning 47% annualized with a Sharpe ratio of 2.1 sounds like a dream. But here’s what happens when you actually check the drawdown profile of that same portfolio:
import numpy as np
import pandas as pd
# Simulating the "amazing" backtest equity curve
np.random.seed(42)
daily_returns = np.random.normal(0.0018, 0.025, 504) # ~2 years
# Inject a correlated crash — this is what kills you
daily_returns[250:265] = np.random.normal(-0.04, 0.03, 15)
equity = (1 + pd.Series(daily_returns)).cumprod()
drawdown = equity / equity.cummax() - 1
print(f"Total return: {equity.iloc[-1] - 1:.1%}")
print(f"Max drawdown: {drawdown.min():.1%}")
print(f"Longest drawdown: {(drawdown < 0).astype(int).groupby((drawdown == 0).cumsum()).sum().max()} days")
Total return: 48.3%
Max drawdown: -41.7%
Longest drawdown: 187 days
48% return with a 42% max drawdown. That’s not a strategy — that’s a coin flip with extra steps. The annualized return looked spectacular because the recovery happened to work out in this particular window. Run the same strategy starting three months later and you might be staring at a -35% account for half a year.
This is the part of the series where we stop asking “does this strategy make money?” and start asking “how badly can this strategy hurt me?” When we built our backtesting framework in Part 4, everything was focused on returns. That’s only half the picture.

Measuring Risk Beyond Standard Deviation
Volatility — the standard deviation of returns — is the textbook answer to “how risky is this portfolio?” It’s also kind of a lie. Volatility treats upside surprises the same as downside surprises. If your portfolio jumps 8% in a day, that increases volatility just as much as an 8% crash. Nobody calls their broker panicking about unexpected gains.
The metrics that actually matter in practice are drawdown-based. Max drawdown tells you the worst peak-to-trough decline. But I’d argue the Calmar ratio (annualized return divided by max drawdown) is more useful for comparing strategies, because it directly answers: “how much pain do I endure per unit of gain?”
def risk_metrics(returns: pd.Series) -> dict:
    """Calculate the metrics that actually matter."""
    equity = (1 + returns).cumprod()
    drawdown = equity / equity.cummax() - 1
    ann_return = (1 + returns.mean()) ** 252 - 1
    ann_vol = returns.std() * np.sqrt(252)
    max_dd = drawdown.min()
    # Sortino: only penalize downside vol
    downside = returns[returns < 0]
    downside_vol = downside.std() * np.sqrt(252) if len(downside) > 0 else 0.001
    return {
        'annual_return': ann_return,
        'annual_vol': ann_vol,
        'sharpe': ann_return / ann_vol if ann_vol > 0 else 0,
        'sortino': ann_return / downside_vol,
        'max_drawdown': max_dd,
        'calmar': ann_return / abs(max_dd) if max_dd != 0 else 0,
    }

metrics = risk_metrics(pd.Series(daily_returns))
for k, v in metrics.items():
    print(f"{k:>15}: {v:>8.3f}")
annual_return: 0.530
annual_vol: 0.401
sharpe: 1.322
sortino: 1.854
max_drawdown: -0.417
calmar: 1.271
See how the Sortino ratio is notably higher than Sharpe? That’s because a lot of this portfolio’s volatility is on the upside. Sortino only uses downside deviation in the denominator — $\text{Sortino} = R_{\text{ann}} / \sigma_{\text{down}}$, where $\sigma_{\text{down}}$ is computed only from negative returns. For strategies with asymmetric return profiles (and most real strategies are asymmetric), Sortino gives you a more honest picture.
But here’s what none of these single-number metrics tell you: the shape of the drawdown. A 30% drawdown that happens in one sharp crash and recovers in a month feels very different from a 25% drawdown that grinds on for eight months. I haven’t found a single metric that captures this distinction well — if you know of one, I’m genuinely curious.
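One partial workaround is to at least look at the shape directly: break the equity curve into drawdown episodes and tabulate the depth and duration of each one. Here is a minimal sketch, reusing the drawdown series from the first snippet; treating every return to zero drawdown as the end of an episode is my own convention for the sketch.

# Sketch: tabulate each drawdown episode (depth and length in days) instead of
# collapsing everything into a single number. Reuses the `drawdown` Series from
# the first snippet; a new episode starts each time drawdown returns to zero.
episode_id = (drawdown == 0).cumsum()
grp = drawdown[drawdown < 0].groupby(episode_id)
episodes = pd.DataFrame({'depth': grp.min(), 'length_days': grp.size()})
print(episodes.sort_values('depth').head())  # worst episodes first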
Value at Risk: Useful but Dangerously Overconfident
Value at Risk (VaR) answers a specific question: “What’s the worst loss I should expect on X% of days?” At the 95% confidence level, you’re saying: 19 out of 20 trading days, losses won’t exceed this number.
The historical approach is dead simple — just take the 5th percentile of your return distribution. Parametric VaR assumes normal returns and computes $\text{VaR}_\alpha = \mu + z_\alpha\,\sigma$, where $z_\alpha = \Phi^{-1}(\alpha)$ is the standard normal quantile (about $-1.645$ at $\alpha = 0.05$). And then there’s Conditional VaR (CVaR, also called Expected Shortfall), which asks the nastier question: “when losses do exceed VaR, how bad is it on average?”
from scipy import stats

def compute_var_cvar(returns, confidence=0.95):
    alpha = 1 - confidence
    # Historical VaR — no distribution assumptions
    hist_var = np.percentile(returns, alpha * 100)
    # Parametric VaR — assumes normality (dangerous)
    mu, sigma = returns.mean(), returns.std()
    param_var = mu + stats.norm.ppf(alpha) * sigma
    # CVaR — average of losses beyond VaR
    cvar = returns[returns <= hist_var].mean()
    return hist_var, param_var, cvar
hist_var, param_var, cvar = compute_var_cvar(daily_returns)
print(f"Historical VaR (95%): {hist_var:.3%}")
print(f"Parametric VaR (95%): {param_var:.3%}")
print(f"CVaR (Expected Shortfall): {cvar:.3%}")
Historical VaR (95%): -3.644%
Parametric VaR (95%): -4.014%
CVaR (Expected Shortfall): -5.821%
The gap between VaR and CVaR here is the part that matters most. VaR says “you probably won’t lose more than 3.6% in a day.” CVaR says “but when you do, expect to lose about 5.8%.” After 2008, most risk managers I’ve read about shifted toward CVaR precisely because VaR gives you a false sense of security — it tells you where the cliff edge is but says nothing about how far down the cliff goes.
And parametric VaR comes out more pessimistic than historical VaR here because the injected crash inflates the estimated standard deviation, and the normal quantile scales directly with it — the normality assumption simply doesn’t capture the actual shape of these returns. In practice, financial returns have fat tails — extreme events happen more often than the normal distribution predicts, so at deeper confidence levels (99% and beyond) a normal model tends to understate the risk. The 2020 COVID crash, for instance, produced daily moves that a normal model would call a once-in-10,000-years event. My best guess is that a Student’s t-distribution with about 4-5 degrees of freedom fits equity returns better, but I haven’t done a rigorous comparison across different asset classes.
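If you want to test that guess on your own data, scipy can fit a t-distribution and hand you a t-based VaR to put next to the normal one. A rough sketch follows; on this simulated, mostly-normal series the fitted degrees of freedom won’t mean much, the point is the recipe.

# Sketch: fit a Student's t to the daily returns and compare its 5% quantile
# with the normal-based parametric VaR computed above. On real equity returns
# the fitted degrees of freedom tend to be small; on simulated data, less so.
from scipy import stats

df_t, loc_t, scale_t = stats.t.fit(daily_returns)
t_var = stats.t.ppf(0.05, df_t, loc=loc_t, scale=scale_t)
print(f"Fitted degrees of freedom: {df_t:.1f}")
print(f"t-based VaR (95%): {t_var:.3%}")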
Mean-Variance Optimization: Where Theory Meets Reality’s Sharp Edges
Harry Markowitz’s mean-variance optimization (1952) is probably the most famous idea in portfolio theory. The concept is elegant: given expected returns and a covariance matrix, find the asset weights that maximize return for a given level of risk. The efficient frontier is the set of all such optimal portfolios.
Formally: minimize $w^\top \Sigma\, w$ subject to $w^\top \mu = \mu_{\text{target}}$ and $\sum_i w_i = 1$. Here $w$ is the weight vector, $\Sigma$ the covariance matrix, and $\mu$ the expected return vector. Sounds clean. Here’s what happens when you actually implement it:
from scipy.optimize import minimize

# 5 assets, 2 years of daily returns
np.random.seed(123)
n_assets = 5
tickers = ['AAPL', 'MSFT', 'GOOGL', 'AMZN', 'JPM']

# Simulate correlated returns (realistic-ish)
mean_returns = np.array([0.0008, 0.0007, 0.0006, 0.0009, 0.0005])
cov_base = np.array([
    [1.0, 0.6, 0.5, 0.55, 0.3],
    [0.6, 1.0, 0.55, 0.5, 0.35],
    [0.5, 0.55, 1.0, 0.45, 0.3],
    [0.55, 0.5, 0.45, 1.0, 0.25],
    [0.3, 0.35, 0.3, 0.25, 1.0]
]) * 0.0004
returns_sim = np.random.multivariate_normal(mean_returns, cov_base, 504)
returns_df = pd.DataFrame(returns_sim, columns=tickers)
mu = returns_df.mean().values
Sigma = returns_df.cov().values

def efficient_portfolio(mu, Sigma, target_return):
    n = len(mu)
    def portfolio_vol(w):
        return np.sqrt(w @ Sigma @ w)
    constraints = [
        {'type': 'eq', 'fun': lambda w: np.sum(w) - 1},
        {'type': 'eq', 'fun': lambda w: w @ mu - target_return}
    ]
    bounds = [(0, 1)] * n  # long-only
    w0 = np.ones(n) / n
    result = minimize(portfolio_vol, w0, method='SLSQP',
                      bounds=bounds, constraints=constraints)
    if not result.success:
        return None, None  # this happens more often than you'd think
    return result.x, portfolio_vol(result.x)

# Sweep the efficient frontier
target_returns = np.linspace(mu.min(), mu.max(), 50)
frontier = []
for tr in target_returns:
    w, vol = efficient_portfolio(mu, Sigma, tr)
    if w is not None:
        frontier.append({'return': tr * 252, 'vol': vol * np.sqrt(252), 'weights': w})
frontier_df = pd.DataFrame(frontier)

print(f"Frontier points found: {len(frontier_df)} / 50")
print(f"\nMin variance portfolio:")
min_var = frontier_df.loc[frontier_df['vol'].idxmin()]
for t, w in zip(tickers, min_var['weights']):
    print(f"  {t}: {w:.1%}")
Frontier points found: 42 / 50
Min variance portfolio:
AAPL: 6.8%
MSFT: 11.2%
GOOGL: 14.5%
AMZN: 0.0%
JPM: 67.5%
Notice that 8 of the 50 optimization runs failed to converge. That if not result.success guard isn’t paranoia — SLSQP regularly fails when the target return is near the boundary of what’s achievable. And look at that minimum variance portfolio: 67.5% in JPM. The optimizer loves JPM because it has the lowest correlation with the tech stocks, so it shoves everything there to minimize portfolio variance. This is Markowitz optimization’s dirty secret — it produces extreme, concentrated portfolios that are incredibly sensitive to estimation errors in the covariance matrix.
Richard Michaud’s 1989 Financial Analysts Journal paper (“The Markowitz Optimization Enigma: Is ‘Optimized’ Optimal?”) famously called mean-variance optimizers “estimation-error maximizers,” and his later work on resampled efficiency builds on the same observation. Small changes in expected returns produce wildly different optimal weights. It’s an optimizer — it will find and exploit any noise in your inputs.
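You can watch this happen with the code we already have: nudge the expected returns by a couple of basis points and re-run the optimizer. A quick sketch reusing mu, Sigma, and efficient_portfolio from above; the exact figures depend on the seed.

# Sketch: perturb the expected returns slightly and see how far the "optimal"
# weights move. Reuses mu, Sigma, and efficient_portfolio defined earlier.
rng = np.random.default_rng(7)
target = mu.mean()  # a mid-frontier target return

w_base, _ = efficient_portfolio(mu, Sigma, target)
mu_noisy = mu + rng.normal(0, 0.0002, size=len(mu))  # ~2 bps/day of estimation noise
w_noisy, _ = efficient_portfolio(mu_noisy, Sigma, target)

if w_base is not None and w_noisy is not None:
    print(f"Largest weight shift from ~2 bps of noise: "
          f"{np.abs(w_noisy - w_base).max():.1%}")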
Taming the Optimizer: Constraints and Shrinkage
There are two practical fixes that actually work.
The first is blunt but effective: add position limits. No single asset gets more than, say, 30% of the portfolio. This is what most institutional managers do in practice, and it works not because it’s theoretically elegant but because it prevents the optimizer from going completely off the rails.
def constrained_min_variance(mu, Sigma, max_weight=0.30, min_weight=0.05):
    n = len(mu)
    def portfolio_vol(w):
        return w @ Sigma @ w  # minimize variance, not vol (same optimum, smoother)
    constraints = [{'type': 'eq', 'fun': lambda w: np.sum(w) - 1}]
    bounds = [(min_weight, max_weight)] * n
    w0 = np.ones(n) / n
    result = minimize(portfolio_vol, w0, method='SLSQP',
                      bounds=bounds, constraints=constraints)
    return result.x

w_constrained = constrained_min_variance(mu, Sigma)
print("Constrained min-variance:")
for t, w in zip(tickers, w_constrained):
    print(f"  {t}: {w:.1%}")
# Compare portfolio vol
vol_unconstrained = np.sqrt(min_var['weights'] @ Sigma @ min_var['weights']) * np.sqrt(252)
vol_constrained = np.sqrt(w_constrained @ Sigma @ w_constrained) * np.sqrt(252)
print(f"\nUnconstrained vol: {vol_unconstrained:.2%}")
print(f"Constrained vol: {vol_constrained:.2%}")
print(f"Vol increase: {(vol_constrained/vol_unconstrained - 1):.1%}")
Constrained min-variance:
AAPL: 14.1%
MSFT: 17.4%
GOOGL: 23.5%
AMZN: 15.0%
JPM: 30.0%
Unconstrained vol: 16.83%
Constrained vol: 17.29%
Vol increase: 2.7%
You give up about half a percentage point of volatility (a 2.7% relative increase), but you get a portfolio that won’t implode if your correlation estimates are off by 0.1. That’s a trade I’d take every time.
The second fix is smarter: shrinkage estimation of the covariance matrix. The Ledoit-Wolf shrinkage estimator (Ledoit and Wolf, 2004) blends the sample covariance matrix with a structured target — typically a scaled identity matrix or a single-factor model. The idea is that the sample covariance overfits to historical quirks, while the structured target is too simple but stable. The optimal blend sits somewhere in between.
The blend is $\hat{\Sigma} = (1 - \delta)\,S + \delta\,F$, where $S$ is the sample covariance, $F$ is the shrinkage target, and $\delta \in [0, 1]$ is the shrinkage intensity chosen to minimize expected estimation loss.
from sklearn.covariance import LedoitWolf
lw = LedoitWolf().fit(returns_df)
Sigma_shrunk = lw.covariance_
print(f"Shrinkage coefficient: {lw.shrinkage_:.3f}")
# Re-run optimization with shrunk covariance
def min_variance_weights(Sigma, max_weight=0.30, min_weight=0.05):
    n = Sigma.shape[0]
    constraints = [{'type': 'eq', 'fun': lambda w: np.sum(w) - 1}]
    bounds = [(min_weight, max_weight)] * n
    result = minimize(lambda w: w @ Sigma @ w, np.ones(n)/n,
                      method='SLSQP', bounds=bounds, constraints=constraints)
    return result.x

w_shrunk = min_variance_weights(Sigma_shrunk)
print("\nShrunk covariance min-variance:")
for t, w in zip(tickers, w_shrunk):
    print(f"  {t}: {w:.1%}")
Shrinkage coefficient: 0.217
Shrunk covariance min-variance:
AAPL: 16.3%
MSFT: 18.7%
GOOGL: 21.1%
AMZN: 13.9%
JPM: 30.0%
The shrinkage coefficient of 0.217 means about 22% of the final covariance estimate comes from the structured target. The resulting weights are more evenly distributed. Scikit-learn’s LedoitWolf implementation handles the optimal shrinkage intensity automatically, which is convenient — the estimator has a closed-form formula, but it’s fiddly to implement by hand and easy to get subtly wrong.
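If you want to see the blend itself rather than trust the black box, you can rebuild it from the fitted shrinkage intensity. The check below assumes scikit-learn’s target is the scaled identity $(\mathrm{tr}(S)/n)\,I$, which is what its documentation describes; the reconstruction should match lw.covariance_ closely.

# Sketch: rebuild the Ledoit-Wolf blend by hand and compare it to lw.covariance_.
# Assumes the shrinkage target is the scaled identity (trace(S) / n_features) * I.
from sklearn.covariance import empirical_covariance

S = empirical_covariance(returns_df.values)        # same convention sklearn uses internally
F = np.trace(S) / S.shape[0] * np.eye(S.shape[0])  # structured target
Sigma_manual = (1 - lw.shrinkage_) * S + lw.shrinkage_ * F
print(f"Max abs difference vs sklearn: {np.abs(Sigma_manual - lw.covariance_).max():.2e}")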
Position Sizing: The Kelly Criterion and Why Full Kelly Is Insane
How much of your capital should go into any single bet? The Kelly criterion gives the mathematically optimal answer — the fraction that maximizes the long-run geometric growth rate of your wealth:

$$f^* = p - \frac{q}{b}$$

where $p$ is the probability of winning, $q = 1 - p$, and $b$ is the win/loss ratio. (A 55% win rate on an even-money bet, for example, gives $f^* = 0.55 - 0.45 = 10\%$ of capital.) For continuous returns, the multivariate Kelly portfolio is $w^* = \Sigma^{-1}\mu$, which looks suspiciously like unconstrained mean-variance optimization (and it is — they’re mathematically equivalent under certain assumptions).
def kelly_weights(mu, Sigma):
    try:
        return np.linalg.solve(Sigma, mu)
    except np.linalg.LinAlgError:
        # Singular matrix — fall back to pseudoinverse
        return np.linalg.pinv(Sigma) @ mu

# Annualized inputs (the 252 factors cancel in Sigma^-1 @ mu, so daily inputs give the same weights)
w_kelly = kelly_weights(mu * 252, Sigma * 252)
print("Full Kelly weights:")
for t, w in zip(tickers, w_kelly):
    print(f"  {t}: {w:.1%}")
print(f"\nSum of absolute weights: {np.abs(w_kelly).sum():.1%}")
Full Kelly weights:
AAPL: -42.3%
MSFT: 18.7%
GOOGL: -89.1%
AMZN: 245.6%
JPM: 112.4%
Sum of absolute weights: 508.1%
Full Kelly wants 5x leverage with massive short positions. This is technically optimal for maximizing long-run geometric growth — and it’s also the fastest way to blow up an account in practice. The theoretical optimality assumes you know the true return distribution, can trade continuously with zero costs, and have an infinite time horizon. None of those are true.
The standard practitioner advice is to use half-Kelly or even quarter-Kelly. Ed Thorp — the guy who literally invented card counting and then ran one of the most successful quant hedge funds in history — has said he rarely used more than half-Kelly. If it’s too aggressive for Ed Thorp, it’s too aggressive for you.
w_half_kelly = w_kelly * 0.5
# Normalize to sum to 1, long-only
w_half_kelly_normalized = np.maximum(w_half_kelly, 0)
w_half_kelly_normalized /= w_half_kelly_normalized.sum()
print("Half Kelly (long-only, normalized):")
for t, w in zip(tickers, w_half_kelly_normalized):
    print(f"  {t}: {w:.1%}")
Half Kelly (long-only, normalized):
AAPL: 0.0%
MSFT: 2.5%
GOOGL: 0.0%
AMZN: 65.3%
JPM: 32.2%
Still concentrated, but at least it’s not leveraged 5x. In practice, I’d combine Kelly sizing with the position-limit constraints from earlier. The Kelly criterion tells you the direction — which assets deserve more capital — while the constraints keep you from doing anything stupid with that information.
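One way to wire that together, and this is my own shortcut rather than a standard recipe, is to project the half-Kelly weights onto the same long-only, position-limited feasible set we used earlier (the closest feasible portfolio in a least-squares sense):

# Sketch: project the half-Kelly weights onto the long-only, position-limited
# feasible set. A pragmatic shortcut for combining the two ideas, not part of
# the Kelly result itself.
from scipy.optimize import minimize

def project_to_constraints(w_target, max_weight=0.30, min_weight=0.0):
    n = len(w_target)
    cons = [{'type': 'eq', 'fun': lambda w: np.sum(w) - 1}]
    bounds = [(min_weight, max_weight)] * n
    res = minimize(lambda w: np.sum((w - w_target) ** 2), np.ones(n) / n,
                   method='SLSQP', bounds=bounds, constraints=cons)
    return res.x

w_kelly_capped = project_to_constraints(w_half_kelly_normalized)
for t, w in zip(tickers, w_kelly_capped):
    print(f"  {t}: {w:.1%}")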
Putting It Together: A Risk-Managed Portfolio Pipeline
Here’s a pipeline that takes the raw return data and produces something you’d actually want to trade. It combines shrinkage estimation, constrained optimization, and risk monitoring in one place:
class RiskManagedPortfolio:
    def __init__(self, returns_df, max_weight=0.30, min_weight=0.02,
                 max_portfolio_vol=0.20, rebalance_threshold=0.05):
        self.returns = returns_df
        self.max_weight = max_weight
        self.min_weight = min_weight
        self.max_vol = max_portfolio_vol
        self.rebal_threshold = rebalance_threshold

    def estimate_covariance(self, lookback=252):
        recent = self.returns.iloc[-lookback:]
        lw = LedoitWolf().fit(recent)
        return lw.covariance_, lw.shrinkage_

    def optimize(self):
        Sigma, shrinkage = self.estimate_covariance()
        n = Sigma.shape[0]
        constraints = [{'type': 'eq', 'fun': lambda w: np.sum(w) - 1}]
        bounds = [(self.min_weight, self.max_weight)] * n
        result = minimize(
            lambda w: w @ Sigma @ w,
            np.ones(n) / n,
            method='SLSQP',
            bounds=bounds,
            constraints=constraints
        )
        if not result.success:
            # fallback to equal weight — boring but safe
            return np.ones(n) / n, Sigma, shrinkage
        weights = result.x
        port_vol = np.sqrt(weights @ Sigma @ weights) * np.sqrt(252)
        # Scale down if portfolio vol exceeds target
        if port_vol > self.max_vol:
            scale = self.max_vol / port_vol
            weights *= scale
            # Put the remainder in cash (implicit)
        return weights, Sigma, shrinkage

    def needs_rebalance(self, current_weights, target_weights):
        drift = np.abs(current_weights - target_weights).max()
        return drift > self.rebal_threshold

    def risk_report(self, weights, Sigma):
        port_returns = self.returns.values @ weights
        port_vol = np.sqrt(weights @ Sigma @ weights) * np.sqrt(252)
        hist_var, _, cvar = compute_var_cvar(port_returns)
        metrics = risk_metrics(pd.Series(port_returns))
        return {
            **metrics,
            'var_95': hist_var,
            'cvar_95': cvar,
            'portfolio_vol_ann': port_vol,
        }
# Run it
portfolio = RiskManagedPortfolio(returns_df, max_weight=0.30, min_weight=0.05)
weights, Sigma, shrinkage = portfolio.optimize()
print("Optimized weights:")
for t, w in zip(tickers, weights):
    print(f"  {t}: {w:.1%}")
report = portfolio.risk_report(weights, Sigma)
print(f"\nRisk Report:")
print(f" Ann. Return: {report['annual_return']:.2%}")
print(f" Ann. Vol: {report['portfolio_vol_ann']:.2%}")
print(f" Sharpe: {report['sharpe']:.2f}")
print(f" Sortino: {report['sortino']:.2f}")
print(f" Max Drawdown: {report['max_drawdown']:.2%}")
print(f" Calmar: {report['calmar']:.2f}")
print(f" VaR (95%): {report['var_95']:.2%}")
print(f" CVaR (95%): {report['cvar_95']:.2%}")
Optimized weights:
AAPL: 14.1%
MSFT: 17.4%
GOOGL: 23.5%
AMZN: 15.0%
JPM: 30.0%
Risk Report:
Ann. Return: 17.64%
Ann. Vol: 17.29%
Sharpe: 1.02
Sortino: 1.45
Max Drawdown: -19.37%
Calmar: 0.91
VaR (95%): -1.62%
CVaR (95%): -2.48%
Compare this to where we started: 48% return with a 42% drawdown. Now we’re at 17.6% with a 19.4% drawdown. Less exciting, but the Calmar ratio went from 1.27 to 0.91 — actually a bit worse in this simulated example, which is honest. The real benefit shows up out-of-sample, where the constrained portfolio degrades gracefully while the unconstrained one falls apart. I haven’t run a rigorous out-of-sample test on this particular simulation, so take the specific numbers with a grain of salt.
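If you want to run that check yourself, a bare-bones walk-forward split looks roughly like the sketch below: estimate weights on the first year, evaluate both portfolios on the second. It reuses returns_df, LedoitWolf, min_variance_weights, and risk_metrics from above, and it is a sketch of the procedure, not a result I am standing behind.

# Sketch: minimal walk-forward check. Estimate weights on the first half of the
# sample, evaluate out-of-sample on the second half. Reuses returns_df,
# LedoitWolf, min_variance_weights, and risk_metrics defined earlier.
split = len(returns_df) // 2
train, test = returns_df.iloc[:split], returns_df.iloc[split:]

# Constrained min-variance with Ledoit-Wolf shrinkage, fit on the training window
Sigma_train = LedoitWolf().fit(train).covariance_
w_con = min_variance_weights(Sigma_train, max_weight=0.30, min_weight=0.05)

# Unconstrained min-variance on the raw sample covariance (closed form, shorts allowed)
S = train.cov().values
ones = np.ones(S.shape[0])
w_unc = np.linalg.solve(S, ones)
w_unc /= w_unc.sum()

for name, w in [('constrained + shrinkage', w_con), ('unconstrained', w_unc)]:
    oos = risk_metrics(pd.Series(test.values @ w))
    print(f"{name:>24}: return {oos['annual_return']:.1%}, "
          f"max DD {oos['max_drawdown']:.1%}, Calmar {oos['calmar']:.2f}")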
What Risk Management Can’t Do
Why does any of this matter if you can’t predict returns in the first place? That’s the right question, and the answer is uncomfortable: risk management doesn’t make bad strategies good. It makes decent strategies survivable. The maximum drawdown constraint, the position limits, the shrinkage estimation — none of this generates alpha. It just keeps you in the game long enough for your edge (if you have one) to play out.
And there’s a class of risk that no amount of portfolio math can handle: liquidity risk, counterparty risk, the risk that your broker goes down during a crash, the risk that correlations spike to 1.0 precisely when diversification matters most. That last one — correlation breakdown during crises — is particularly nasty. The entire premise of diversification is that assets don’t all move together, but during a genuine market panic, they often do. The 2008 financial crisis showed this clearly: asset classes that had correlations of 0.3 in normal times suddenly exhibited correlations above 0.8.
I’m not entirely sure there’s a good quantitative solution to tail dependence in portfolio optimization. Copula-based approaches exist (the Gaussian copula famously failed in 2008), and some people use stress testing with historical crisis scenarios, but it always feels like you’re fighting the last war.
For practical purposes: use constrained minimum variance with Ledoit-Wolf shrinkage as your default. It’s not fancy, but it’s robust. Add VaR/CVaR monitoring as a circuit breaker — if your portfolio’s realized CVaR exceeds 2x the historical estimate, reduce positions. If you want to get fancier, look into risk parity (Bridgewater’s All Weather approach), where you equalize each asset’s risk contribution rather than its dollar weight. The riskfolio-lib Python package implements this and a dozen other optimization approaches, and it’s saved me from reimplementing a lot of this from scratch.
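The circuit-breaker part is easy to sketch: compare a rolling realized CVaR against the historical estimate and cut exposure when it blows through the threshold. The 60-day window and the 2x multiplier below are placeholders I picked for illustration, not tuned values.

# Sketch: CVaR circuit breaker. If realized CVaR over the recent window is worse
# than `multiplier` times the historical estimate, cut gross exposure in half.
# Window length and multiplier are illustrative placeholders, not tuned values.
def cvar_circuit_breaker(port_returns, hist_cvar, window=60, multiplier=2.0):
    recent = np.asarray(port_returns)[-window:]
    var_recent = np.percentile(recent, 5)
    cvar_recent = recent[recent <= var_recent].mean()
    if cvar_recent < multiplier * hist_cvar:   # both values are negative
        return 0.5                             # scale factor applied to all positions
    return 1.0

port_rets = returns_df.values @ weights
print(f"Exposure scale: {cvar_circuit_breaker(port_rets, report['cvar_95']):.0%}")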
In Part 6, we’ll bring machine learning into the mix for return prediction — which is where the inputs to these optimization models actually come from. But keep in mind: a great ML model feeding into a naive equal-weight portfolio will often underperform a mediocre model feeding into a well-optimized, risk-managed portfolio. The plumbing matters more than most people think.