Chapter 5 – Portfolio Construction & Optimization

Published on 2026-03-03 12:52

# Chapter 5 – Portfolio Construction & Optimization

In the previous chapters we equipped you with the building blocks (data pipelines, feature engineering, and predictive models) that feed into the decision-making core of modern finance. Portfolio construction is where these pieces converge: you turn noisy, model-generated signals into a coherent allocation that balances expected return against risk, while respecting constraints such as ESG mandates or liquidity limits.

> **Why this chapter matters**: A portfolio is not just a bag of assets; it is a *risk-adjusted performance engine*. The quality of your risk estimates and the fidelity of your objective function determine whether your strategy can survive regime shifts, withstand market stress, and stay compliant with regulations.

---

## 5.1 Modern Portfolio Theory Revisited

| Concept | Definition | Why it matters in AI-enhanced finance |
|---------|------------|--------------------------------------|
| Expected Return | The weighted average of asset expected returns | Provides the *objective* to maximize or trade off against risk |
| Covariance Matrix | Pairwise co-movement of asset returns | Forms the *risk* term; critical for diversification |
| Efficient Frontier | Set of portfolios that offer the highest expected return for a given risk level | Baseline for comparing ML-augmented portfolios |

### 5.1.1 From Classical to Data-Driven

- **Classical MPT** is typically applied with *normally distributed* returns and a *constant* covariance matrix. In practice, asset returns are fat-tailed, and correlations shift with market regimes.
- **ML-Enhanced MPT** replaces parametric assumptions with data-driven risk estimates, such as volatility forecast models, regime-switching covariance matrices, or risk embeddings from autoencoders.

> **Key takeaway**: Risk estimates are *just as important* as return forecasts. Even a 10% error in a volatility forecast can shift the optimal allocation substantially, especially among highly correlated assets.

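
To make the takeaway concrete, here is a minimal sketch (not a production optimizer) of how a 10% error in one volatility estimate moves unconstrained mean-variance weights. The two-asset inputs, the 0.9 correlation, and the helper names `cov_from_vols` and `tangency_weights` are illustrative assumptions, not values from this chapter. The weights use the textbook closed form $w \propto \Sigma^{-1}\mu$, rescaled to sum to 1.

```python
import numpy as np


def cov_from_vols(vols, corr):
    """Build a covariance matrix from volatilities and a correlation matrix."""
    return np.outer(vols, vols) * corr


def tangency_weights(mu, cov):
    """Unconstrained mean-variance weights, w proportional to inv(cov) @ mu, scaled to sum to 1."""
    raw = np.linalg.solve(cov, mu)
    return raw / raw.sum()


# Hypothetical inputs: two highly correlated assets with similar expected returns
mu = np.array([0.08, 0.085])
vols = np.array([0.20, 0.20])
corr = np.array([[1.0, 0.9],
                 [0.9, 1.0]])

# Baseline allocation
base = tangency_weights(mu, cov_from_vols(vols, corr))

# Same problem, but the first asset's volatility is overestimated by 10%
shocked_vols = vols * np.array([1.10, 1.00])
shocked = tangency_weights(mu, cov_from_vols(shocked_vols, corr))

print("Baseline weights:", np.round(base, 3))
print("Shocked weights: ", np.round(shocked, 3))
print("Total absolute shift:", round(float(np.abs(shocked - base).sum()), 3))
```

With these made-up inputs, the first asset moves from a long position to a short one after the shock, even though the expected returns are unchanged. Constraints such as long-only bounds typically damp this effect, but the underlying sensitivity to the risk estimate remains.
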

---

## 5.2 ML-Generated Risk Estimates

### 5.2.1 Volatility Forecasting Models

| Model | Typical Input | Strength |
|-------|--------------|----------|
| GARCH(1,1) | Lagged returns | Captures volatility clustering |
| LSTM Auto-Regressor | Historical price & macro series | Handles long-term dependencies |
| Transformer Encoder | Multi-asset price panel | Learns cross-asset dynamics |

```python
import pandas as pd
from arch import arch_model

# Example: GARCH(1,1) for a single asset
# `returns` is assumed to hold daily returns as decimals (0.01 = 1%).
# arch may warn that decimal returns are poorly scaled; rescaling by 100 is a common remedy.
returns = pd.Series(...)
model = arch_model(returns, vol='Garch', p=1, q=1)
res = model.fit(disp='off')

# One-step-ahead variance forecast; the square root gives the volatility
forecast = res.forecast(horizon=1).variance.iloc[-1].values[0] ** 0.5
print(f"Forecasted 1-day vol: {forecast:.2%}")
```

### 5.2.2 Regime-Switching Covariance Matrices

- **Hidden Markov Models (HMM)** can uncover latent market regimes (e.g., bull vs. bear). A separate covariance matrix is estimated for each regime.
- **Switching Kalman Filters** allow continuous regime drift.

```python
from hmmlearn import hmm
import numpy as np

# Suppose `log_returns` is an N x M matrix of N days, M assets
log_returns = np.asarray(log_returns)
model = hmm.GaussianHMM(n_components=2, covariance_type='full', n_iter=200)
model.fit(log_returns)
regimes = model.predict(log_returns)  # most likely regime label per day

# Compute the sample covariance per regime (rows are observations, columns are assets)
cov_matrices = [np.cov(log_returns[regimes == k], rowvar=False) for k in range(2)]
```

> **Practical tip**: Store regime labels alongside risk forecasts; this enables *dynamic re-balancing* that respects changing correlations.

---

## 5.3 Multi-Objective Optimization

Portfolio construction rarely has a single objective. Common dimensions include:

1. **Expected return** (maximize)
2. **Risk** (minimize)
3. **Liquidity** (maximize or penalize thin assets)
4. **Transaction cost** (minimize)
5. **ESG score** (maximize)

### 5.3.1 Weighted Objective Formulation

Let `x` be the vector of portfolio weights.
A generic multi-objective problem can be expressed as:

\[
\min_{x} \; \lambda_1 \cdot \text{Risk}(x) - \lambda_2 \cdot \text{Return}(x) + \lambda_3 \cdot \text{Cost}(x) - \lambda_4 \cdot \text{ESG}(x)
\]

where the weights $\lambda_k \ge 0$ set the relative importance of each term.

Subject to:

- $\sum_i x_i = 1$
- $x_i \ge 0$ (long-only) or $-1 \le x_i \le 1$ (long/short)
- Custom constraints (e.g., sector caps)

### 5.3.2 Solvers

| Solver | Language | Strength |
|--------|----------|----------|
| **CVXPY** | Python | Handles convex problems, easy to prototype |
| **QuadProg** | R (`quadprog` package) | Fast for quadratic objectives |
| **Gurobi** / **CPLEX** | Python/R | Commercial solvers; also handle mixed-integer formulations (e.g., cardinality constraints) |

```python
import cvxpy as cp
import numpy as np

# Parameters
mu = np.array([...])     # expected returns
Sigma = np.array([...])  # covariance matrix (must be positive semidefinite)
cost = np.array([...])   # transaction cost vector
esg = np.array([...])    # ESG score vector

# Decision variable: one weight per asset
x = cp.Variable(len(mu))

# Objective: the coefficients play the role of lambda_1..lambda_4 in 5.3.1
objective = cp.Minimize(
    0.5 * cp.quad_form(x, Sigma)  # risk
    - 1.0 * mu @ x                # return
    + 0.01 * cost @ x             # transaction cost
    - 0.5 * esg @ x               # ESG
)

# Constraints: fully invested, long-only
constraints = [cp.sum(x) == 1, x >= 0]
prob = cp.Problem(objective, constraints)
prob.solve(solver=cp.OSQP)
print("Optimal weights:", x.value)
```

---

## 5.4 ESG Constraints in Practice

### 5.4.1 ESG Score Normalization

ESG scores often come from third-party providers (MSCI, Sustainalytics). Normalizing them to a common scale (0–1) makes them comparable across providers and gives the ESG term a predictable scale in the objective.

```python
import pandas as pd

esg_raw = pd.read_csv('esg_scores.csv')

# Min-max scaling: the lowest score maps to 0, the highest to 1
esg_normalized = (esg_raw['score'] - esg_raw['score'].min()) / (
    esg_raw['score'].max() - esg_raw['score'].min()
)
```

### 5.4.2 Penalty vs. Constraint

- **Penalty**: Add ESG to the objective function (see 5.3.2). Allows trade-off with return.
- **Hard Constraint**: Enforce a minimum portfolio-weighted average ESG score.

```python
# Hard ESG constraint (append before building cp.Problem in the 5.3.2 example)
min_esg = 0.7
# Convert the pandas Series to a NumPy array so `@` builds a cvxpy expression
constraints.append(esg_normalized.to_numpy() @ x >= min_esg)
```

### 5.4.3 Sector-Level Caps

You might cap the weight of a sector to avoid over-concentration, especially if the sector has a lower ESG average.

```python
# One sector label per asset, in the same order as the entries of `x`
sector_labels = pd.Series(...)
max_sector = 0.15
for sector in sector_labels.unique():
    # Integer positions of the assets in this sector (cvxpy indexes by position, not label)
    idx = [i for i, s in enumerate(sector_labels) if s == sector]
    constraints.append(cp.sum(x[idx]) <= max_sector)
```

---

## 5.5 Practical Implementation in Python

We walk through a minimal, reproducible example using `PyPortfolioOpt`.

```python
from pypfopt import EfficientFrontier, risk_models, expected_returns
import pandas as pd

# Load historical prices
prices = pd.read_csv('prices.csv', index_col=0, parse_dates=True)

# 1. Estimate expected returns (mean-historical) and risk (covariance)
mu = expected_returns.mean_historical_return(prices)
Sigma = risk_models.sample_cov(prices)

# 2. Build Efficient Frontier
ef = EfficientFrontier(mu, Sigma)

# 3. Add ESG penalty
esg = pd.read_csv('esg.csv', index_col=0)['esg_score']
# Normalize ESG to [0,1]
esg_norm = (esg - esg.min()) / (esg.max() - esg.min())
# Align scores with the asset order and convert to NumPy for cvxpy
esg_vec = esg_norm.reindex(mu.index).to_numpy()
# Add ESG as a custom objective term (a higher ESG score lowers the objective)
ef.add_objective(lambda w: -0.5 * esg_vec @ w)

# 4. Optimize. max_sharpe() makes a variable substitution, so extra objectives
#    may not behave as expected; quadratic utility is used here instead.
raw_weights = ef.max_quadratic_utility()
cleaned_weights = ef.clean_weights()
print(cleaned_weights)
```

**Output** (illustrative, truncated):

```
{'AAPL': 0.15, 'MSFT': 0.12, 'JPM': 0.10, 'TSLA': 0.05, 'XOM': 0.08, ...}
```

> **Tip**: `PyPortfolioOpt` internally uses `cvxpy`, so any convex `cvxpy` expression can be passed to `add_objective` or `add_constraint` in the same way.

---

## 5.6 Practical Implementation in R

Below is a compact workflow using the `PortfolioAnalytics` package.
```r
library(PortfolioAnalytics)
library(ROI)
library(ROI.plugin.quadprog)

# Load prices and convert to xts so Return.calculate() gets a time series
prices <- read.csv('prices.csv', row.names = 1, check.names = FALSE)
prices <- xts::xts(prices, order.by = as.Date(rownames(prices)))
ret <- na.omit(Return.calculate(prices))

# Portfolio specification
portf <- portfolio.spec(assets = colnames(ret))
portf <- add.constraint(portf, type = 'full_investment')
portf <- add.constraint(portf, type = 'long_only')

# Expected return and risk objectives
portf <- add.objective(portf, type = 'return', name = 'mean')
portf <- add.objective(portf, type = 'risk', name = 'StdDev')

# ESG constraint (hard): the normalized score is treated as a factor exposure
# with a floor of 0.7; scores are aligned to the asset order in `ret`
esg_df <- read.csv('esg.csv', row.names = 1, check.names = FALSE)
esg <- esg_df[colnames(ret), 'esg_score']
esg_norm <- (esg - min(esg)) / (max(esg) - min(esg))
portf <- add.constraint(portf, type = 'factor_exposure',
                        B = esg_norm, lower = 0.7, upper = 1)

# Optimize for max Sharpe (maxSR = TRUE requests the max Sharpe portfolio from ROI)
opt <- optimize.portfolio(ret, portf, optimize_method = 'ROI',
                          maxSR = TRUE, trace = TRUE)
print(opt$weights)
```

---

## 5.7 Case Study: Energy Transition Portfolio

**Scenario**: An institutional investor wants to build a 1-year horizon portfolio that maximizes returns while ensuring an average ESG score of at least 0.75 and limiting exposure to fossil-fuel sectors.

| Step | Action |
|------|--------|
| 1 | Collect 3-year monthly returns for 120 global equities |
| 2 | Train an LSTM to forecast 1-month ahead returns |
| 3 | Estimate volatility with a GARCH(1,1) per asset |
| 4 | Cluster assets into *Renewable*, *Oil & Gas*, *Others* |
| 5 | Formulate a multi-objective problem: maximize expected return (from LSTM); minimize risk (GARCH vol); penalize *Oil & Gas* sector weight above 0.05; require an average ESG score of at least 0.75 |
| 6 | Solve with `cvxpy` in Python |
| 7 | Backtest using historical data; compare with benchmark MSCI World |

**Result**: In the backtest, the ML-driven portfolio outperformed the benchmark by 1.2% annualized return while keeping volatility 3.5% lower and maintaining an average ESG of 0.82.

---

## 5.8 Takeaways

1. **Risk is a moving target**: Leverage ML to produce *dynamic* volatility and correlation estimates that capture regime shifts.
2. **Multi-objective optimization is essential**: Balancing return, risk, liquidity, cost, and ESG yields portfolios that meet business and regulatory goals.
3. **Integrate ESG at the objective or constraint level**: Use hard constraints for policy compliance or penalty terms for flexible trade-offs.
4. **Python and R both offer mature ecosystems**: `PyPortfolioOpt`, `cvxpy`, and `PortfolioAnalytics` let you prototype quickly and scale up.
5. **Always validate**: Backtest with realistic transaction costs and slippage, and audit the optimization process for reproducibility.

> *“A portfolio built on sound risk estimates and disciplined optimization will outperform a purely data-driven strategy that ignores the structure of financial markets.”*

---

*Next chapter: “Model Validation & Backtesting – Turning Signals into Profit.”*
