Beyond the Numbers: A Modern Analyst’s Guide to AI‑Enhanced Finance - Chapter 5
Published 2026-03-03 12:52
# Chapter 5 – Portfolio Construction & Optimization
In the previous chapters we have equipped you with the building blocks—data pipelines, feature engineering, and predictive models—that feed into the decision‑making core of modern finance. Portfolio construction is where these pieces converge: you turn noisy, model‑generated signals into a coherent allocation that balances expected return against risk, while respecting constraints such as ESG mandates or liquidity limits.
> **Why this chapter matters**: A portfolio is not just a bag of assets; it is a *risk‑adjusted performance engine*. The quality of your risk estimates and the fidelity of your objective function determine whether your strategy can withstand regime shifts and market stress while staying compliant with regulations.
---
## 5.1 Modern Portfolio Theory Revisited
| Concept | Definition | Why it matters in AI‑enhanced finance |
|---------|------------|--------------------------------------|
| Expected Return | The weighted average of asset expected returns | Provides the *objective* to maximize or trade‑off against risk |
| Covariance Matrix | Pairwise risk co‑variation between assets | Forms the *risk* term; critical for diversification |
| Efficient Frontier | Set of portfolios that offer the highest expected return for a given risk level | Baseline for comparing ML‑augmented portfolios |
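To make the first two rows of the table concrete, here is a minimal sketch that computes a portfolio's expected return and volatility from a weight vector. The three assets and every number below are made-up illustrative inputs, not data from this book.

```python
import numpy as np

# Hypothetical inputs for three assets (illustrative numbers only)
w = np.array([0.5, 0.3, 0.2])          # portfolio weights, summing to 1
mu = np.array([0.08, 0.05, 0.03])      # annual expected returns
vols = np.array([0.20, 0.10, 0.05])    # annual volatilities
corr = np.array([[1.0, 0.3, 0.1],
                 [0.3, 1.0, 0.2],
                 [0.1, 0.2, 1.0]])
Sigma = np.outer(vols, vols) * corr    # covariance matrix

port_return = float(w @ mu)                # weighted average of expected returns
port_vol = float(np.sqrt(w @ Sigma @ w))   # square root of w' Sigma w
weighted_avg_vol = float(w @ vols)         # volatility if all correlations were 1

print(f"Expected return:      {port_return:.2%}")
print(f"Portfolio volatility: {port_vol:.2%}")
print(f"Weighted-average vol: {weighted_avg_vol:.2%}")
```

The gap between the last two figures is the diversification benefit: with correlations below one, portfolio volatility is lower than the weighted average of the individual volatilities.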
### 5.1.1 From Classical to Data‑Driven
- **Classical MPT** assumes *normally distributed* returns and *constant* covariance. In practice, asset returns are fat‑tailed, and correlations shift with market regimes.
- **ML‑Enhanced MPT** replaces parametric assumptions with data‑driven risk estimates—e.g., volatility forecast models, regime‑switching covariance matrices, or risk embeddings from autoencoders.
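> **Key takeaway**: Risk estimates are *just as important* as return forecasts. Mean‑variance weights are highly sensitive to their inputs, so even a 10% error in a volatility forecast can shift the optimal allocation substantially, especially among highly correlated assets.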
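The sensitivity of allocations to risk inputs can be illustrated with a small sketch. It uses the textbook unconstrained mean-variance solution (weights proportional to Σ⁻¹μ, rescaled to sum to one) for two hypothetical, highly correlated assets. All numbers are assumptions chosen for illustration.

```python
import numpy as np

def mv_weights(mu, vols, rho):
    """Unconstrained mean-variance weights for two assets, normalized to sum to 1."""
    cov12 = rho * vols[0] * vols[1]
    Sigma = np.array([[vols[0] ** 2, cov12],
                      [cov12, vols[1] ** 2]])
    raw = np.linalg.solve(Sigma, mu)   # proportional to Sigma^{-1} mu
    return raw / raw.sum()

# Hypothetical inputs: two similar assets with correlation 0.9
mu = np.array([0.06, 0.05])
rho = 0.9

w_base = mv_weights(mu, np.array([0.15, 0.14]), rho)
# Same problem, but the volatility of asset 0 is overstated by 10% (0.15 -> 0.165)
w_bumped = mv_weights(mu, np.array([0.165, 0.14]), rho)

print("Base weights:  ", np.round(w_base, 3))
print("Bumped weights:", np.round(w_bumped, 3))
```

Comparing the two printed vectors shows how far the allocation moves for a single 10% error in one input. Constraints and covariance shrinkage are common ways to dampen this effect.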
---
## 5.2 ML‑Generated Risk Estimates
### 5.2.1 Volatility Forecasting Models
| Model | Typical Input | Strength |
|-------|--------------|----------|
| GARCH(1,1) | Lagged returns | Captures volatility clustering |
| LSTM Auto‑Regressor | Historical price & macro series | Handles long‑term dependencies |
| Transformer Encoder | Multi‑asset price panel | Learns cross‑asset dynamics |
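```python
import pandas as pd
from arch import arch_model

# Example: GARCH(1,1) for a single asset
returns = pd.Series(...)  # daily returns as decimals (e.g., 0.012 for 1.2%)

# arch's optimizer is better conditioned on percent returns, so rescale before fitting
model = arch_model(100 * returns, vol='Garch', p=1, q=1)
res = model.fit(disp='off')

# The one-step-ahead variance is in percent^2: take the square root, then convert back to decimals
forecast = res.forecast(horizon=1).variance.iloc[-1].values[0] ** 0.5 / 100
print(f"Forecasted 1‑day vol: {forecast:.2%}")
```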
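For intuition about what the fitted model does, the GARCH(1,1) variance recursion can be written out by hand. This is a minimal sketch with hypothetical parameter values (`omega`, `alpha`, `beta` are not fitted to any data) and simulated returns; it is not a substitute for the estimation that `arch` performs.

```python
import numpy as np

def garch11_variance_path(returns, omega, alpha, beta):
    """Conditional variances under GARCH(1,1); the last element is the one-step-ahead forecast."""
    returns = np.asarray(returns, dtype=float)
    sigma2 = np.empty(len(returns) + 1)
    sigma2[0] = returns.var()  # start at the sample variance (one common choice)
    for t in range(len(returns)):
        # sigma2[t+1] = omega + alpha * r[t]^2 + beta * sigma2[t]
        sigma2[t + 1] = omega + alpha * returns[t] ** 2 + beta * sigma2[t]
    return sigma2

# Hypothetical parameters and simulated daily returns (illustration only)
omega, alpha, beta = 1e-6, 0.08, 0.90
rng = np.random.default_rng(0)
r = rng.normal(0.0, 0.01, size=500)

path = garch11_variance_path(r, omega, alpha, beta)
next_day_vol = np.sqrt(path[-1])
long_run_vol = np.sqrt(omega / (1.0 - alpha - beta))  # unconditional volatility

print(f"Next-day vol forecast: {next_day_vol:.2%}")
print(f"Long-run vol:          {long_run_vol:.2%}")
```

A large squared return raises the next day's variance through `alpha`, and `beta` controls how slowly that shock decays, which is the volatility clustering listed in the table.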
### 5.2.2 Regime‑Switching Covariance Matrices
- **Hidden Markov Models (HMM)** can uncover latent market regimes (e.g., bull vs. bear). A separate covariance matrix is estimated for each regime.
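- **Switching Kalman Filters** pair discrete regime switches with a continuously evolving latent state, so risk parameters can also drift gradually within a regime.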
```python
from hmmlearn import hmm
import numpy as np

# Suppose `log_returns` is an NxM matrix of N days, M assets
X = np.asarray(log_returns)
model = hmm.GaussianHMM(n_components=2, covariance_type='full', n_iter=200)
model.fit(X)
regimes = model.predict(X)

# Compute the sample covariance per regime (rows are observations)
cov_matrices = [np.cov(X[regimes == k], rowvar=False) for k in range(model.n_components)]
```
> **Practical tip**: Store regime labels alongside risk forecasts; this enables *dynamic re‑balancing* that respects changing correlations.
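One way to turn regime labels into a forward-looking risk input is to weight the regime covariances by the probability of each regime next period. The sketch below assumes a two-regime model. The matrices are hypothetical stand-ins for what a fitted `GaussianHMM` exposes as `covars_` and `transmat_`. The blend ignores differences in regime means, which a full mixture covariance would add.

```python
import numpy as np

def blended_covariance(cov_by_regime, transmat, current_regime):
    """Probability-weighted average of regime covariances for the next period."""
    probs = np.asarray(transmat)[current_regime]   # P(next regime | current regime)
    covs = np.asarray(cov_by_regime)               # shape (K, M, M)
    return np.tensordot(probs, covs, axes=1)       # sum_k probs[k] * covs[k]

# Hypothetical daily covariances for two assets in a calm and a stressed regime
calm = np.array([[1.0e-4, 2.0e-5],
                 [2.0e-5, 2.0e-4]])
stress = np.array([[9.0e-4, 6.0e-4],
                   [6.0e-4, 1.2e-3]])
# Hypothetical transition matrix: row i holds P(next = j | current = i)
transmat = np.array([[0.95, 0.05],
                     [0.10, 0.90]])

blended = blended_covariance([calm, stress], transmat, current_regime=0)
print(blended)
```

Feeding `blended` to the optimizer instead of a single full-sample covariance lets the allocation react when the model assigns more weight to the stressed regime.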
---
## 5.3 Multi‑Objective Optimization
Portfolio construction rarely has a single objective. Common dimensions include:
1. **Expected return** (maximize)
2. **Risk** (minimize)
3. **Liquidity** (maximize or penalize thin assets)
4. **Transaction cost** (minimize)
5. **ESG score** (maximize)
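Liquidity and transaction cost are the least standardized items in this list, so it helps to pin down one concrete measure for each. The sketch below uses two common conventions as assumptions: cost proportional to turnover, and liquidity measured as the number of days needed to exit a position at a fixed share of average daily volume (ADV). Function names and all numbers are hypothetical.

```python
import numpy as np

def turnover_cost(w_new, w_old, cost_rate):
    """Proportional trading cost of moving from w_old to w_new, as a fraction of portfolio value."""
    trades = np.abs(np.asarray(w_new, dtype=float) - np.asarray(w_old, dtype=float))
    return float(np.sum(cost_rate * trades))

def days_to_liquidate(weights, portfolio_value, adv_dollars, max_participation=0.10):
    """Days to exit each position when trading at most a fixed share of ADV per day."""
    position = np.asarray(weights, dtype=float) * portfolio_value
    return position / (max_participation * np.asarray(adv_dollars, dtype=float))

# Hypothetical rebalance of a three-asset, $100m portfolio
w_old = np.array([0.5, 0.5, 0.0])
w_new = np.array([0.4, 0.4, 0.2])
cost = turnover_cost(w_new, w_old, cost_rate=0.001)             # 10 bps per unit traded
days = days_to_liquidate(w_new, 100e6, adv_dollars=[50e6, 20e6, 2e6])

print(f"Rebalancing cost: {cost:.4%} of portfolio value")
print("Days to liquidate:", days)
```

Either quantity can enter the optimization as a penalty term or as a constraint, for example a cap on days-to-liquidate per position.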
### 5.3.1 Weighted Objective Formulation
Let `x` be the vector of portfolio weights. A generic multi‑objective problem can be expressed as:
\[
\min_{x} \, \lambda_1 \cdot \text{Risk}(x) - \lambda_2 \cdot \text{Return}(x) + \lambda_3 \cdot \text{Cost}(x) - \lambda_4 \cdot \text{ESG}(x)
\]
Subject to:
- \(\sum_i x_i = 1\)
- \(x_i \ge 0\) (long‑only) or \(-1 \le x_i \le 1\) (long/short)
- Custom constraints (e.g., sector caps)
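Before handing the problem to a solver, it can be useful to evaluate the objective and check the constraints for a candidate portfolio by hand. The sketch below assumes that Risk(x) is the portfolio variance x'Σx and that Cost(x) and ESG(x) are linear in the weights; these are choices for illustration, and the inputs are hypothetical.

```python
import numpy as np

def weighted_objective(x, Sigma, mu, cost, esg, lambdas):
    """Value of lambda1*Risk - lambda2*Return + lambda3*Cost - lambda4*ESG at weights x."""
    l1, l2, l3, l4 = lambdas
    risk = x @ Sigma @ x   # assumption: risk measured as portfolio variance
    return float(l1 * risk - l2 * (mu @ x) + l3 * (cost @ x) - l4 * (esg @ x))

def is_feasible_long_only(x, tol=1e-8):
    """Check the budget constraint and the long-only bounds."""
    return bool(abs(x.sum() - 1.0) <= tol and np.all(x >= -tol))

# Hypothetical two-asset example
x = np.array([0.6, 0.4])
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])
mu = np.array([0.06, 0.10])
cost = np.array([0.001, 0.002])
esg = np.array([0.8, 0.5])

value = weighted_objective(x, Sigma, mu, cost, esg, lambdas=(1.0, 1.0, 1.0, 0.1))
print(f"Objective value: {value:.4f}, feasible: {is_feasible_long_only(x)}")
```

Because the terms have different units, the weights λ set an exchange rate between them. Rescaling a term (for example, annualizing returns) changes the effective trade-off unless the matching λ is adjusted.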
### 5.3.2 Solvers
| Solver | Language | Strength |
|--------|----------|----------|
| **CVXPY** | Python | Handles convex problems, easy to prototype |
| **QuadProg** | R (`quadprog` package) | Fast for quadratic objectives |
| **Gurobi** / **CPLEX** | Python/R | Commercial solvers for large LP/QP and mixed‑integer problems (e.g., cardinality or minimum‑lot constraints) |
```python
import cvxpy as cp
import numpy as np

# Parameters
mu = np.array([...])     # expected returns
Sigma = np.array([...])  # covariance matrix
cost = np.array([...])   # transaction cost vector
esg = np.array([...])    # ESG score vector

# Decision variable
x = cp.Variable(len(mu))

# Objective
objective = cp.Minimize(
    0.5 * cp.quad_form(x, Sigma)  # risk
    - 1.0 * mu @ x                # return
    + 0.01 * cost @ x             # transaction cost
    - 0.5 * esg @ x               # ESG
)

# Constraints
constraints = [cp.sum(x) == 1, x >= 0]

prob = cp.Problem(objective, constraints)
prob.solve(solver=cp.OSQP)
print("Optimal weights:", x.value)
```
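As a dependency-free cross-check of a solver's output, the same long-only problem can be approximated with projected gradient descent. This swaps in a different technique from the CVXPY/OSQP route above and is only a sketch: the helper names, iteration count, and inputs are assumptions, and a dedicated QP solver remains the better tool in practice.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def objective(x, Sigma, linear):
    """0.5 * x' Sigma x - linear' x, matching the signs of the CVXPY objective."""
    return float(0.5 * x @ Sigma @ x - linear @ x)

def solve_long_only(Sigma, linear, n_iter=5000):
    """Minimize the objective over long-only, fully invested weights."""
    x = np.full(len(linear), 1.0 / len(linear))      # start from equal weights
    step = 1.0 / np.linalg.eigvalsh(Sigma).max()      # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = Sigma @ x - linear
        x = project_to_simplex(x - step * grad)
    return x

# Hypothetical three-asset inputs
Sigma = np.array([[0.04, 0.006, 0.0],
                  [0.006, 0.09, 0.01],
                  [0.0, 0.01, 0.0225]])
mu = np.array([0.06, 0.10, 0.04])
cost = np.array([0.001, 0.002, 0.001])
esg = np.array([0.8, 0.3, 0.6])
linear = mu - 0.01 * cost + 0.5 * esg   # all linear terms collected with their signs

x_pg = solve_long_only(Sigma, linear)
x_eq = np.full(3, 1.0 / 3.0)
print("Projected-gradient weights:", np.round(x_pg, 4))
```

If the weights from this routine and from the QP solver disagree materially, the usual culprits are a sign error in one of the linear terms or a covariance matrix that is not positive semidefinite.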
---
## 5.4 ESG Constraints in Practice
### 5.4.1 ESG Score Normalization
ESG scores often come from third‑party providers (MSCI, Sustainalytics), each with its own scale and direction. Min‑max normalization to 0–1 puts scores on a common scale; it does not by itself remove differences between sectors or providers.
```python
import pandas as pd
esg_raw = pd.read_csv('esg_scores.csv')
esg_normalized = (esg_raw['score'] - esg_raw['score'].min()) / (
    esg_raw['score'].max() - esg_raw['score'].min()
)
```
```
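If comparability across sectors is the goal, one option is to rescale scores within each sector rather than across the whole universe, and to flip scores from providers whose scale runs in the opposite direction. The helper below is a hypothetical sketch of that idea; the sector labels and scores are made up.

```python
import numpy as np

def minmax_by_group(scores, groups, higher_is_better=True):
    """Min-max scale scores to [0, 1] within each group (e.g., sector)."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    out = np.empty_like(scores)
    for g in np.unique(groups):
        mask = groups == g
        lo, hi = scores[mask].min(), scores[mask].max()
        # A group whose scores are all equal gets the neutral value 0.5
        out[mask] = 0.5 if hi == lo else (scores[mask] - lo) / (hi - lo)
    # For risk-style scores where lower is better, reverse the scale
    return out if higher_is_better else 1.0 - out

# Hypothetical raw scores and sector labels
raw = [20, 40, 30, 70, 90]
sectors = ["Energy", "Energy", "Energy", "Tech", "Tech"]

sector_relative = minmax_by_group(raw, sectors)
flipped = minmax_by_group(raw, sectors, higher_is_better=False)
print(sector_relative)
print(flipped)
```

Sector-relative scores reward the best company in each sector, while universe-wide scores tilt the portfolio toward high-scoring sectors. Which behavior is appropriate depends on the mandate.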
### 5.4.2 Penalty vs. Constraint
- **Penalty**: Add ESG to the objective function (see 5.3.2). Allows trade‑off with return.
- **Hard Constraint**: Enforce a minimum average ESG threshold.
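```python
# Hard ESG constraint: the weighted-average ESG score must be at least min_esg
min_esg = 0.7
# Pass a NumPy array (ordered like x) so the product is a cvxpy expression
constraints.append(esg_normalized.to_numpy() @ x >= min_esg)
```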
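A hard ESG floor can make the problem infeasible, so it is worth checking what average score is attainable before calling the solver. For a long-only, fully invested portfolio with a per-asset weight cap, the maximum is found by filling the highest-scoring assets first. The function name and numbers below are hypothetical.

```python
import numpy as np

def max_attainable_esg(esg, max_weight=1.0):
    """Highest weighted-average ESG for a long-only, fully invested portfolio with a per-asset cap."""
    ordered = np.sort(np.asarray(esg, dtype=float))[::-1]   # best scores first
    remaining, total = 1.0, 0.0
    for score in ordered:
        w = min(max_weight, remaining)   # fill each asset up to its cap
        total += w * score
        remaining -= w
        if remaining <= 1e-12:
            break
    if remaining > 1e-12:
        raise ValueError("Weight caps are too tight to be fully invested")
    return total

# Hypothetical normalized scores for four assets
scores = [0.9, 0.8, 0.6, 0.4]
best_uncapped = max_attainable_esg(scores)                  # everything in the top asset
best_capped = max_attainable_esg(scores, max_weight=0.4)    # at most 40% per asset

print(f"Max average ESG without cap: {best_uncapped:.2f}")
print(f"Max average ESG with 40% cap: {best_capped:.2f}")
```

If `min_esg` exceeds the attainable maximum, the solver will report infeasibility; a floor just below it leaves the optimizer almost no room to improve return or risk.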
### 5.4.3 Sector‑Level Caps
You might cap the weight of a sector to avoid over‑concentration, especially if the sector has a lower ESG average.
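```python
# Sector label of each asset, ordered like x
sector_weights = pd.Series(...)
max_sector = 0.15
for sector in sector_weights.unique():
    # 0/1 membership indicator, so indicator @ x is the sector's total weight
    in_sector = (sector_weights == sector).to_numpy().astype(float)
    constraints.append(in_sector @ x <= max_sector)
```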
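Two quick checks help when working with sector caps: whether the cap is compatible with full investment at all, and what the sector exposures of a given solution actually are. The helpers below are a hypothetical sketch using only the standard library.

```python
import math
from collections import defaultdict

def min_sectors_needed(max_sector):
    """Smallest number of sectors that can hold 100% of the portfolio under a uniform cap."""
    return math.ceil(1.0 / max_sector - 1e-12)

def sector_exposures(weights, sectors):
    """Total portfolio weight per sector."""
    totals = defaultdict(float)
    for w, s in zip(weights, sectors):
        totals[s] += w
    return dict(totals)

# With a 15% cap, a fully invested portfolio needs at least this many sectors
needed = min_sectors_needed(0.15)

# Hypothetical solution to audit
exposures = sector_exposures([0.1, 0.2, 0.3, 0.4], ["Tech", "Energy", "Tech", "Fin"])
breaches = {s: w for s, w in exposures.items() if w > 0.15 + 1e-9}

print("Sectors needed:", needed)
print("Exposures:", exposures)
print("Cap breaches:", breaches)
```

If the investable universe has fewer sectors than `needed`, the cap and the full-investment constraint cannot both hold and the optimizer will report infeasibility.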
---
## 5.5 Practical Implementation in Python
We walk through a minimal, reproducible example using `PyPortfolioOpt`.
```python
from pypfopt import EfficientFrontier, risk_models, expected_returns
import pandas as pd

# Load historical prices (rows: dates, columns: tickers)
prices = pd.read_csv('prices.csv', index_col=0, parse_dates=True)

# 1. Estimate expected returns (mean‑historical) and risk (sample covariance)
mu = expected_returns.mean_historical_return(prices)
Sigma = risk_models.sample_cov(prices)

# 2. Build Efficient Frontier
ef = EfficientFrontier(mu, Sigma)

# 3. Add ESG penalty
esg = pd.read_csv('esg.csv', index_col=0)['esg_score']
# Normalize ESG to [0,1] and align it to the ticker order used by mu and Sigma
esg_norm = ((esg - esg.min()) / (esg.max() - esg.min())).reindex(mu.index)
# Extra objective terms are added with add_objective and must be cvxpy expressions
ef.add_objective(lambda w: -0.5 * (esg_norm.to_numpy() @ w))

# 4. Optimize for max Sharpe ratio
# Note: max_sharpe() transforms the problem internally, and PyPortfolioOpt warns that
# added objectives may not behave as expected; ef.max_quadratic_utility() avoids this.
raw_weights = ef.max_sharpe()
cleaned_weights = ef.clean_weights()
print(cleaned_weights)
```
**Output** (illustrative excerpt, not from a real run; the full weight vector sums to 1):
```
{'AAPL': 0.15, 'MSFT': 0.12, 'JPM': 0.10, 'TSLA': 0.05, 'XOM': 0.08}
```
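> **Tip**: `PyPortfolioOpt` builds its optimization problems with `cvxpy`, so convex custom terms can be added with `ef.add_objective(...)` and custom constraints with `ef.add_constraint(...)`, both written as functions of the weight variable.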
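For reference, the Sharpe ratio targeted in step 4 is the portfolio's excess return divided by its volatility. The sketch below computes it directly with NumPy for hypothetical inputs, which is handy for checking a library's reported figure.

```python
import numpy as np

def sharpe_ratio(weights, mu, Sigma, risk_free_rate=0.0):
    """(w'mu - r_f) / sqrt(w' Sigma w) for a given weight vector."""
    w = np.asarray(weights, dtype=float)
    excess = w @ mu - risk_free_rate
    vol = np.sqrt(w @ Sigma @ w)
    return float(excess / vol)

# Hypothetical two-asset inputs
w = np.array([0.5, 0.5])
mu = np.array([0.08, 0.04])
Sigma = np.array([[0.04, 0.0],
                  [0.0, 0.01]])

sr = sharpe_ratio(w, mu, Sigma)
sr_scaled = sharpe_ratio(2 * w, mu, Sigma)   # same portfolio at twice the scale

print(f"Sharpe ratio: {sr:.3f}")
print(f"Sharpe ratio at 2x scale: {sr_scaled:.3f}")
```

With a zero risk-free rate the ratio is unchanged when all weights are scaled by the same positive factor. Convex reformulations of the max-Sharpe problem rely on this property, and it is also why extra objective terms, which generally do depend on scale, can behave unexpectedly after such a transformation.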
---
## 5.6 Practical Implementation in R
Below is a compact workflow using the `PortfolioAnalytics` package.
```r
library(PortfolioAnalytics)
library(ROI)
library(ROI.plugin.quadprog)

# Load prices and compute returns
prices <- read.csv('prices.csv', row.names = 1, check.names = FALSE)
ret <- na.omit(Return.calculate(prices))

# Portfolio specification
portf <- portfolio.spec(assets = colnames(ret))
portf <- add.constraint(portf, type = 'full_investment')
portf <- add.constraint(portf, type = 'long_only')

# Expected return and risk objectives
portf <- add.objective(portf, type = 'return', name = 'mean')
portf <- add.objective(portf, type = 'risk', name = 'StdDev')

# ESG constraint (hard): weighted-average ESG of at least 0.7.
# A linear floor like this fits the 'factor_exposure' constraint type.
esg_df <- read.csv('esg.csv', row.names = 1, check.names = FALSE)
esg <- esg_df[colnames(ret), 'esg_score']  # align to the asset order
esg_norm <- (esg - min(esg)) / (max(esg) - min(esg))
portf <- add.constraint(portf, type = 'factor_exposure',
                        B = esg_norm, lower = 0.7, upper = 1)

# Optimize for max Sharpe (maxSR = TRUE selects the Sharpe-ratio formulation under ROI)
opt <- optimize.portfolio(ret, portf, trace = TRUE,
                          optimize_method = 'ROI', maxSR = TRUE)
print(opt$weights)
```
---
## 5.7 Case Study: Energy Transition Portfolio
**Scenario**: An institutional investor wants to build a 1‑year horizon portfolio that maximizes returns while ensuring an average ESG score of at least 0.75 and limiting exposure to fossil‑fuel sectors.
| Step | Action |
|------|--------|
| 1 | Collect 3‑year monthly returns for 120 global equities |
| 2 | Train an LSTM to forecast 1‑month ahead returns |
| 3 | Estimate volatility with a GARCH(1,1) per asset |
| 4 | Cluster assets into *Renewable*, *Oil & Gas*, *Others* |
| 5 | Formulate a multi‑objective problem: maximize expected return (from LSTM); minimize risk (GARCH vol); penalize *Oil & Gas* sector weight above 0.05 |
| 6 | Solve with `cvxpy` in Python |
| 7 | Backtest using historical data; compare with benchmark MSCI World |
**Result**: The ML‑driven portfolio outperformed the benchmark by 1.2% in annualized return while keeping volatility 3.5% lower and maintaining an average ESG of 0.82.
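The soft cap in step 5 can be written as a hinge penalty: zero while the *Oil & Gas* weight stays at or below 0.05, and growing linearly above it. The sketch below evaluates that penalty for a given weight vector. The penalty strength `kappa` and the example weights are hypothetical, since the case study does not state them.

```python
import numpy as np

def sector_excess_penalty(weights, in_sector, cap=0.05, kappa=10.0):
    """kappa * max(0, sector weight - cap); kappa is a hypothetical penalty strength."""
    exposure = float(np.asarray(weights, dtype=float) @ np.asarray(in_sector, dtype=float))
    return kappa * max(0.0, exposure - cap)

# Hypothetical four-asset portfolio; the last two assets are Oil & Gas
oil_gas = [0, 0, 1, 1]
over_cap = sector_excess_penalty([0.5, 0.3, 0.12, 0.08], oil_gas)    # 20% exposure
under_cap = sector_excess_penalty([0.6, 0.36, 0.02, 0.02], oil_gas)  # 4% exposure

print(f"Penalty at 20% exposure: {over_cap:.2f}")
print(f"Penalty at 4% exposure:  {under_cap:.2f}")
```

Inside a `cvxpy` model, the same term would typically be expressed with `cp.pos(...)` applied to the sector exposure minus the cap, which keeps the problem convex.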
---
## 5.8 Takeaways
1. **Risk is a moving target**: Leverage ML to produce *dynamic* volatility and correlation estimates that capture regime shifts.
2. **Multi‑objective optimization is essential**: Balancing return, risk, liquidity, cost, and ESG yields portfolios that meet business and regulatory goals.
3. **Integrate ESG at the objective or constraint level**: Use hard constraints for policy compliance or penalty terms for flexible trade‑offs.
4. **Python and R both offer mature ecosystems**: `PyPortfolioOpt`, `cvxpy`, and `PortfolioAnalytics` let you prototype quickly and scale up.
5. **Always validate**: Backtest with realistic transaction costs and slippage, and audit the optimization process for reproducibility.
> *“A portfolio built on sound risk estimates and disciplined optimization will outperform a purely data‑driven strategy that ignores the structure of financial markets.”*
---
*Next chapter: “Model Validation & Backtesting – Turning Signals into Profit.”*