Data Science for Strategic Decision-Making: Turning Analytics into Business Value - Chapter 5
Published 2026-03-01 21:57
# Chapter 5: Prescriptive Analytics & Optimization
Prescriptive analytics takes the insights from predictive models one step further—**it tells you what to do**. By framing decisions as optimization problems, simulation experiments, or reinforcement‑learning policies, data scientists can recommend concrete actions that respect business constraints and maximize strategic objectives.
## 5.1 What is Prescriptive Analytics?
| Term | Definition | Typical Use‑Case |
|------|------------|-----------------|
| Prescriptive Analytics | The science of *suggesting* optimal actions based on data and constraints. | Optimizing inventory levels, fleet routing, dynamic pricing, staffing schedules. |
| Decision Model | A formal representation of the decision problem, including variables, constraints, and an objective. | Linear programming model for supply‑chain allocation. |
| Optimization | Finding the best solution (or a good approximation) to a decision model. | Solving a portfolio allocation that maximizes Sharpe ratio. |
| Simulation | Running a computational experiment to explore *what‑if* scenarios without solving a deterministic model. | Monte‑Carlo inventory simulation under demand uncertainty. |
| Reinforcement Learning (RL) | An ML framework where an agent learns a policy through trial‑and‑error interactions with an environment. | Dynamic pricing where the agent learns price‑elasticity over time. |
Prescriptive analytics sits at the intersection of **operations research**, **simulation science**, and **machine learning**. It requires clear business framing, robust data, and rigorous experimentation—exactly the principles we reinforced in Chapter 4.
## 5.2 Linear Programming (LP) and Integer Programming (IP)
### 5.2.1 Problem Formulation
An LP problem has the general form:
```text
maximize    cᵀx
subject to  Ax ≤ b
            x ≥ 0
```
* **Decision variables** `x` represent quantities to decide (e.g., units to produce, seats to allocate).
* **Objective vector** `c` captures the value or cost associated with each decision.
* **Constraint matrix** `A` and **right‑hand side** `b` encode capacity, budget, or policy limits.
#### Example: Airline Seat Allocation
| Variable | Meaning |
|----------|---------|
| `x₁` | Economy seats sold in Flight A |
| `x₂` | Business seats sold in Flight A |
| `x₃` | Economy seats sold in Flight B |
| `x₄` | Business seats sold in Flight B |
**Objective:** Maximize revenue

```text
max 200x₁ + 400x₂ + 210x₃ + 420x₄
```

**Constraints:** Seat capacities and a minimum share of business seats

```text
x₁ + x₂ ≤ 200            (Flight A capacity)
x₃ + x₄ ≤ 250            (Flight B capacity)
x₂ ≥ 0.10 (x₁ + x₂)      (Business ≥ 10% of Flight A total)
x₄ ≥ 0.10 (x₃ + x₄)      (Business ≥ 10% of Flight B total)
```
This LP can be solved in seconds with solvers like **Gurobi**, **CPLEX**, or open‑source **PuLP**.
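As a minimal sketch, the seat-allocation LP can also be solved with SciPy's open-source `linprog` (HiGHS backend). The coefficients below assume unit seat usage per ticket, i.e., x₁ + x₂ ≤ 200 for Flight A and x₃ + x₄ ≤ 250 for Flight B; `linprog` minimizes, so the revenue vector is negated:

```python
import numpy as np
from scipy.optimize import linprog

# Maximize revenue: linprog minimizes, so negate the objective vector
c = [-200, -400, -210, -420]

# Inequality constraints in A_ub @ x <= b_ub form
A_ub = [
    [1, 1, 0, 0],         # Flight A capacity: x1 + x2 <= 200
    [0, 0, 1, 1],         # Flight B capacity: x3 + x4 <= 250
    [0.10, -0.90, 0, 0],  # x2 >= 0.10*(x1 + x2)  ->  0.1*x1 - 0.9*x2 <= 0
    [0, 0, 0.10, -0.90],  # x4 >= 0.10*(x3 + x4)  ->  0.1*x3 - 0.9*x4 <= 0
]
b_ub = [200, 250, 0, 0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 4, method='highs')
print('Revenue:', -res.fun)  # -> Revenue: 185000.0
print('Seats:', res.x)
```

In this toy model business seats earn more per seat with no upper bound on business demand, so the optimum fills every seat as business class; a realistic model would cap business demand.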
### 5.2.2 From LP to Mixed‑Integer Programming (MIP)
When decisions are discrete (e.g., assigning staff to shifts), **integer variables** are introduced:

```text
xᵢ ∈ {0, 1}
```
MIP is NP‑hard, but modern solvers handle thousands of binary variables for medium‑sized problems.
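As a small illustration (not from the text), `scipy.optimize.milp` (SciPy ≥ 1.9) can solve a toy staff-to-shift assignment with binary variables; the 3×2 cost matrix is invented:

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

cost = np.array([[4, 8], [6, 3], [5, 7]], dtype=float)  # rows: staff, cols: shifts
n_staff, n_shift = cost.shape
c = cost.ravel()  # decision vector x[i*n_shift + j] = 1 if staff i works shift j

# Each shift must be staffed exactly once
A_shift = np.zeros((n_shift, n_staff * n_shift))
for j in range(n_shift):
    A_shift[j, j::n_shift] = 1

# Each staff member works at most one shift
A_staff = np.zeros((n_staff, n_staff * n_shift))
for i in range(n_staff):
    A_staff[i, i * n_shift:(i + 1) * n_shift] = 1

res = milp(
    c,
    constraints=[LinearConstraint(A_shift, 1, 1), LinearConstraint(A_staff, 0, 1)],
    integrality=np.ones_like(c),  # all variables binary (with bounds below)
    bounds=Bounds(0, 1),
)
print(res.x.reshape(n_staff, n_shift))
print('Total cost:', res.fun)  # -> Total cost: 7.0
```

The optimal assignment picks the cheapest feasible pairing (staff 0 to shift 0, staff 1 to shift 1).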
### 5.2.3 Practical Tips
| Tip | Why it matters |
|-----|----------------|
| Inspect the **optimal basis and dual values** | Reveals which constraints are binding and why a solution changed after a data update. |
| Perform **sensitivity analysis** | Shows how objective changes with a parameter shift—critical for risk‑aware decision making. |
| Keep models **modular** | Separates business rules from data ingestion; easier to maintain as policies evolve. |
## 5.3 Simulation Techniques
Simulation lets you test *dynamic* and *uncertain* environments where closed‑form optimization is infeasible.
### 5.3.1 Monte‑Carlo Simulation
1. Define a probability distribution for uncertain inputs (e.g., demand).
2. Sample many scenarios.
3. Run the deterministic decision model for each scenario.
4. Aggregate results (mean, variance, percentiles).
**Tooling:** NumPy, Pandas, and `scipy.stats` for sampling.
```python
import numpy as np

np.random.seed(0)        # reproducible sampling
price, cost = 12.0, 7.0  # unit sell price and unit cost (illustrative values)

# Sample demand from a lognormal distribution
demand = np.random.lognormal(mean=4, sigma=0.5, size=10_000)

# Compute profits for a fixed order quantity q (losses occur when demand is low)
q = 100
profit = price * np.minimum(demand, q) - cost * q
print('Expected profit:', profit.mean())
```
### 5.3.2 Discrete‑Event Simulation
Useful when the system has queues, service times, or stochastic events—e.g., call centers, manufacturing lines.
**Frameworks:** SimPy, Arena, AnyLogic.
```python
import numpy as np
import simpy

def customer(env, name, desk, service_time):
    with desk.request() as req:           # join the queue for the service desk
        yield req
        yield env.timeout(service_time)   # hold the desk for the service duration

env = simpy.Environment()
desk = simpy.Resource(env, capacity=1)    # a single server, so customers queue
for i in range(10):
    env.process(customer(env, f'Customer {i}', desk, np.random.exponential(5)))
env.run(until=100)
```
### 5.3.3 When to Use Simulation
| Scenario | Recommended Method |
|----------|--------------------|
| Static capacity planning | LP/IP |
| Demand uncertainty with simple decision rule | Monte‑Carlo |
| Complex queuing systems | Discrete‑Event |
| Adaptive policies that learn from feedback | RL |
## 5.4 Reinforcement Learning for Dynamic Decision‑Making
RL frames decision making as a **Markov Decision Process (MDP)**:
- **State** `s_t` captures the current environment (e.g., inventory level, time of day).
- **Action** `a_t` is the decision (e.g., price set, stock level).
- **Reward** `r_t` is the immediate profit or cost.
- **Policy** `π(a|s)` maps states to actions.
The goal is to learn a policy that maximizes the expected cumulative reward.
### 5.4.1 Classic RL Algorithms
| Algorithm | Type | Use‑Case |
|-----------|------|----------|
| Q‑Learning | Model‑free | Small discrete action spaces |
| SARSA | Model‑free | On‑policy control |
| DQN | Deep RL | High‑dimensional state space |
| PPO | Policy gradient | Stable training for large problems |
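To make the Q-Learning row concrete, here is a minimal tabular sketch on a hypothetical two-state, two-action MDP (the environment and rewards are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 2, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.1

def step(s, a):
    # Toy MDP: the action chooses the next state; reward 1 only for
    # taking action 1 while already in state 1, so "always act 1" is optimal
    s_next = a
    r = 1.0 if (s == 1 and a == 1) else 0.0
    return s_next, r

s = 0
for _ in range(5_000):
    # epsilon-greedy action selection
    a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
    s_next, r = step(s, a)
    # Q-learning update: bootstrap from the best next-state value
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next

print(Q)  # action 1 ends up with the higher value in both states
```

The learned values approach Q*(1,1) = 1/(1-γ) = 10 and Q*(0,1) = γ·10 = 9.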
### 5.4.2 Practical Example: Dynamic Pricing
```python
import gymnasium as gym   # Stable-Baselines3 >= 2.0 expects the Gymnasium API
import numpy as np
import stable_baselines3 as sb3

# Custom environment: choose one of four price points each step
class PricingEnv(gym.Env):
    def __init__(self):
        super().__init__()
        self.price_range = [10, 20, 30, 40]
        self.action_space = gym.spaces.Discrete(len(self.price_range))
        # e.g., normalized time of day
        self.observation_space = gym.spaces.Box(low=0, high=1, shape=(1,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        return np.zeros(1, dtype=np.float32), {}

    def step(self, action):
        price = self.price_range[action]
        # Simple inverse price-demand relationship; the intercept keeps
        # the Poisson rate non-negative across all four price points
        demand = self.np_random.poisson(max(0.0, 50 - price))
        reward = float(price * demand)
        terminated, truncated = False, False
        return np.zeros(1, dtype=np.float32), reward, terminated, truncated, {}

env = PricingEnv()
model = sb3.PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)
```
### 5.4.3 Deployment Considerations
- **Reward shaping** must align with business KPIs.
- **Exploration vs. exploitation** trade‑offs can be tuned with ε‑greedy or entropy bonuses.
- **Model drift**: retrain periodically as demand patterns change.
- **Explainability**: augment RL with feature importance or rule extraction to satisfy governance.
## 5.5 Integrating Prescriptive Models into the Decision Cycle
| Phase | Prescriptive Activity | Tooling | Key Output |
|-------|-----------------------|---------|------------|
| Problem Definition | Map business constraints to variables | Stakeholder workshops, decision canvas | Decision matrix |
| Data Preparation | Forecast demand, estimate costs | Pandas, Prophet, sklearn | Feature set |
| Modeling | Build LP/IP, set up simulation, train RL | PuLP, SimPy, Stable Baselines | Optimal policies |
| Deployment | API, dashboard, automated pipelines | Flask, FastAPI, Airflow | Decision engine |
| Impact Measurement | Track KPIs vs. baseline | Tableau, Power BI | ROI table |
### 5.5.1 Experimentation Framework
Prescriptive analytics benefits from an **experiment‑oriented** mindset—just like Chapter 4:
1. **Hypothesis Canvas** – Frame the *why* of each decision.
2. **Design Document** – Capture constraints, data sources, evaluation metrics.
3. **A/B or Multi‑armed Bandit Tests** – Compare policy variants in production.
4. **Monitoring Dashboard** – Continuous tracking of objective function values.
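Step 3 can be prototyped offline before any production test; below is a minimal ε-greedy bandit sketch with invented conversion rates for three policy variants:

```python
import numpy as np

rng = np.random.default_rng(42)
true_rates = [0.05, 0.08, 0.11]   # hypothetical conversion rate per policy variant
eps = 0.1
counts = np.zeros(3)
values = np.zeros(3)              # running mean reward per variant

for _ in range(20_000):
    # explore with probability eps, otherwise exploit the current best estimate
    arm = int(rng.integers(3)) if rng.random() < eps else int(values.argmax())
    reward = float(rng.random() < true_rates[arm])
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]   # incremental mean

print(counts)   # traffic typically concentrates on the strongest variant
print(values)   # estimated conversion rates per variant
```

Unlike a fixed A/B split, the bandit shifts traffic toward better-performing variants during the experiment, reducing the opportunity cost of testing.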
## 5.6 Real‑World Case: Optimizing Distribution in a Global Retailer
| Stage | Approach | Result |
|-------|----------|--------|
| Problem | Reduce shipping costs while maintaining service levels. | Targeted cost savings. |
| Data | Historical orders, lead times, shipping rates. | Cleaned and imputed demand. |
| Model | Mixed‑integer program for warehouse‑to‑store routes. | 12% cost reduction. |
| Validation | Monte‑Carlo simulation of demand spikes. | 95% SLA maintained. |
| Deployment | REST API integrated into order‑routing system. | Real‑time routing decisions. |
| Impact | $3M annual savings; 15% reduction in carbon footprint. | Tangible strategic value. |
## 5.7 Practical Checklist for Prescriptive Analytics Projects
| Checklist Item | Why It Matters | Suggested Tool/Practice |
|-----------------|----------------|------------------------|
| Clear objective | Avoids “model chasing” | Define KPI, success metric upfront |
| Constraint validation | Ensures feasibility | Constraint sanity checks, solver diagnostics |
| Sensitivity analysis | Understand risk | `scipy.optimize.linprog` sensitivity, MIP solvers’ shadow prices |
| Robust data pipeline | Prevents stale inputs | Airflow DAGs, dbt models |
| Explainability layer | Builds trust | Sensitivity reports and shadow prices for LP, policy extraction for RL |
| Governance review | Meets regulatory compliance | Data lineage, audit logs |
## 5.8 Takeaways
1. **Prescriptive analytics turns data into action** by formally encoding business goals as mathematical programs.
2. **Linear programming** is fast and interpretable; **simulation** handles stochastic, dynamic systems; **reinforcement learning** offers adaptive, data‑driven policies for complex environments.
3. **Experimentation remains key**—even in optimization, you must validate assumptions, monitor performance, and iterate.
4. **Alignment between business, data, and model**—captured early in the hypothesis canvas—ensures that the prescribed actions deliver real, measurable value.
---
**Next Chapter**: *Natural Language & Unstructured Data for Competitive Intelligence*—where we’ll explore how text mining and sentiment analysis can feed into the decision models we just built.