ML Modeling 101: Principles & Practice

A series on the foundational ideas behind machine learning: how to think about problems, design experiments, and operate models in production. Each post focuses on a principle rather than a tool. Examples are drawn from credit risk and other real-world classification and regression problems.

Series: ML Modeling 101

#	Post	What it covers
0	The Philosophy of Modeling: A Controlled Approximation	Models don't need to be right — they need to be useful. Parsimony and three questions to answer before you start
1	Problem Formulation: Getting the Question Right Before Choosing a Model	Translating a business problem into an ML problem. Target variable, unit of observation, horizon
2	Data Understanding: How Your Data Was Born Matters	Data Generating Process, population vs sample, EDA as hypothesis testing
3	Feature Engineering: Where Domain Knowledge Becomes Signal	Signal vs noise, transformation trade-offs, and how to prevent leakage (the most dangerous mistake in ML)
4	Experimental Design: The Split Determines the Conclusion	Train/val/test roles, no peeking, cross-validation, temporal split
5	Model Selection: No Free Lunch	No Free Lunch Theorem, inductive bias, bias-variance tradeoff
6	Training & Optimization: Minimizing a Proxy of the Real Objective	Loss functions, gradient descent, regularization as a prior belief
7	Model Evaluation: Measure What You Actually Need to Optimize	Classification metrics, calibration, slice-based evaluation, metric hacking
8	Hyperparameter Tuning: Searching with a Strategy	Grid search, random search, Bayesian optimization, overfitting during hyperparameter search
9	Interpretability & Explainability: Models Must Be Trusted	SHAP, LIME, PDP/ICE, WoE scorecard, global vs local explanation
10	Productionization & Monitoring: Models Decay Over Time	Training-serving skew, data drift vs concept drift, PSI, retraining strategy
11	The Modeling Mindset: Synthesis	Iterative process, every decision is a hypothesis, domain knowledge beats algorithms

See also the caketool API Reference.