
ML Modeling 101: Principles & Practice

A series on the foundational ideas behind machine learning: how to think about problems, design experiments, and operate models in production. Each post focuses on a principle rather than a tool. Examples are drawn from credit risk and other real-world classification and regression problems.


Series: ML Modeling 101

| # | Post | What it covers |
|---|------|----------------|
| 0 | The Philosophy of Modeling: A Controlled Approximation | Models don't need to be right; they need to be useful. Parsimony and three questions to answer before you start |
| 1 | Problem Formulation: Getting the Question Right Before Choosing a Model | Translating a business problem into an ML problem. Target variable, unit of observation, horizon |
| 2 | Data Understanding: How Your Data Was Born Matters | Data Generating Process, population vs sample, EDA as hypothesis testing |
| 3 | Feature Engineering: Where Domain Knowledge Becomes Signal | Signal vs noise, transformation trade-offs, and preventing leakage, the most dangerous mistake in ML |
| 4 | Experimental Design: The Split Determines the Conclusion | Train/val/test roles, no peeking, cross-validation, temporal split |
| 5 | Model Selection: No Free Lunch | No Free Lunch Theorem, inductive bias, bias-variance tradeoff |
| 6 | Training & Optimization: Minimizing a Proxy of the Real Objective | Loss functions, gradient descent, regularization as a prior belief |
| 7 | Model Evaluation: Measure What You Actually Need to Optimize | Classification metrics, calibration, slice-based evaluation, metric hacking |
| 8 | Hyperparameter Tuning: Searching with a Strategy | Grid search, random search, Bayesian optimization, overfitting hyperparameters |
| 9 | Interpretability & Explainability: Models Must Be Trusted | SHAP, LIME, PDP/ICE, WoE scorecard, global vs local explanation |
| 10 | Productionization & Monitoring: Models Decay Over Time | Training-serving skew, data drift vs concept drift, PSI, retraining strategy |
| 11 | The Modeling Mindset: Synthesis | Iterative process, every decision is a hypothesis, domain knowledge beats algorithms |

The caketool API is documented in the API Reference.