Backfilling and Catchup: Reprocessing Historical Data
Every data pipeline will eventually need to reprocess historical data. A bug is discovered that corrupted two months of records. A business rule changes retroactively. A new model is added that needs to be populated from the beginning. The pipeline was down for three days and missed scheduled runs.
How you handle these situations — and whether your pipeline was designed to support them safely — determines whether reprocessing is a controlled operation or an emergency.