The Right Problem Framing

1. Verify there is quantifiable business value in solving the problem.
2. Verify that simpler alternatives (such as hand-crafted heuristics) are not sufficient to address the problem.
3. Ensure that the problem has been decomposed into the smallest possible units.
4. Ensure clarity on how the AI output will be applied to accomplish the desired business outcome.
5. Define clear, measurable metric(s) for the success of the solution.
6. Ensure a clear understanding of the precision-versus-recall tradeoff of the problem.
7. Verify the impact when the classification prediction is incorrect.
8. Ensure project costs include the cost of managing the corresponding data pipelines.

The Right Dataset

9. Verify the meaning of the dataset attributes.
10. Verify that the derived metrics used in the project are standardized.
11. Verify data from the warehouse or lake is not stale due to data pipeline errors.
12. Verify the schema compliance of the dataset.
13. Verify the datasets comply with data rights regulations (such as GDPR, CCPA, etc.).
14. Ensure there is a clear change-management process for dataset schema changes.
15. Verify the dataset is not biased.
16. Verify the datasets being used are not orphaned, i.e., without data stewards.

The Right Data Preparation

17. Verify the data is IID (independent and identically distributed).
18. Verify expired data is not used, i.e., historical data values that may no longer be relevant.
19. Verify there are no systematic errors in data collection.
20. Verify the dataset is monitored for sudden distribution changes (see the drift sketch after this checklist).
21. Verify seasonality in the data (if applicable) is correctly taken into account.
22. Verify the data is randomized before splitting into training and test sets (see the split sketch after this checklist).
23. Verify there are no duplicates between test and training examples.
24. Ensure sampled data is statistically representative of the dataset as a whole.
25. Verify the correct use of normalization and standardization for scaling feature values.
26. Verify outliers have been properly handled.
27. Verify proper sampling is used when selecting examples from a large dataset.

The Right Design

28. Ensure feature crosses are experimented with before jumping to non-linear models (as applicable).
29. Verify there is no feature leakage.
30. Verify new features are added to the model with documented justification for how they increase model quality.
31. Verify features are correctly scaled.
32. Verify simpler traditional ML models are tried before using deep learning.
33. Ensure hashing is applied to sparse features (as applicable).
34. Verify dimensionality reduction has been experimented with.
35. Verify classification threshold tuning (in logistic regression) takes business impact into account (see the threshold sketch after this checklist).
36. Verify regularization or early stopping is applied in logistic regression (as applicable).
37. Apply embeddings to translate large sparse vectors into a lower-dimensional space (while preserving semantic relationships).
38. Verify model freshness requirements based on the problem requirements.
39. Verify the impact of features that were discarded because they apply only to a small fraction of the data.
40. Check that the feature count is proportional to the amount of data available for model training.

The Right Training

41. Ensure interpretability is not compromised prematurely for performance during the early stages of model development.
42. Verify model tuning follows a scientific approach (instead of being ad hoc).
43. Verify the learning rate is not too high.
44. Verify root causes are analyzed and documented if the loss-versus-epoch graph is not converging.
45. Analyze the specificity-versus-sparsity trade-off and its effect on model accuracy.
46. Verify that reducing the loss value actually improves recall/precision.
47. Define clear criteria for starting online experimentation, i.e., canary deployment.
48. Verify per-class accuracy in multi-class classification.
49. Verify infrastructure capacity or cloud budget is allocated for training.
50. Ensure model permutations are verified using the same datasets (for an apples-to-apples comparison).
51. Verify model accuracy not just for the overall dataset but also for individual segments/cohorts.
52. Verify the training results are reproducible, i.e., snapshot the code (algorithm), data, config, and parameter values.
53. Verify there is no training-serving skew for features (see the skew sketch after this checklist).
54. Verify feedback loops in model prediction have been analyzed.
55. Verify there is a backup plan if the online experiment does not go as expected.
56. Verify that the model has been calibrated (see the calibration sketch after this checklist).
57. Leverage automated hyperparameter tuning (as applicable).
58. Verify prediction bias has been analyzed.
59. Verify the dataset has been analyzed for class imbalance.
60. Verify the regularization lambda has been experimented with to balance model simplicity and training-data fit.
61. Verify the same test samples are not being used over and over for test and validation.
62. Verify the batch size hyperparameter is not too small.
63. Verify the initialization values in neural networks.
64. Verify the details of failed experiments are captured.
65. Verify the impact of wrong labels before investing in fixing them.
66. Verify a consistent set of metrics is used to analyze the results of online experiments.
67. Verify multiple hyperparameters are not tuned at the same time.

The Right Operational Readiness

68. Verify data pipelines for generating time-dependent features are performant enough to meet low-latency requirements.
69. Verify validation tests exist for data pipelines.
70. Verify model performance for individual data slices.
71. Avoid using two different programming languages between training and serving.
72. Ensure appropriate model scaling such that inference latency stays within the required threshold.
73. Verify data quality inconsistencies are checked at the source, at ingestion into the lake, and during ETL processing.
74. Verify cloud spend associated with the AI product is within budget.
75. Ensure there is an optimization phase to balance quality with model depth and width.
76. Verify monitoring for data and concept drift (see the drift sketch after this checklist).
77. Verify unnecessary calibration layers have been removed.
78. Verify there is monitoring to detect slow poisoning of the model due to intermittent errors.

Original article by Sandeep Uttamchandani
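
Split sketch (items 22-23). A minimal sketch, assuming the data lives in a pandas DataFrame with a label column named "label" (both the DataFrame and the column name are illustrative): it shuffles before splitting and then checks that no example leaks from the training split into the test split.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

def split_without_leakage(df: pd.DataFrame, label_col: str = "label",
                          test_size: float = 0.2, seed: int = 42):
    # Item 23: remove exact duplicate rows so the same example cannot end up
    # in both the training split and the test split.
    df = df.drop_duplicates().reset_index(drop=True)

    # Item 22: shuffle=True randomizes rows before splitting; stratify keeps
    # the label distribution comparable across the two splits.
    train_df, test_df = train_test_split(
        df, test_size=test_size, shuffle=True, random_state=seed,
        stratify=df[label_col])

    # Extra check: no identical feature vector appears in both splits
    # (e.g., two rows that differ only in their label).
    feature_cols = [c for c in df.columns if c != label_col]
    overlap = train_df[feature_cols].merge(test_df[feature_cols], how="inner")
    if not overlap.empty:
        raise ValueError(f"{len(overlap)} examples appear in both splits")
    return train_df, test_df
```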
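
Threshold sketch (item 35). A minimal sketch, assuming held-out labels `y_true` and predicted positive-class probabilities `y_prob` are already available; the false-positive and false-negative costs are hypothetical placeholders to be replaced with numbers from the actual business case. Sweeping the threshold against an explicit cost model makes the business trade-off visible instead of defaulting to 0.5.

```python
import numpy as np

def pick_threshold(y_true, y_prob, cost_fp: float = 1.0, cost_fn: float = 10.0):
    """Choose the classification threshold that minimizes expected business cost.

    cost_fp / cost_fn are illustrative per-error costs of a false positive /
    false negative.
    """
    y_true, y_prob = np.asarray(y_true), np.asarray(y_prob)
    best_t, best_cost = 0.5, float("inf")
    for t in np.linspace(0.01, 0.99, 99):
        y_pred = (y_prob >= t).astype(int)
        fp = np.sum((y_pred == 1) & (y_true == 0))
        fn = np.sum((y_pred == 0) & (y_true == 1))
        cost = fp * cost_fp + fn * cost_fn
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost
```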
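
Skew sketch (item 53). A minimal sketch, assuming two pandas DataFrames with the same numeric feature columns: `train_df` (feature values used at training time) and `serving_df` (the same features as logged at serving time); the 0.25 standard-deviation threshold is an illustrative heuristic, not a standard.

```python
import pandas as pd

def feature_skew_report(train_df: pd.DataFrame, serving_df: pd.DataFrame,
                        threshold: float = 0.25) -> pd.DataFrame:
    """Flag features whose serving-time mean drifts by more than `threshold`
    training-time standard deviations from the training-time mean."""
    rows = []
    for col in train_df.columns.intersection(serving_df.columns):
        mu, sigma = train_df[col].mean(), train_df[col].std()
        drift = abs(serving_df[col].mean() - mu) / (sigma + 1e-9)
        rows.append({"feature": col, "drift_in_stds": drift,
                     "skewed": drift > threshold})
    return pd.DataFrame(rows).sort_values("drift_in_stds", ascending=False)
```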
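
Calibration sketch (items 56 and 58). A minimal sketch, assuming held-out labels `y_true` and predicted positive-class probabilities `y_prob` for a binary classifier; it uses scikit-learn's `calibration_curve` to compare predicted probabilities with observed frequencies per bin, and also reports the overall prediction bias (mean prediction versus base rate).

```python
import numpy as np
from sklearn.calibration import calibration_curve

def calibration_report(y_true, y_prob, n_bins: int = 10):
    """A well-calibrated model keeps the per-bin gap between predicted and
    observed probabilities small."""
    frac_pos, mean_pred = calibration_curve(y_true, y_prob, n_bins=n_bins)
    for p, f in zip(mean_pred, frac_pos):
        print(f"predicted={p:.2f}  observed={f:.2f}  gap={abs(p - f):.2f}")
    # Item 58: overall prediction bias.
    print(f"mean prediction={np.mean(y_prob):.3f}  base rate={np.mean(y_true):.3f}")
```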
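
Drift sketch (items 20 and 76). A minimal sketch, assuming `expected` is a baseline (training-time) sample of one numeric feature and `actual` is a recent (serving-time) sample; the Population Stability Index (PSI) is a common drift statistic, and a frequently used rule of thumb treats values above roughly 0.2 as a significant shift. In practice this would run per feature on a schedule and feed an alerting system.

```python
import numpy as np

def population_stability_index(expected, actual, n_bins: int = 10) -> float:
    """PSI between a baseline sample and a recent sample of one feature;
    larger values indicate a stronger distribution shift."""
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)
    lo = min(expected.min(), actual.min())
    hi = max(expected.max(), actual.max())
    edges = np.linspace(lo, hi, n_bins + 1)
    e_frac = np.histogram(expected, bins=edges)[0] / len(expected)
    a_frac = np.histogram(actual, bins=edges)[0] / len(actual)
    # Avoid division by zero / log of zero in empty bins.
    e_frac = np.clip(e_frac, 1e-6, None)
    a_frac = np.clip(a_frac, 1e-6, None)
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))
```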