A regional bank with $12 billion in assets wanted to use machine learning to improve its commercial loan underwriting process. The existing process was manual, relying on credit analysts who spent four to six hours per application evaluating financial statements, industry risk factors, and borrower history. The bank’s leadership believed that a machine learning model could reduce underwriting time, improve consistency, and surface risk factors that human analysts sometimes missed.
The problem was that commercial lending is one of the most heavily regulated activities in financial services. The bank’s regulator — the OCC — had issued guidance requiring that AI-based credit decisions be explainable, auditable, and free from prohibited discrimination. The bank’s compliance team had seen other institutions deploy AI models that passed internal review but failed regulatory examination because the institution could not explain how the model reached its decisions.
The CTO was direct about the constraint: “We will not deploy a model that we cannot explain to a regulator in plain language, with specific references to the input factors that drove a specific decision.” This was not a preference. It was a deployment gate.
The regulatory requirements
The bank faced three categories of regulatory constraint. First, model explainability. Fair lending rules (in the United States, the Equal Credit Opportunity Act and its implementing Regulation B) require that when a borrower is denied credit or offered unfavorable terms, the bank provide specific reasons. “The model decided” is not a valid reason. The bank must identify the input factors that most influenced the decision and communicate them in language that the borrower and the regulator can understand.
Second, fair lending compliance. The model must not produce outcomes that discriminate on prohibited bases: race, gender, national origin, or other protected characteristics. This applies not only to direct inputs but also to proxy variables. A model that uses zip code as a feature may produce disparate impact on borrowers from majority-minority neighborhoods, even if race is not explicitly included.
Third, model risk management. OCC guidance requires that institutions maintain a model inventory, document the development and validation process, and conduct ongoing monitoring for performance degradation and concept drift. The documentation must be sufficient for an independent party to reproduce the model’s behavior.
What the team tried first
The data science team’s first prototype was a gradient boosted tree model trained on five years of historical loan performance data. The model achieved an AUC of 0.87 on held-out test data, a meaningful improvement over the human analyst baseline of 0.74. The team was excited about the performance.
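For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen loan that defaulted receives a higher risk score than a randomly chosen loan that performed. The sketch below computes it by direct pairwise comparison. The function name, the label encoding, and the toy data are illustrative assumptions, not the bank's evaluation code.

```python
def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    labels: 1 = loan defaulted, 0 = loan performed (hypothetical encoding).
    scores: model risk scores, higher = riskier. Ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up toy data, for illustration only.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.4, 0.5, 0.1, 0.8]
toy_auc = auc(labels, scores)  # 8 of 9 pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which 0.74 and 0.87 should be read.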
The compliance team rejected the model in pre-review. The model could not produce the specific reason codes required for adverse action notices. The team added SHAP values to provide per-prediction feature attributions. The compliance team rejected this as well. SHAP values show how much each feature pushed a given prediction up or down, but a signed attribution does not map cleanly to a regulatory reason code. A negative SHAP value for “revenue” tells you that revenue counted against the borrower. It does not tell you whether the borrower was denied because revenue was too low, too volatile, or too concentrated.
The fair lending analysis revealed a second problem. The model’s most important feature was a composite score that incorporated the borrower’s industry, geography, and business vintage. This composite was highly correlated with the racial composition of the borrower’s customer base. The model was not using race as an input, but it was using a feature that functioned as a proxy for race. The disparate impact analysis showed that the model’s denial rate for borrowers in majority-minority census tracts was 2.3 times the rate for borrowers in other tracts, even after controlling for credit quality.
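A disparate impact check of this kind can be sketched as a denial-rate ratio computed within credit-quality bands, which is one crude way to hold credit quality roughly constant. The band labels, group names, and counts below are invented for illustration; the source does not describe how the bank's analysis controlled for credit quality.

```python
from collections import defaultdict


def denial_rate_ratios(records, protected="majority_minority", reference="other"):
    """Return {credit_band: protected denial rate / reference denial rate}.

    records: iterable of (credit_band, tract_group, denied) tuples, where
    denied is True or False. All names here are hypothetical.
    """
    counts = defaultdict(lambda: [0, 0])  # (band, group) -> [denied, total]
    for band, group, denied in records:
        counts[(band, group)][0] += int(denied)
        counts[(band, group)][1] += 1

    ratios = {}
    for band in {b for b, _ in counts}:
        p_den, p_tot = counts[(band, protected)]
        r_den, r_tot = counts[(band, reference)]
        if p_tot == 0 or r_tot == 0 or r_den == 0:
            continue  # ratio is undefined for this band
        ratios[band] = (p_den / p_tot) / (r_den / r_tot)
    return ratios


# Invented counts: band "A" has 23% vs 10% denials, band "B" has 60% vs 40%.
records = (
    [("A", "majority_minority", True)] * 23
    + [("A", "majority_minority", False)] * 77
    + [("A", "other", True)] * 10
    + [("A", "other", False)] * 90
    + [("B", "majority_minority", True)] * 30
    + [("B", "majority_minority", False)] * 20
    + [("B", "other", True)] * 20
    + [("B", "other", False)] * 30
)
ratios = denial_rate_ratios(records)
```

A ratio well above 1 inside a band, where applicants have similar credit quality, is the pattern that points to a proxy feature rather than to a difference in creditworthiness.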
The approach: constraint-first model design
We redesigned the model architecture to embed regulatory constraints into the model’s structure rather than bolting explainability and fairness analysis onto a black-box model after training.
The constraint filter was the first gate. Every feature that entered the model was screened for correlation with protected characteristics. Features with correlation above a threshold were either excluded or decomposed into sub-features that captured the credit-relevant signal without the discriminatory proxy. Geography was decomposed into economic indicators — median income, business density, infrastructure quality — rather than raw zip code.
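A minimal sketch of such a screen, assuming a simple Pearson correlation against a binary protected-class indicator. The threshold of 0.5, the feature names, and the data are hypothetical; the source does not state which statistic or cutoff the team used, and a linear correlation is only a first-pass test for proxies.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(vx * vy)


def screen_features(features, protected_attr, threshold=0.5):
    """Split feature names into (kept, flagged) by |correlation| with the
    protected attribute. Flagged features need exclusion or decomposition."""
    kept, flagged = [], []
    for name, values in features.items():
        r = pearson(values, protected_attr)
        (flagged if abs(r) > threshold else kept).append(name)
    return kept, flagged


# Invented data: 1 = borrower located in a majority-minority tract.
protected_attr = [1, 1, 1, 0, 0, 0]
features = {
    "zip_code_index": [9, 8, 9, 2, 1, 2],
    "debt_service_coverage": [1.2, 1.8, 1.5, 1.3, 1.7, 1.5],
}
kept, flagged = screen_features(features, protected_attr)
```

In this toy data the zip-code feature is flagged and the coverage ratio is kept. A flagged feature would then be dropped or broken into sub-features such as the economic indicators described above, and the sub-features screened again.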
The monotonic model was the second gate. Instead of a gradient boosted tree with unrestricted feature interactions, the team used a generalized additive model with monotonic constraints on key features. Monotonic constraints enforce a predictable relationship between a feature and the output: if revenue increases while the other inputs stay the same, the credit score must not decrease. This makes the model’s behavior predictable and explainable. A borrower can be told: “Your application was declined because annual revenue fell below the threshold for this loan product.” The reason is specific, accurate, and actionable.
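The source does not say how the constraints were enforced. One standard way to obtain a non-decreasing shape function for a single feature in an additive model is isotonic regression via the pool-adjacent-violators algorithm, sketched below. The revenue bins and repayment rates are invented for illustration.

```python
def pava_non_decreasing(values, weights=None):
    """Pool-adjacent-violators: the non-decreasing sequence closest to
    `values` in weighted least squares."""
    if weights is None:
        weights = [1.0] * len(values)
    blocks = []  # each block is [mean, total_weight, item_count]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Merge backwards while the newest block breaks the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Invented observed repayment rates per revenue bin, lowest bin first.
observed = [0.60, 0.72, 0.68, 0.85, 0.90]
shape = pava_non_decreasing(observed)
# The dip from 0.72 to 0.68 is pooled to a shared value of about 0.70,
# so a higher revenue bin never maps to a lower score contribution.
```

The pooling step is where predictive detail is traded for predictability: the fitted curve ignores the local dip in the data in exchange for a relationship that can be stated in one sentence.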
The reason code generator was the third gate. For every decision, the system identified the top three features that drove the outcome and mapped them to regulatory reason codes. The mapping was deterministic — not a post-hoc explanation of an opaque model, but a direct readout of the constrained model’s decision logic.
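Because an additive model's score is a sum of per-feature contributions, the readout can be sketched as a sort followed by a table lookup. The reason code identifiers, their wording, the feature names, and the contribution values below are all hypothetical; only the "top three features mapped to reason codes" logic comes from the description above.

```python
# Hypothetical mapping from model features to adverse action reason codes.
REASON_CODES = {
    "annual_revenue": "R01: Annual revenue below the threshold for this loan product",
    "debt_service_coverage": "R02: Insufficient cash flow to service the requested debt",
    "years_in_business": "R03: Limited operating history",
    "leverage_ratio": "R04: Excessive leverage",
}


def top_reason_codes(contributions, k=3):
    """Return reason codes for the k features that lowered the score most.

    contributions: {feature_name: signed contribution to the credit score},
    read directly from the additive model's per-feature terms.
    """
    adverse = [(name, c) for name, c in contributions.items() if c < 0]
    adverse.sort(key=lambda item: item[1])  # most negative first
    return [REASON_CODES[name] for name, _ in adverse[:k]]


# Invented contributions for one declined application.
contributions = {
    "annual_revenue": -0.9,
    "debt_service_coverage": -0.4,
    "years_in_business": 0.2,
    "leverage_ratio": -0.6,
}
reasons = top_reason_codes(contributions)
```

The same input always yields the same codes in the same order, which is what makes the output suitable for an adverse action notice and for an examiner who wants to reproduce a decision.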
What we gave up
The monotonic model achieved an AUC of 0.81, compared to 0.87 for the unconstrained gradient boosted tree. The six-point gap represented predictive power that the constrained model could not capture: specifically, non-linear interactions between features that were credit-relevant but could not be expressed as monotonic relationships.
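The gap can be stated in two ways, and it helps to keep them apart: six points of AUC in absolute terms, or roughly seven percent relative to the unconstrained model's AUC.

```latex
\[
0.87 - 0.81 = 0.06,
\qquad
\frac{0.87 - 0.81}{0.87} \approx 0.069
\]
```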
The team accepted this gap after analyzing the business impact. The unconstrained model’s extra predictive power translated to approximately three additional approved loans per month that the constrained model would deny. The constrained model’s explainability and fairness properties, however, meant that the bank could actually deploy it. The unconstrained model was more accurate on paper and unusable in practice.
The second trade-off was development time. The constrained model took four months to develop and validate, compared to six weeks for the unconstrained prototype. The additional time was spent on feature screening, monotonic constraint design, disparate impact testing, and regulatory documentation.
Results
The model passed OCC examination on its first review. The examiner’s report noted that the bank’s model documentation was among the most thorough they had reviewed, and the reason code mapping was directly usable for adverse action notice generation. The bank was not required to make any changes to the model or its documentation after the examination.
Underwriting time for routine commercial loans dropped from the previous four to six hours to under thirty minutes. The model handled the initial assessment and produced a draft decision with reason codes. A credit analyst reviewed the draft, validated the reason codes against the borrower’s file, and issued the final decision. The analyst’s role shifted from manual evaluation to model oversight, a higher-value activity that the analysts preferred.
Loan portfolio performance improved by four percent on risk-adjusted returns in the first year, measured against a matched cohort of loans underwritten during the same period by human analysts alone.
The decision heuristic
If your AI model cannot pass a regulatory examination as-is, do not build the model first and add compliance later. Build the compliance architecture first and fit the model inside it. The constrained model will be less accurate than the unconstrained alternative. That is the point. A model that is seven percent less accurate but deployable produces more business value than a model that is seven percent more accurate but cannot leave the lab. The regulatory constraint is not a limitation to work around. It is a design parameter that shapes the solution space.