What machine learning can bring to credit risk management

Today’s credit risk management strategies rely mostly on traditional methods. As credit markets continue to evolve, machine learning can help improve these processes.

June 13, 2022 | CompatibL Technologies LLC

As credit markets continue to evolve, banks may take advantage of products that use machine learning: software that helps them anticipate risks more effectively. But should banks revise their credit risk management processes accordingly and employ these new solutions?

AI and machine learning for credit risk management

According to McKinsey, AI and machine learning technologies could add up to $1 trillion in additional value to global banking every year.

Financial institutions are using machine learning to make credit decisions more accurately and consistently while reducing risk, fraud, and costs. For example, Citi recently transformed its critical internal audit using machine learning, something that has contributed to high-quality credit decisions.

On the other hand, more complex and nuanced applications of these technologies have, until now, remained largely in the academic arena. Nowadays, though, quants and risk managers are bringing these technologies to real-world applications, paving the way to making their daily routines easier.

Artificial neural network model

Artificial neural networks are an effective tool for modelling and analysing complex systems. They have been used extensively in many scientific areas, such as pattern recognition, signal processing, forecasting and system control.

In recent years, the artificial neural network model for credit risk has attracted more and more attention from researchers due to the advantages bestowed by its non-linearity, parallel computing, high fault tolerance, and good generalization performance.

How does the artificial neural network model work?

Training the artificial neural network classifier requires the category label of the sample data to be known. This requires determining the actual credit rating of each company in the given year.

One solution to this problem is cluster analysis, in which all enterprises are clustered into several categories. Assuming that the credit risk of all enterprises is normally distributed, factor analysis is used to reduce the dimension of the data and to obtain a total factor score for each enterprise.

The actual credit risk grade of each category can then be determined according to how far the mean total factor score of that category deviates from the mean total factor score across all enterprises. After that, commonly used traditional credit risk prediction models are tested for accuracy.
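The labelling step can be illustrated with a minimal sketch. It assumes the total factor scores have already been computed, uses a plain one-dimensional k-means as a stand-in for whichever clustering method is actually applied, and assumes that a higher score means lower credit risk. The function names and the scores are hypothetical.

```python
import statistics

def kmeans_1d(scores, k, iterations=50):
    """Cluster scalar scores into k groups with plain 1-D k-means."""
    ordered = sorted(scores)
    # Spread the initial centres over the sorted scores (deterministic start).
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    assign = [0] * len(scores)
    for _ in range(iterations):
        # Assign every score to its nearest centre, then recompute the centres.
        assign = [min(range(k), key=lambda c: abs(s - centres[c])) for s in scores]
        for c in range(k):
            members = [s for s, a in zip(scores, assign) if a == c]
            if members:
                centres[c] = statistics.fmean(members)
    return assign, centres

def grade_clusters(scores, k=3):
    """Grade each cluster by how far its mean score sits from the overall mean."""
    assign, centres = kmeans_1d(scores, k)
    overall = statistics.fmean(scores)
    deviation = {c: centres[c] - overall for c in range(k)}
    # Assumption: a higher factor score means lower credit risk, so grade 1 is best.
    ranked = sorted(deviation, key=deviation.get, reverse=True)
    grade_of_cluster = {c: rank + 1 for rank, c in enumerate(ranked)}
    return [grade_of_cluster[a] for a in assign]

# Hypothetical total factor scores for nine companies.
factor_scores = [1.9, 2.1, 2.0, 0.1, -0.1, 0.0, -2.2, -1.8, -2.0]
grades = grade_clusters(factor_scores, k=3)
```

The resulting grades can then serve as the category labels that a supervised classifier needs for training.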

Finally, their predictive performance is compared with that of the artificial neural network model.

Because the perceptron neural network model predicts non-performing loans significantly more accurately, commercial banks can use it to make risk predictions for credit risk assessment, with good results.
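To make the idea concrete, here is a minimal sketch of the simplest member of this model family, a single-layer perceptron, trained on synthetic borrowers. It is not the model evaluated above: the two features (leverage and coverage), the default rule, and the margin that keeps the toy data linearly separable are all assumptions made for illustration.

```python
import random

def make_borrowers(n, rng):
    """Hypothetical borrowers: (leverage, coverage) in [0, 1], label 1 = default."""
    rows = []
    while len(rows) < n:
        leverage, coverage = rng.random(), rng.random()
        gap = leverage - coverage
        if abs(gap) < 0.2:  # keep a margin so the toy data is linearly separable
            continue
        rows.append(((leverage, coverage), 1 if gap > 0 else 0))
    return rows

def predict(weights, bias, x):
    """Return 1 (default) when the weighted score is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train_perceptron(rows, max_epochs=500):
    """Classic perceptron rule: adjust the weights only on misclassified rows."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in rows:
            error = y - predict(weights, bias, x)  # -1, 0 or +1
            if error:
                weights = [w + error * xi for w, xi in zip(weights, x)]
                bias += error
                mistakes += 1
        if mistakes == 0:  # a full pass without errors: training has converged
            break
    return weights, bias

def accuracy(weights, bias, rows):
    return sum(predict(weights, bias, x) == y for x, y in rows) / len(rows)

rng = random.Random(0)
train, test = make_borrowers(200, rng), make_borrowers(100, rng)
weights, bias = train_perceptron(train)
train_accuracy = accuracy(weights, bias, train)
test_accuracy = accuracy(weights, bias, test)  # the figure that matters in practice
```

Real credit data is rarely separable by a straight line, which is why multilayer networks with non-linear activations are used in practice. The comparison with traditional models should always be made on the held-out sample.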

Machine learning market generators

With pre-pandemic historical data no longer accurately representing current levels of risk, market generators’ ability to measure risk from a shorter time series is invaluable.

How do market generators work?

Risk models are calibrated to historical data. The longer a model’s time horizon, the longer the time series required to calibrate it.

With traditional risk models, the short length of pandemic-era time series data does not permit accurate model calibration. The time series for any given currency, stock, or credit name is too short to gain any statistical confidence in the estimate. Because market standard models for credit risk, limits, insurance reserves, and macro investing measure risk years ahead, they require a long time series that extends to pre-pandemic data that is no longer representative of the current level of risk.
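A rough back-of-the-envelope calculation shows why short histories are a problem. The sketch below counts the non-overlapping observations available at a given horizon and applies the large-sample approximation 1/sqrt(2n) for the relative standard error of a volatility estimate. It assumes independent, normally distributed returns, and the function names are hypothetical.

```python
import math

def independent_observations(history_months, horizon_months):
    """Number of non-overlapping observations at the given horizon."""
    return history_months // horizon_months

def relative_vol_error(n_obs):
    """Approximate relative standard error of a volatility estimate.

    Large-sample approximation 1 / sqrt(2 n) under i.i.d. normal returns;
    it is crude for very small n and is shown for illustration only.
    """
    return 1.0 / math.sqrt(2 * n_obs)

# Two years of pandemic-era history, measured at two different horizons.
monthly_obs = independent_observations(24, 1)   # 24 observations
annual_obs = independent_observations(24, 12)   # 2 observations
monthly_error = relative_vol_error(monthly_obs)  # about 0.14
annual_error = relative_vol_error(annual_obs)    # 0.5
```

With only two independent annual observations, the estimate carries an error of roughly half its own size, which is the statistical uncertainty the paragraph above describes.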

Market generators are machine learning algorithms that generate additional samples of market data when historical time series are too short, without relying on any preconceived notions about the data. They can generate data for the time horizons of 1 to 30 years that risk models require, making it possible to measure pandemic-era credit risk, limits, insurance reserves (economic scenario generation), and macro strategy performance accurately.

Using unsupervised machine learning, market generators rigorously aggregate statistical data from multiple currencies, stocks, or credit names and then generate data samples for each name. This makes it possible to reduce the inherent statistical uncertainty of the short time series while preserving the differences between the names and incorporating them into the model.
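The pooling idea can be sketched with a much simpler stand-in. The example below is not a market generator: it swaps the machine learning model for a pooled bootstrap, which standardises the returns of every name, pools them into one shared sample, and then rescales random draws by each name's own mean and volatility. The names and return values are hypothetical.

```python
import random
import statistics

def pooled_generator(history, n_samples, seed=0):
    """Generate synthetic returns per name from a pool shared by all names.

    history: dict mapping a name to its list of observed returns.
    Assumes every series has at least two distinct values.
    """
    rng = random.Random(seed)
    stats = {name: (statistics.fmean(r), statistics.pstdev(r))
             for name, r in history.items()}
    # Pool the standardised returns of all names into one shared sample.
    pool = [(x - stats[name][0]) / stats[name][1]
            for name, r in history.items() for x in r]
    # Rescale draws from the pool by each name's own mean and volatility,
    # which preserves the differences between the names.
    return {name: [mu + sigma * rng.choice(pool) for _ in range(n_samples)]
            for name, (mu, sigma) in stats.items()}

# Hypothetical short return histories for two currency pairs.
history = {
    "EURUSD": [0.010, -0.020, 0.005, 0.015],
    "GBPUSD": [0.030, -0.040, 0.010, -0.010],
}
synthetic = pooled_generator(history, n_samples=1000)
```

Each name draws on eight pooled observations instead of its own four, which illustrates how aggregation reduces statistical uncertainty. A real market generator learns the joint distribution rather than resampling it.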

This ability of the model was validated against out-of-sample data, the gold standard of model validation.

Eliminating the risks of AI and machine learning

According to McKinsey partner Derek Waldron, while artificial intelligence and advanced analytics offer significant opportunities for banks, these opportunities must be captured in a way that keeps risk management at the forefront of people’s minds. As in statistical modelling, it is important to focus on the following six areas when validating a machine learning model:

  • Interpretability
  • Bias
  • Feature engineering
  • Hyperparameter tuning
  • Production readiness
  • Dynamic model calibration

The risk of machine learning models being biased is real because the models can overfit the data if they are not treated properly. Overfitting occurs when a model appears to fit the data very well because it has been tuned to replicate that data closely. In reality, such a model will not stand the test of time: once it goes into production and is exposed to situations it has not seen before, its performance will deteriorate significantly.
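A small experiment makes the gap visible. The sketch below uses a one-nearest-neighbour classifier, chosen here only because it memorises its training data, on synthetic borrowers whose labels are partly random. The data-generating rule and the noise level are assumptions made for illustration.

```python
import random

def make_data(n, rng, noise=0.25):
    """Hypothetical borrowers with one risk score x in [0, 1].

    The true rule is "default if x > 0.5"; a fraction of the labels is
    flipped at random to mimic noise in real outcomes.
    """
    rows = []
    for _ in range(n):
        x = rng.random()
        y = 1 if x > 0.5 else 0
        if rng.random() < noise:
            y = 1 - y
        rows.append((x, y))
    return rows

def nearest_neighbour_predict(train, x):
    """Copy the label of the closest training point (pure memorisation)."""
    return min(train, key=lambda row: abs(row[0] - x))[1]

def accuracy(train, rows):
    return sum(nearest_neighbour_predict(train, x) == y for x, y in rows) / len(rows)

rng = random.Random(1)
train, test = make_data(200, rng), make_data(200, rng)
in_sample = accuracy(train, train)     # perfect: every point is its own neighbour
out_of_sample = accuracy(train, test)  # noticeably lower on unseen borrowers
```

The model reproduces the noise in its training sample exactly, so the in-sample figure says nothing about how it will behave in production.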

Another example is feature engineering. In statistical model development, a model developer would typically start with several hypotheses about the features that drive the predictive performance of the model. Those features can be suggested by subject matter or domain experts.

In artificial intelligence, the process is a bit different. The developer feeds a large amount of data into the AI algorithm and the model learns features that describe that data. The challenge is that the model can learn features that are quite counterintuitive and, in some cases, may be overfitting the data. The model validator therefore needs to be able to scrutinize the types of predictive variables that appear in the AI model and ensure that they are consistent with intuition and that they are, in fact, predictive of the output.
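One simple check a validator might run is sketched below. It tests whether a feature's correlation with the outcome has the sign that intuition expects, and a meaningful size, on both the training sample and a holdout sample. The threshold, the function names, and the tiny data sets are hypothetical, and this is only one of many possible checks.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; assumes neither series is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def feature_is_stable(train, holdout, expected_sign, min_abs_corr=0.1):
    """Pass only if the correlation has the expected sign and size on both samples.

    train, holdout: (feature_values, outcomes) pairs.
    expected_sign: +1 if intuition says the feature raises default risk, else -1.
    """
    for xs, ys in (train, holdout):
        if pearson(xs, ys) * expected_sign < min_abs_corr:
            return False
    return True

# Hypothetical data: leverage behaves as intuition suggests in both samples.
leverage_train = ([0.2, 0.3, 0.7, 0.9], [0, 0, 1, 1])
leverage_holdout = ([0.1, 0.4, 0.6, 0.8], [0, 0, 1, 1])
# A spurious feature: its relationship with default flips out of sample.
spurious_train = ([1, 2, 3, 4], [0, 0, 1, 1])
spurious_holdout = ([4, 3, 2, 1], [0, 0, 1, 1])

leverage_ok = feature_is_stable(leverage_train, leverage_holdout, expected_sign=1)
spurious_ok = feature_is_stable(spurious_train, spurious_holdout, expected_sign=1)
```

A feature that passes in-sample but fails on the holdout, like the spurious one here, is a candidate for removal or closer review.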

Ultimately, we believe machine learning will continue to play an important role in identifying patterns and trends that can help financial institutions thrive.