Credit scores: a 20th-century solution to a 21st-century problem
Your credit score is one of the most important metrics in your financial life. Credit scores like FICO exist to allow banks to avoid bad loans. Yet, they are remarkably poor at predicting loan risk. Worse, they often penalize the best risks and create a perverse incentive to take out too many loans. Here, we look at how AI can produce far more accurate loan risk scores, allowing financial institutions to make genuinely informed choices about who to lend their money to.
Almost every country in the world has some form of credit scoring. Banks use credit scores to determine the risk in lending to a given individual. In the US, the primary credit score is FICO; in Germany, it is SCHUFA; and in the UK, several credit reference agencies compete to offer scores.
Most of these scores work in a similar fashion. The company keeps a history of every loan and mortgage you have taken out, every credit or store card you own, and every credit application you have made. They then calculate a score based on a weighting of these factors. To see how this works, let’s take FICO as an example.
The FICO score
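FICO scores come from Fair, Isaac and Company, which was founded in 1956. The general-purpose FICO score used today was introduced in 1989. It is commonly described as a weighted combination of 5 different factors, which results in a numerical creditworthiness score.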
Payment history (35%): This covers your history of paying back what you owe. Things that negatively affect this element include late payments, loans that have been written off, bankruptcies, repossessions, foreclosures, and judgments.
Debt burden (30%): This looks at how much you owe and to how many lenders. FICO uses 6 different metrics to make up this part of the score. These include the total amount owed, the number of accounts with outstanding balances, and the credit utilization ratio, which measures how close you are to using all your available credit.
Length of credit history (15%): FICO looks at two metrics here: the age of your oldest credit account and the average age of all your accounts.
Credit mix (10%): This takes into account all the different forms of credit you hold. These include credit cards, retail accounts, installment loans, finance company accounts, and mortgage loans.
New credit (10%): Your recent history of credit searches can materially affect your score. Only hard pulls have a direct impact. These happen when a company does a formal credit search on you to decide whether to lend you money.
Your final FICO score is a number between 300 and 850. If you score 850 you have a perfect score, but anything under 580 is considered poor.
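To make the weighting concrete, here is a minimal sketch of how five factor scores could be blended into a number between 300 and 850. It is an illustration only. The real FICO formula is proprietary, and the function name, the 0-to-1 sub-scores, and the linear mapping onto the 300 to 850 range are all our own assumptions.

```python
# Illustrative only: the real FICO formula is proprietary.
# Weights are the published category percentages from the list above.
WEIGHTS_PERCENT = {
    "payment_history": 35,
    "debt_burden": 30,
    "length_of_history": 15,
    "credit_mix": 10,
    "new_credit": 10,
}
SCORE_MIN, SCORE_MAX = 300, 850


def illustrative_score(factors):
    """Blend per-factor sub-scores (each assumed to lie in [0, 1]) into 300-850.

    The sub-scores and the linear mapping are hypothetical simplifications.
    """
    for name in WEIGHTS_PERCENT:
        value = factors[name]  # raises KeyError if a factor is missing
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    blended = sum(WEIGHTS_PERCENT[n] * factors[n] for n in WEIGHTS_PERCENT) / 100
    return round(SCORE_MIN + blended * (SCORE_MAX - SCORE_MIN))


# A long-standing borrower versus someone who has never needed credit.
established = {name: 1.0 for name in WEIGHTS_PERCENT}
thin_file = dict(established, length_of_history=0.0, credit_mix=0.0)

print(illustrative_score(established))
print(illustrative_score(thin_file))
```

Even in this simplified form, a borrower with a spotless record but no credit history gives up a quarter of the available range, because length of history and credit mix together carry 25% of the weight.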
The problems with credit scores
There are a number of real problems with credit scores as a measure of creditworthiness. The 2008 crash threw these problems into sharp focus. And the ongoing pandemic once again highlights why your FICO score isn’t a good measure of your risk of defaulting.
The first real problem with a credit score is that it creates a perverse incentive to take out unnecessary credit. If you want to get a mortgage, you need to have a good credit score. But you can only achieve this by taking out numerous forms of credit. Often, people end up taking out credit cards, store cards, and short-term bank loans simply to build up their credit score.
Gaming the system
Related to these perverse incentives are ways that consumers game the system. Numerous people have written about how you can improve your score and get a “perfect 850”. You achieve this by applying for new or increased credit, spending the right amount, and monitoring your score constantly.
Failure to measure the correct risk
Credit scores are based on the view that someone who regularly services many sources of credit has proved they are at low risk of defaulting. By contrast, someone in a well-paid and secure job who has had no need of credit is automatically penalized. In effect, credit scores say that only people who have needed to borrow money are good risks to lend to.
Lack of macroeconomic view
The final issue is that credit scores take no account of macroeconomics. During any downturn, previously good credit risks end up defaulting. In turn, these defaults play into their future credit score, even during subsequent economic booms.
Artificial intelligence loan scoring
Credit scores are a twentieth-century solution to measuring creditworthiness. If you need to calculate scores manually, then you are forced to use a simple metric like FICO. But we live in the twenty-first century, where big data is king and we have access to enormous computing power. So, we should turn to machine learning and build AI models to measure the actual risk of bad loans.
Here at Sonasoft, we use NuGene, our universal AI platform, to create bots that forecast loan defaults. NuGene takes the history of millions of loans along with macroeconomic factors, such as the interest rate and unemployment level. NuGene is able to take these factors and create a complex model that predicts which of the loans defaulted. It does this using a combination of ML approaches. Unsupervised learning allows it to look for unexpected correlations and patterns in the data. These correlations are then tested for causation. This data is then combined with supervised learning using the history of which loans defaulted. This allows NuGene to create and test the loan default model. Finally, the complete model allows you to forecast whether a new loan will default.
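The supervised step described above can be sketched with a very small model. The example below is not NuGene's implementation. It is a plain logistic regression trained by gradient descent on synthetic data, and every name in it (the two features, the coefficients used to generate the data, the function names) is a hypothetical choice made for illustration. It simply shows how a borrower-level factor and a macroeconomic factor can feed the same default forecast.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_synthetic_loans(n, seed=0):
    """Generate fake loans: (debt_to_income, unemployment_level) -> defaulted.

    Both features are scaled to [0, 1]. The generating coefficients are
    invented for this sketch and carry no real-world meaning.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        debt_to_income = rng.random()
        unemployment = rng.random()
        p_default = sigmoid(-3.0 + 4.0 * debt_to_income + 3.0 * unemployment)
        defaulted = 1 if rng.random() < p_default else 0
        rows.append(((debt_to_income, unemployment), defaulted))
    return rows


def train_logistic(rows, epochs=300, lr=0.5):
    """Fit logistic regression with batch gradient descent on the log loss."""
    n_features = len(rows[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in rows:
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            grad_b += error
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
        bias -= lr * grad_b / len(rows)
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
    return weights, bias


def predict_default_probability(model, x):
    weights, bias = model
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


model = train_logistic(make_synthetic_loans(500))

# Same kind of borrower, scored in a calm economy and in a downturn.
print(predict_default_probability(model, (0.4, 0.1)))
print(predict_default_probability(model, (0.4, 0.9)))
```

Because the unemployment level is an input, the same borrower receives a different forecast in a downturn than in a boom, which is exactly the macroeconomic view that a traditional credit score lacks.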
Problems with AI bias
Many other companies have tried to use AI to do credit scoring. Frequently, these systems have hit significant problems. Infamously, in 2019 the Apple-branded credit card issued by Goldman Sachs received poor PR over alleged bias in its credit decisions. Critics argued that the underlying models were biased. Customers reported that women were being offered lower credit limits than their partners, even when they had completely joint accounts.
AI bias is not a new problem. In a 2018 test by the ACLU, facial recognition software incorrectly matched 28 members of Congress with criminal mugshots. People of color were far more likely to be incorrectly matched. This kind of error is widely attributed to bias in the data used to train facial recognition algorithms.
Why Sonasoft NuGene is different
We have designed NuGene to avoid bias in models. It does this in several ways. Firstly, it takes a variety of different data, both numerical and time series. This allows it to take account of every factor that might affect a loan. Secondly, data scientists often add unintentional bias when they preprocess and clean the data. You need to take these steps to make problems tractable for creating ML models by hand. But NuGene uses unsupervised learning to teach itself about the complete data set. Thirdly, when NuGene identifies any potential patterns, it tests these for causality. This allows it to check the relative impact of every factor on whether a loan defaults.
Sometimes, you know there is a potential bias in the source data. For instance, if far fewer women apply for loans, you know that women will be under-represented in the data, and a model trained on it may be biased against them. NuGene can then use generative adversarial networks to create new data that counteracts the bias. The eventual result is a model that is extremely accurate, and that avoids bias.
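To show the idea of rebalancing in its simplest form, the sketch below uses random oversampling rather than a generative adversarial network. It is a deliberately simpler stand-in and not how NuGene works. A GAN would synthesize new, realistic records, whereas this example only duplicates existing records from the under-represented group. The field name "gender" and the function name are hypothetical.

```python
import random
from collections import Counter


def oversample_minority(rows, group_key, seed=0):
    """Duplicate randomly chosen rows from smaller groups until every group
    is as large as the largest one. A stand-in for GAN-based data synthesis."""
    rng = random.Random(seed)
    by_group = {}
    for row in rows:
        by_group.setdefault(row[group_key], []).append(row)
    target = max(len(members) for members in by_group.values())
    balanced = []
    for members in by_group.values():
        balanced.extend(members)
        # Top up the group with random repeats of its own rows.
        balanced.extend(rng.choice(members) for _ in range(target - len(members)))
    return balanced


# Hypothetical application data: 8 applications from men, 2 from women.
applications = (
    [{"gender": "M", "defaulted": 0} for _ in range(8)]
    + [{"gender": "F", "defaulted": 0} for _ in range(2)]
)

balanced = oversample_minority(applications, "gender")
print(Counter(row["gender"] for row in balanced))
```

After rebalancing, both groups contribute the same number of records to training, so the model is no longer dominated by the larger group.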
TRY IT FOR FREE
We are confident that our products will exceed your expectations. We want you to try them for free.