PREDICTING CREDIT CARD CUSTOMER BEHAVIOR USING MACHINE LEARNING-BASED MODELING

Banks today have vast databases that can be mined to analyze performance and even predict customer behavior. A Zayed University-led research team has developed a model that leverages machine learning and credit card customer data to predict more accurately how quickly a customer will repay the amount on their credit card.

Managing risk is crucial to banks. They must quickly identify which customers are a bad risk and need to be cut off from taking on more debt, and which customers can continue to be granted credit. As credit cards represent a major proportion of bank business – and one against which no assets are secured – they present a liability that the bank must manage by careful identification of risk as either good or bad.

“Banks need to accurately and quickly predict consumer credit card default, and the best way to do that is to use advanced systems that automatically score customer behavior on credit card repayments. We believe behavioral scoring models that leverage the power of machine learning can better enable banks to make risk decisions and financial security decisions to reduce their losses,” explained Dr. Maher Ala’raj, Assistant Professor in the Department of Information Systems in the College of Technological Innovation (CTI) at Zayed University (ZU).

The team, which also comprised Dr. Maysam Abbod, Reader at the Department of Electronic and Computer Engineering at Brunel University London, and Dr. Munir Majdalawieh, an Associate Professor at the Department of Information Systems at the CTI ZU, developed a model for credit card customer repayment probability using behavioral scoring.

Behavioral scoring is a common method of analysis for predicting a customer’s likelihood of defaulting during a specific period. The strength of the prediction is often impaired by subjective parameter choices, such as the outcome period and the performance period, while the vast volume of credit card transaction data makes it difficult to apply traditional mathematical and statistical models for behavioral scoring.

“To construct behavioral scoring models, professionals must think about a few significant issues, such as the extensiveness of the dataset to model, the planning horizon, and drivers of unwanted behavior. The literature does not contain solid suggestions on the most proficient method to respond to these questions,” the researchers wrote in their paper on the topic that was recently published in the Journal of Big Data.

They developed a Long Short-Term Memory (LSTM) model to enable automated credit card behavior scoring for bank customers. An LSTM is a recurrent neural network architecture used in deep learning to classify, process, and make predictions from sequential data such as time series. The team’s proposed framework consisted of several steps. First, the dataset was pre-processed and formatted for a bidirectional LSTM classifier, which reads each data sequence both from front to back and from back to front. Next, fivefold cross-validation was applied to obtain a prediction for every customer in the dataset. Finally, performance measures were calculated for different groups of customers of financial interest to banks, such as those with an unsatisfactory repayment history.
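
The fivefold step can be illustrated with a minimal Python sketch. It is not the team’s code: the function names are hypothetical, and a trivial base-rate predictor stands in for the bidirectional LSTM, which is assumed to be supplied through the `fit` and `predict` arguments. The point is only to show how every record receives a prediction from a model that never saw it during training.

```python
import random

def kfold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle record indices and split them into n_folds near-equal test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]

def out_of_fold_predictions(features, labels, fit, predict, n_folds=5):
    """Return one prediction per record, each made by a model trained without that record."""
    predictions = [None] * len(labels)
    for test_idx in kfold_indices(len(labels), n_folds):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        model = fit([features[i] for i in train_idx], [labels[i] for i in train_idx])
        for i in test_idx:
            predictions[i] = predict(model, features[i])
    return predictions

# Hypothetical stand-in model (NOT an LSTM): predicts the training default rate for everyone.
def fit_base_rate(train_features, train_labels):
    return sum(train_labels) / len(train_labels)

def predict_base_rate(model, feature_row):
    return model

# Toy data for illustration only: 1 = missed payment, 0 = paid on time.
toy_features = [[i] for i in range(10)]
toy_labels = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
oof = out_of_fold_predictions(toy_features, toy_labels, fit_base_rate, predict_base_rate)
```

Because the five test folds do not overlap and together cover the whole dataset, the resulting list holds exactly one held-out prediction per customer, which is what allows performance to be measured on any customer subgroup afterwards.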

To test the accuracy of their proposed bidirectional LSTM system, the team analyzed a public non-transactional credit card dataset and benchmarked the results against five standard classifier models. The dataset covered Taiwanese credit card customers and included 30,000 records, with 23,364 non-default payments and 6,636 default payments. The benchmark classifiers were Gradient Boosting, Bagging Neural Network, Support Vector Machines, Random Forest, and Logistic Regression.
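
The class counts above imply a clearly imbalanced dataset. The short Python calculation below uses only the figures quoted in this article; the variable names are illustrative. It shows why plain accuracy can be misleading here: a model that always predicts "no default" would already be right about 78% of the time while catching no defaults at all.

```python
# Record counts as quoted in the article.
n_records = 30_000
n_non_default = 23_364
n_default = 6_636

# Sanity check: the two classes account for every record.
assert n_non_default + n_default == n_records

# Share of records that are defaults.
default_rate = n_default / n_records

# Accuracy of a naive model that always predicts "no default".
majority_baseline_accuracy = n_non_default / n_records

print(f"default rate: {default_rate:.2%}")                              # 22.12%
print(f"always-'no default' accuracy: {majority_baseline_accuracy:.2%}")  # 77.88%
```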

Left to right: Dr. Maher Ala’raj and Dr. Munir Majdalawieh

The proposed bidirectional LSTM model and the five classifier models were then used to analyze the Taiwanese dataset, looking at four subsets of users to determine the models’ sensitivity, specificity, accuracy, balanced accuracy, and Brier score.

Sensitivity is the proportion of missed payments that the model correctly identifies. Specificity is the proportion of on-time payments that the model correctly identifies. Accuracy is the overall proportion of correct predictions and is the simplest measure of a model’s performance. Balanced accuracy, the average of sensitivity and specificity, is used to assess the quality of the model when the dataset classes are imbalanced. The Brier score measures how close the model’s predicted probabilities of a missed payment are to the actual outcomes; the lower the Brier score, the better the model’s performance.
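
For readers who want to see how these five measures are computed, here is a minimal Python sketch using their standard textbook definitions. The function name, the 0.5 decision threshold, and the toy data are assumptions made for illustration; they are not taken from the paper, and the sketch assumes both classes appear in the data.

```python
def behavior_scoring_metrics(y_true, y_prob, threshold=0.5):
    """Compute the five measures from labels and predicted probabilities.

    y_true: 1 = missed payment, 0 = paid on time.
    y_prob: predicted probability of a missed payment.
    threshold: assumed cut-off for turning a probability into a yes/no prediction.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    sensitivity = tp / (tp + fn)   # share of missed payments that were caught
    specificity = tn / (tn + fp)   # share of on-time payments correctly cleared
    accuracy = (tp + tn) / len(y_true)
    balanced_accuracy = (sensitivity + specificity) / 2
    # Brier score: mean squared gap between predicted probability and outcome.
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "brier": brier,
    }

# Toy example (invented numbers): 2 missed payments among 8 customers.
toy_true = [1, 1, 0, 0, 0, 0, 0, 0]
toy_prob = [0.9, 0.3, 0.2, 0.1, 0.6, 0.2, 0.1, 0.4]
toy_metrics = behavior_scoring_metrics(toy_true, toy_prob)
```

In the toy example, accuracy is 75% even though only one of the two missed payments is caught, while balanced accuracy is about 67%, which illustrates why the latter is preferred on imbalanced data.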

The results showed that the team’s proposed model was the most sensitive, at 37.51%. It was the fourth most specific, at 95.15%; however, the three classifier models with higher specificity had much lower sensitivity. The proposed model also had the third-highest accuracy. Overall, the team’s proposed bidirectional LSTM had the highest balanced accuracy and the lowest Brier score, indicating the overall quality of the model.

“Our research emphasizes the importance of credit card scoring for assessing and decreasing bank losses. By conducting a detailed comparison procedure, we have proven that machine learning models such as LSTM can provide the highest accuracy in predicting late fees and missed payments,” Dr. Ala’raj said.

Given the volume of credit card customers and the potential risk they pose to banks from lost repayments, the research team asserted that the modest accuracy gain of their classifier model could lead to major savings for financial institutions.

They assert that banks could use their novel classifier model not only for its binary output to determine whether a customer will miss a payment in the next month, but also to score each client. The scores provided by the model can be used to group customers into appropriate risk groups, so the bank can offer the corresponding service and security depending on their risk. This can enable banks to efficiently assess financial risks and make financial decisions.

“Our results show that, compared with benchmark models, the LSTM neural network has significantly improved consumer credit scoring. It is up to the bank to set up the thresholds above which they would move a customer into the high- or medium-risk group with corresponding consequences to the customer, such as decreasing their credit card limit or blocking their card. Moreover, such scores can be used as missing payment probabilities, so bank management can calculate the potential losses from each customer and even whole credit portfolios,” Dr. Ala’raj said.
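
The two uses of the scores described here, risk grouping and loss estimation, can be sketched in a few lines of Python. Everything specific in the sketch is hypothetical: the 0.3 and 0.6 thresholds, the customer records, and the simplifying assumption that a missed payment puts the full outstanding balance at risk. A bank would set its own thresholds and loss assumptions.

```python
def risk_group(score, medium_threshold=0.3, high_threshold=0.6):
    """Map a missed-payment score to a risk group using bank-chosen (here hypothetical) thresholds."""
    if score >= high_threshold:
        return "high"
    if score >= medium_threshold:
        return "medium"
    return "low"

def expected_loss(customers):
    """Sum of score x balance, treating each score as a missed-payment probability.

    Simplifying assumption: the whole outstanding balance is lost if the payment is missed.
    """
    return sum(c["score"] * c["balance"] for c in customers)

# Invented portfolio for illustration only.
portfolio = [
    {"id": "A", "score": 0.05, "balance": 1000.0},
    {"id": "B", "score": 0.40, "balance": 2500.0},
    {"id": "C", "score": 0.75, "balance": 4000.0},
]

groups = {c["id"]: risk_group(c["score"]) for c in portfolio}
portfolio_loss = expected_loss(portfolio)
```

With these invented numbers, customers A, B, and C fall into the low-, medium-, and high-risk groups respectively, and the expected portfolio loss comes to about 4,050 in whatever currency the balances are held.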

Going forward, the team will be looking to validate their bidirectional LSTM model on other real-world banking datasets, particularly from the UAE. They hope to further prove the efficiency of their model and its ability to analyze different behaviors from credit card customers. They will also be working to extend their model to customer credit scoring for different types of loan products.

Title of published paper: Modelling customers credit card behavior using bidirectional LSTM neural networks

Published in: Journal of Big Data

Journal metrics: Impact Factor: 11.09, Q1, H-index: 35, Scientific Journal Ranking (SJR): 1.03

Project funded by: Zayed University Office of Research, Grant Number R20053