Modified Isolation Forest Algorithm for Credit Card Fraud Detection

Krizzia Ydel M. Merino, Ma. Alexandra M. Ong, Raymund M. Dioses, Vivien A. Agustin, Ariel Antwaun Rolando C. Sison

TTACA. 2025 March; 4(1): 26-29. Published online 2025 March

doi.org/10.36647/TTACA/04.01.A001

Abstract: The Isolation Forest algorithm is an isolation-based method for anomaly detection. It suffers from swamping, the misclassification of normal data points as anomalies, which reduces its accuracy and effectiveness. The algorithm is especially prone to this misclassification when it is applied to a large, imbalanced dataset. The Modified Isolation Forest addresses the problem of false positives, or swamping, by incorporating an undersampling method called Near Miss: balancing the dataset before detection helps reduce false positives and improves the overall performance of the algorithm. This study uses a dataset of transactions made by European cardholders, which contains 492 fraudulent transactions among 284,807 transactions. Both the original algorithm and the modified algorithm are tested for anomaly detection on this dataset. The original algorithm produced 158 True Positives (TP), 235,619 True Negatives (TN), 48,696 False Positives (FP), and 334 False Negatives (FN). The modified algorithm produced 395 True Positives (TP), 391 True Negatives (TN), 101 False Positives (FP), and 97 False Negatives (FN). The modified algorithm performs significantly better than the original. With an accuracy of 0.79878 (79.88%), a precision of 0.79637 (79.64%), a recall of 0.80285 (80.29%), and an F1-score of 0.79960 (approximately 0.80), the Modified Isolation Forest algorithm addresses the issue of false positives, or swamping.
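The reported metrics follow directly from the confusion-matrix counts given in the abstract. The short Python sketch below recomputes them using the standard definitions of accuracy, precision, recall, and F1-score. The function name `confusion_metrics` and the variable names are illustrative assumptions, not taken from the paper, and the sketch is not the authors' evaluation code.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Compute standard classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration; fraud is treated as the positive class.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "total": total,
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Counts as reported in the abstract.
original = confusion_metrics(tp=158, tn=235619, fp=48696, fn=334)
modified = confusion_metrics(tp=395, tn=391, fp=101, fn=97)

for name, result in (("original", original), ("modified", modified)):
    print(name, {key: round(value, 5) for key, value in result.items()})
```

The counts for the modified algorithm sum to 984 transactions, split evenly into 492 fraudulent (TP + FN) and 492 legitimate (TN + FP), which is what one would expect if the evaluation was run on the undersampled, balanced set. The counts for the original algorithm sum to the full 284,807 transactions, so the two sets of metrics appear to be computed on datasets of different size and class balance.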

Keywords: Anomaly, Fraud Detection, Isolation Forest, NearMiss, Tree.
