Patel, Yogesh (2019) Cross channel fraud detection framework in financial services using recurrent neural networks. Doctoral thesis, London Metropolitan University.
The reliability and performance of real-time fraud detection techniques have been a major concern for financial institutions, as traditional fraud detection models cannot cope with new and innovative attacks that deceive banks. The problems are further exacerbated by evolving customer behaviour, as existing fraud detection models are unable to cope with the class imbalance problem and longer feedback loops. This thesis takes a holistic view of fraud detection and proposes a conceptual fraud detection framework that can detect anomalous transactions quickly and accurately, and that can evolve dynamically to maintain its efficiency with minimum input from subject matter experts. The framework is used to analyse Internet Banking (IB) transactions and contextual information to reduce false positives and improve fraud detection rates. Based on the proposed framework, a Long Short-Term Memory (LSTM) based Recurrent Neural Network model for detecting fraud in remote banking is implemented, and its performance is evaluated against Support Vector Machine (SVM) and Markov models.
The main research element is to model events as state vectors so that sequence-based learning can be applied, followed by a weak classifier to deal with noise. Firstly, the study focuses on feature engineering where, alongside raw attributes such as IP address, amount and others, two novel features for remote banking fraud are evaluated: the time spent on a page and the time between page transitions. The second focus is on modelling, which is performed on an anonymised real-life dataset provided by a large financial institution in Europe. The results of the modelling demonstrate that, given the labelled dataset, all models can detect payment fraud with acceptable accuracy.
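As an illustration of the two timing features described above, the following minimal Python sketch derives them from a list of page visits in one session. It is not the thesis's implementation: the `PageVisit` record, its field names, and the definition of the transition time as the gap between leaving one page and entering the next are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PageVisit:
    """Hypothetical record of one page view within a banking session."""
    page: str
    entered_at: float  # seconds since session start (assumed unit)
    left_at: float     # seconds since session start (assumed unit)


def timing_features(visits: List[PageVisit]) -> List[dict]:
    """Return per-visit timing features, ordered by entry time.

    time_on_page   : how long the user stayed on the page
    transition_gap : delay between leaving this page and entering the
                     next one (None for the last visit in the session)
    """
    ordered = sorted(visits, key=lambda v: v.entered_at)
    rows = []
    for i, visit in enumerate(ordered):
        time_on_page = visit.left_at - visit.entered_at
        transition_gap: Optional[float] = None
        if i + 1 < len(ordered):
            transition_gap = ordered[i + 1].entered_at - visit.left_at
        rows.append({
            "page": visit.page,
            "time_on_page": time_on_page,
            "transition_gap": transition_gap,
        })
    return rows


# Made-up session data, for illustration only.
session = [
    PageVisit("login", 0.0, 4.0),
    PageVisit("add_payee", 5.0, 12.0),
]
features = timing_features(session)
```

Each resulting row could then be combined with the raw attributes (IP address, amount and so on) to form one element of the event sequence fed to a sequence-based learner.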
Tests showed that the LSTM model achieves an F1 score of 97.7%, whereas the SVM and Markov models achieve 93.5% and 95.0% respectively. Over time, as the sequences of events grew longer, the performance of the LSTM model improved significantly. As the dataset grows, the time it takes to train traditional models becomes a bottleneck. These results support the hypothesis that events across banking channels can be modelled as time series data, and that sequence-based learners such as a Recurrent Neural Network (RNN) can then be applied to reduce the False Positive Rate (FPR) and False Negative Rate (FNR).
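For readers unfamiliar with the metrics quoted above, the short Python sketch below shows the standard definitions of the F1 score, FPR and FNR in terms of confusion-matrix counts, with fraud treated as the positive class. The function name and the example counts are hypothetical and are not taken from the thesis.

```python
def fraud_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute F1, FPR and FNR from confusion-matrix counts.

    tp: fraud correctly flagged        fp: genuine wrongly flagged
    tn: genuine correctly passed       fn: fraud missed
    Ratios with a zero denominator are reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # False Positive Rate
    fnr = fn / (fn + tp) if (fn + tp) else 0.0  # False Negative Rate
    return {"f1": f1, "fpr": fpr, "fnr": fnr}


# Made-up counts, for illustration only.
metrics = fraud_metrics(tp=8, fp=2, tn=88, fn=2)
```

Because fraud is rare relative to genuine transactions, the F1 score is often preferred over plain accuracy in this setting: it ignores the large number of true negatives that would otherwise dominate the result.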
Available under License Creative Commons Attribution Non-commercial No Derivatives 4.0.