A fuzzy approach to identity resolution

Nawaz, Asif and Kazemian, Hassan (2021) A fuzzy approach to identity resolution. Proceedings of the International Neural Networks Society, 3 . Springer, Cham, Switzerland, pp. 307-318. ISBN 978-3-030-80568-5

A Fuzzy approach to identity resolution-Final Draft-27032021-EANN2021 final.pdf - Accepted Version

Official URL: https://link.springer.com/chapter/10.1007/978-3-03...

Abstract / Description

Identity resolution is crucial for law enforcement agencies globally, yet matching real-world identities in big data is difficult because of data inconsistencies such as typographical errors, naming variations, and abbreviations. This paper introduces a fuzzy approach to identity resolution that uses the Soundex and Jaro-Winkler distance algorithms in a cascaded manner to calculate an aggregate score for the full name, while the edit-distance algorithm is used to score the address and ethnicity description attributes. For this approach, the Soundex code has been modified to use digits only, with the code length increased to six digits, which allows the matching algorithm to overcome some of the limitations of Soundex in name matching. The approach accommodates three different variations of a name in an iterative search process that retrieves matched records based on the inputs. In an experiment searching for a suspect in two different cases, the initial search retrieved 173 and 52 records for the respective target suspects. These records were grouped using the Mean-Shift clustering technique based on the similarity scores of three attributes. For further analysis, the segmentation process matched 16 and 22 records for the two cases respectively, and graph analysis identified the target suspect among the other matched identities through link associations with different addresses. The overall matching performance of this fuzzy approach is encouraging: it can help law enforcement agencies speed up investigations and, most importantly, identify a suspect even when minimal information is available.
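
To make the name-scoring idea concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not specify how the Soundex code was modified or how the cascade combines the two algorithms, so everything here is an assumption made for illustration: `numeric_soundex` encodes the first letter as a digit too and pads to six digits, and the hypothetical `full_name_score` uses the Soundex code as a gate before averaging Jaro-Winkler similarities over the name parts.

```python
# Standard Soundex consonant groups; vowels, y, h and w have no digit.
SOUNDEX_DIGITS = {
    **dict.fromkeys("bfpv", "1"),
    **dict.fromkeys("cgjkqsxz", "2"),
    **dict.fromkeys("dt", "3"),
    "l": "4",
    **dict.fromkeys("mn", "5"),
    "r": "6",
}


def numeric_soundex(name, length=6):
    """Digits-only Soundex variant (assumed form), zero-padded to `length`."""
    digits = []
    previous = ""
    for ch in name.lower():
        if not ch.isalpha():
            continue
        code = SOUNDEX_DIGITS.get(ch, "")
        if code and code != previous:
            digits.append(code)
        # h and w do not separate repeated codes; vowels do.
        if ch not in "hw":
            previous = code
    return "".join(digits)[:length].ljust(length, "0")


def jaro(s1, s2):
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro-Winkler similarity: Jaro boosted by a common prefix (max 4)."""
    score = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1 - score)


def full_name_score(query, candidate):
    """Hypothetical cascade: Soundex gate, then Jaro-Winkler, averaged."""
    q_parts = query.lower().split()
    c_parts = candidate.lower().split()
    if not q_parts or not c_parts:
        return 0.0
    total = 0.0
    for q, c in zip(q_parts, c_parts):
        if numeric_soundex(q) == numeric_soundex(c):
            total += jaro_winkler(q, c)
    return total / max(len(q_parts), len(c_parts))
```

Under these assumptions, `full_name_score("Robert Smith", "Rupert Smyth")` gives a high score because both name parts share a numeric Soundex code, whereas a pair with no shared codes scores 0.0. A gate this strict would reject typographical errors that change the Soundex code, so a real system might use a softer first stage.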

Item Type: Book
Additional Information: Proceedings of the 22nd Engineering Applications of Neural Networks Conference, pp 307–318
Uncontrolled Keywords: Fuzzy string matching, Identity resolution, Graph analysis, Soundex, Jaro-Winkler
Subjects: 000 Computer science, information & general works > 020 Library & information sciences
600 Technology
600 Technology > 620 Engineering & allied operations
Department: School of Computing and Digital Media
Depositing User: Hassan Kazemian
Date Deposited: 13 Dec 2022 09:47
Last Modified: 01 Jul 2023 01:58
URI: https://repository.londonmet.ac.uk/id/eprint/8080

