Comparative study of data pre-processing techniques for enhancing fake review detection: a novel pipeline approach

Quyyam, Tayybaha and Yu, Qicheng (2025) Comparative study of data pre-processing techniques for enhancing fake review detection: a novel pipeline approach. In: 13th International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA-2025), 6-7 June 2025, London Metropolitan University, London (UK) / Online. (In Press)

Abstract

The rise of online reviews has significantly influenced consumer purchasing decisions, but it has also led to an increase in fraudulent reviews that can artificially boost or tarnish a business's reputation, with small businesses particularly affected. To combat this problem, we propose a novel pipeline for fake review detection that integrates text data, handcrafted features, and rating-category features to enhance robustness. The pipeline includes components such as a Context Aware Preprocessor, Fake Review Feature Adder, Category Rating Embedding, Tokenizer Padding, Adaptive Fusion Layer, and Trust Aware Berta Classifier. Applied to the Amazon dataset, our BERT model with the Adaptive Fusion Layer achieves an AUC-ROC score of 0.96, demonstrating its effectiveness in detecting fake reviews. This research underscores the potential of advanced NLP techniques to maintain the authenticity of online reviews, thereby protecting both businesses and consumers from the harm such reviews cause.
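
For readers unfamiliar with the AUC-ROC metric reported above, the following is a minimal sketch of how such a score can be computed from classifier outputs. It is not the authors' evaluation code: the function name `auc_roc` and the toy labels and scores are hypothetical, chosen only for illustration. The sketch uses the pairwise definition of AUC-ROC, the probability that a randomly chosen fake review receives a higher score than a randomly chosen genuine one.

```python
def auc_roc(labels, scores):
    """Pairwise AUC-ROC: the fraction of (fake, genuine) pairs in which the
    fake review (label 1) is scored higher than the genuine review (label 0).
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one example of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the paper): 1 = fake, 0 = genuine.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1]
print(auc_roc(labels, scores))
```

A score of 1.0 means every fake review is ranked above every genuine one, while 0.5 corresponds to random ranking. The quadratic pair loop is used here for clarity; library implementations typically use a sorting-based method instead.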

Documents
Comparative Study of Data Pre-Processing Techniques for Enhancing Fake Review Detection.pdf - Accepted Version
Restricted to Repository staff only until 1 May 2026.
