Advancing cyber incident timeline analysis through retrieval-augmented generation and large language models

Loumachi, Fatma Yasmine; Ghanem, Mohamed Chahine; Ferrag, Mohamed Amine

London Met Repository


Loumachi, Fatma Yasmine, Ghanem, Mohamed Chahine and Ferrag, Mohamed Amine (2025) Advancing cyber incident timeline analysis through retrieval-augmented generation and large language models. Computers, 14 (2) (67). pp. 1-42. ISSN 2073-431X


Abstract

Cyber timeline analysis, also called forensic timeline analysis, is critical in digital forensics and incident response (DFIR) investigations. It involves examining artefacts and events, particularly their timestamps and associated metadata, to detect anomalies, establish correlations, and reconstruct a detailed sequence of the incident. Traditional approaches rely on processing structured artefacts, such as logs and filesystem metadata, using multiple specialised tools for evidence identification, feature extraction, and timeline reconstruction.

This paper introduces GenDFIR, an innovative, context-specific framework powered by large language model (LLM) capabilities. Specifically, it proposes using Llama 3.1 8B in a zero-shot setting, selected for its ability to understand cyber threat nuances, integrated with a retrieval-augmented generation (RAG) agent.

Our approach comprises two main stages:

(1) Data preprocessing and structuring: incident events, represented as textual data, are transformed into a well-structured document, forming a comprehensive knowledge base of the incident.

(2) Context retrieval and semantic enrichment: a RAG agent retrieves relevant incident events from the knowledge base based on user prompts. The LLM processes the pertinent retrieved context, enabling a detailed interpretation and semantic enhancement.

The proposed framework was tested on synthetic cyber incident events in a controlled environment. Results were assessed using DFIR-tailored, context-specific metrics designed to evaluate the framework’s performance, reliability, and robustness, supported by human evaluation to validate the accuracy and reliability of the outcomes.

Our findings demonstrate the practical power of LLMs in advancing the automation of cyber-incident timeline analysis, a subfield within DFIR. This research also highlights the potential of generative AI, particularly LLMs, and opens new possibilities for advanced threat detection and incident reconstruction.
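To make the two stages described above more concrete, the following is a minimal sketch, not the GenDFIR implementation. It assumes incident events arrive as dictionaries with hypothetical `timestamp`, `source`, and `message` fields, and it swaps in a toy bag-of-words cosine similarity where a real RAG agent would typically use an embedding model. The function names (`build_knowledge_base`, `retrieve`, `build_prompt`) and the sample events are illustrative assumptions; the final prompt would be passed to an LLM such as Llama 3.1 8B, a step that is omitted here.

```python
import math
import re
from collections import Counter


def build_knowledge_base(raw_events):
    """Stage 1 (sketch): turn textual events into uniform, time-ordered records."""
    # ISO 8601 timestamps sort correctly as plain strings.
    ordered = sorted(raw_events, key=lambda e: e["timestamp"])
    return [
        f"[{e['timestamp']}] source={e['source']} event={e['message']}"
        for e in ordered
    ]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(knowledge_base, query, k=2):
    """Stage 2 (sketch): return up to k records most similar to the query.

    Assumption: bag-of-words similarity stands in for embedding-based retrieval.
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(doc))), doc) for doc in knowledge_base]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, context):
    """Assemble the zero-shot prompt that would be sent to the LLM."""
    lines = "\n".join(context)
    return (
        f"Incident events:\n{lines}\n\n"
        f"Question: {query}\n"
        "Answer using only the events above."
    )


# Hypothetical synthetic events, for illustration only.
events = [
    {"timestamp": "2024-05-01T09:14:02", "source": "auth",
     "message": "failed login for admin from 10.0.0.7"},
    {"timestamp": "2024-05-01T09:13:40", "source": "firewall",
     "message": "port scan detected from 10.0.0.7"},
    {"timestamp": "2024-05-01T09:20:11", "source": "filesystem",
     "message": "backup archive deleted by service account"},
]

kb = build_knowledge_base(events)
query = "failed login attempts for admin"
context = retrieve(kb, query)
prompt = build_prompt(query, context)
print(prompt)
```

Keeping the knowledge base as time-ordered, uniformly formatted records means the retrieved context already carries the timestamps the LLM needs to reason about event order. Restricting the prompt to retrieved events is one common way to keep the model's answer grounded in the incident data.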

Documents