National Cyber Warfare Foundation (NCWF)

Session 14C: Vulnerability Detection

Authors, Creators & Presenters: Jie Lin (University of Central Florida), David Mohaisen (University of Central Florida)

PAPER

From Large to Mammoth: A Comparative Evaluation of Large Language Models in Vulnerability Detection

Large Language Models (LLMs) have demonstrated strong potential in tasks such as code understanding and generation. This study evaluates several advanced LLMs--such as LLaMA-2, CodeLLaMA, LLaMA-3, Mistral, Mixtral, Gemma, CodeGemma, Phi-2, Phi-3, and GPT-4--for vulnerability detection, primarily in Java, with additional tests in C/C++ to assess generalization. We transition from basic positive sample detection to a more challenging task involving both positive and negative samples and evaluate the LLMs' ability to identify specific vulnerability types. Performance is analyzed using runtime and detection accuracy in zero-shot and few-shot settings with custom and generic metrics. Key insights include the strong performance of models like Gemma and LLaMA-2 in identifying vulnerabilities, though this success varies, with some configurations performing no better than random guessing. Performance also fluctuates significantly across programming languages and learning modes (zero- vs. few-shot). We further investigate the impact of model parameters, quantization methods, context window (CW) sizes, and architectural choices on vulnerability detection. While CW consistently enhances performance, benefits from other parameters, such as quantization, are more limited. Overall, our findings underscore the potential of LLMs in automated vulnerability detection, the complex interplay of model parameters, and the current limitations in varied scenarios and configurations.

ABOUT NDSS

The Network and Distributed System Security Symposium (NDSS) fosters information exchange among researchers and practitioners of network and distributed system security. The target audience includes those interested in practical aspects of network and distributed system security, with a focus on actual system design and implementation. A major goal is to encourage and enable the Internet community to apply, deploy, and advance the state of available security technologies.

Our thanks to the Network and Distributed System Security (NDSS) Symposium for publishing their Creators, Authors and Presenter’s superb NDSS Symposium 2025 Conference content on the Organizations' YouTube Channel.

Permalink

The post NDSS 2025 – A Comparative Evaluation Of Large Language Models In Vulnerability Detection appeared first on Security Boulevard.

Marc Handelman

Source: Security Boulevard
Source Link: https://securityboulevard.com/2026/03/ndss-2025-a-comparative-evaluation-of-large-language-models-in-vulnerability-detection/

NDSS 2025 – A Comparative Evaluation Of Large Language Models In Vulnerability Detection

Comments