Requirements Traceability Link Recovery via Retrieval-Augmented Generation
[Context and Motivation] Software development produces many interrelated artifacts. Information on the relations between these artifacts eases understanding of the system and enables tasks such as change impact analysis and software reusability analysis. Creating trace links manually is labor-intensive and costly, so such links are often missing in projects. Automation could improve the efficiency of development and maintenance. [Question/Problem] Current methods for automatically recovering traceability links between different types of requirements either do not achieve the performance necessary for practical application or require pre-existing links as training data for machine learning. [Principal Ideas and Results] We propose to address this limitation by leveraging large language models (LLMs) with retrieval-augmented generation (RAG) for inter-requirements traceability link recovery. In an empirical evaluation on six benchmark datasets, we show that chain-of-thought prompting can be beneficial, that open-source models perform comparably to proprietary ones, and that the approach can outperform state-of-the-art and baseline approaches. [Contribution] This work presents an approach for inter-requirements traceability link recovery using RAG and provides the first empirical evidence of its performance. The performance improvements, however, may not be sufficient to fully automate inter-requirements traceability link recovery in practice.