Ana gezinime geç Aramaya geç Ana içeriğe geç

An Empirical Evaluation of Retrieval, Reranking, and Similarity for a Q&A-Based Retrieval Augmented Generation System

  • Harun Elkiran
  • , Jawad Rasheed*
  • *Bu çalışma için yazışmadan sorumlu yazar

Araştırma sonucu: Dergiye katkıMakalebilirkişi

Özet

Retrieval-Augmented Generation (RAG) has emerged as a fundamental paradigm for improving Large Language Models (LLMs) by incorporating external knowledge retrieval. RAG primarily aims to address the hallucination problem in LLMs that rely on extensive knowledge bases. A RAG system depends critically on design choices, including indexing strategies, retrieval methods, similarity metrics, and reranking models. The selection of configuration makes a RAG effective. Although the RAG system has received sufficient attention, there is very limited work on understanding the relative contributions of these components, and their statistical significance remains insufficiently understood. In this study, we conduct a comprehensive empirical evaluation of a modular RAG pipeline by systematically varying index structures, retrievers, rerankers, and similarity metrics. We evaluated performance using standard retrieval metrics such as Recall, Mean Reciprocal Rank, Normalized Discounted Cumulative Gain, and Coverage; generation-oriented quality metrics such as Correctness, Faithfulness, and Relevance; latency; and cost. Statistical robustness is ensured through ANOVA, effect size estimation, and multivariate regression analysis. Based on our results, the retriever and similarity metric choices dominate system performance, yielding statistically significant improvements with p -values less than 10-9 for retriever effects on R@1 and Coverage. At the same time, index selection exhibits a negligible impact across most metrics. Reranking primarily affects reranked metrics and downstream correctness, with MiniLM consistently outperforming BGE.

Orijinal dilİngilizce
Sayfa (başlangıç-bitiş)26053-26066
Sayfa sayısı14
DergiIEEE Access
Hacim14
DOI'lar
Yayın durumuYayınlandı - 2026
Harici olarak yayınlandıEvet

Bibliyografik not

Publisher Copyright:
© 2013 IEEE.

Parmak izi

An Empirical Evaluation of Retrieval, Reranking, and Similarity for a Q&A-Based Retrieval Augmented Generation System' araştırma başlıklarına git. Birlikte benzersiz bir parmak izi oluştururlar.

Alıntı Yap