
Evaluating retriever reranker pairings in RAG based on quality and efficiency trade-offs

  • Istanbul Sabahattin Zaim University
  • Istanbul Medipol University
  • Nisantasi Universitesi

Research output: Contribution to journal, Article, peer-review

Abstract

Large language models (LLMs) are at the core of many artificial intelligence (AI) systems. A key problem with these systems is hallucination (i.e., making up facts). Retrieval-Augmented Generation (RAG) addresses this problem by grounding responses in external knowledge sources, thereby improving their factual accuracy. A RAG system consists of two core components: the information retrieval component (retriever and reranker) and the text generation component (LLM). The efficacy of a RAG system therefore depends on its retrieval strategy, reranking mechanism, and generation model. In this study, we conduct a systematic evaluation of nine retriever–reranker configurations, formed by pairing three retrievers (Fusion, HyDE, and HyPE) with three rerankers (BGE, MiniLM, and GPT-4o-mini), within a controlled RAG framework. Our analysis extends beyond traditional retrieval metrics: alongside Mean Reciprocal Rank (MRR), we evaluate generation correctness, faithfulness, relevance, cost, and latency. Results show that LLM-based reranking consistently improves downstream generation quality, with the HyPE + GPT-4o-mini configuration achieving the highest overall performance, with correctness and relevance scores of 0.8012 and 0.9267, respectively, and the only positive MRR gain. While cross-encoder rerankers offer lower latency and cost, they exhibit a measurable decline in answer quality.
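To make the evaluation grid and the MRR metric concrete, the following is a minimal sketch rather than the authors' implementation. It enumerates the nine retriever–reranker pairings named in the abstract and computes MRR with the standard definition (the mean over queries of 1/rank of the first relevant document). The function names, the toy rankings, and the reading of "MRR gain" as MRR after reranking minus MRR before reranking are assumptions made for illustration; the abstract does not specify them.

```python
from itertools import product

# Component names are taken from the abstract; everything else is illustrative.
RETRIEVERS = ["Fusion", "HyDE", "HyPE"]
RERANKERS = ["BGE", "MiniLM", "GPT-4o-mini"]


def configurations():
    """Return every (retriever, reranker) pairing: 3 x 3 = 9 configurations."""
    return list(product(RETRIEVERS, RERANKERS))


def reciprocal_rank(ranked_ids, relevant_ids):
    """Return 1/rank of the first relevant document, or 0.0 if none is found."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average the reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


def mrr_gain(before, after):
    """ASSUMPTION: gain = MRR after reranking minus MRR before reranking."""
    return mean_reciprocal_rank(after) - mean_reciprocal_rank(before)


# Made-up toy data (not from the paper): two queries, one relevant document each.
TOY_BEFORE = [(["d3", "d1", "d2"], {"d1"}), (["d5", "d4"], {"d4"})]
TOY_AFTER = [(["d1", "d3", "d2"], {"d1"}), (["d5", "d4"], {"d4"})]

if __name__ == "__main__":
    for retriever, reranker in configurations():
        print(f"{retriever} + {reranker}")
    print("MRR before:", mean_reciprocal_rank(TOY_BEFORE))
    print("MRR after: ", mean_reciprocal_rank(TOY_AFTER))
    print("MRR gain:  ", mrr_gain(TOY_BEFORE, TOY_AFTER))
```

In the toy data, reranking moves the relevant document for the first query from rank 2 to rank 1 and leaves the second query unchanged, so MRR rises from 0.5 to 0.75. Under this reading, a negative gain would mean the reranker pushed the first relevant document further down the list on average.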

Original language: English
Article number: 259
Journal: Discover Computing
Volume: 29
Issue number: 1
DOIs
Publication status: Published - Dec 2026
Externally published: Yes

Bibliographical note

Publisher Copyright:
© The Author(s) 2026.

Keywords

  • Information retrieval
  • LLM
  • RAG
  • Reranking
  • Retriever
