An End-to-End Framework Towards Improving RAG (Retrieval-Augmented Generation) Based Application Performance

EasyChair Preprint 15614
19 pages • Date: December 20, 2024

Abstract

Retrieval-Augmented Generation (RAG) is a framework designed to address the limitations of Large Language Models (LLMs) in business domain awareness and knowledge cutoff. Because it can overcome these limitations, adoption of RAG has grown rapidly in recent times. However, RAG itself comprises several techniques and brings its own challenges. Common ones include producing a suboptimal response or a mismatched output format, failing to refer to the most important sources, and failing to retrieve the appropriate paragraphs or contexts. As a result, the response accuracy of RAG-based applications deteriorates. A recommendation system is therefore needed to help users choose the most appropriate method for a specific scenario. This paper proposes an end-to-end performance improvement framework for RAG-based applications that helps users select the optimal approach based on the current evaluation score and other constraints. The evaluation score is calculated from well-known metrics: groundedness, answer relevance, and context relevance. The framework is a collection of techniques applied iteratively at different stages of RAG with the goal of improving the overall score. Because embeddings are the backbone of RAG, recommending a right-fit embedding model is part of the overall framework.

Keyphrases: Embedding model selection, Large Language Model (LLM), Retrieval Augmented Generation (RAG) improvement framework, Vector Database
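
As a rough illustration of the evaluation score mentioned in the abstract, the following Python sketch combines groundedness, answer relevance, and context relevance into one overall score and reports the weakest metric. The abstract does not say how the three metrics are aggregated or scaled, so the unweighted mean, the [0, 1] range, and the names `RagScores` and `weakest_metric` are assumptions made for illustration, not the paper's method.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class RagScores:
    """The three metrics named in the abstract, each assumed to lie in [0, 1]."""

    groundedness: float
    answer_relevance: float
    context_relevance: float

    def overall(self) -> float:
        # Assumption: unweighted mean. The abstract does not specify the aggregation.
        return mean([self.groundedness, self.answer_relevance, self.context_relevance])


def weakest_metric(scores: RagScores) -> str:
    """Return the name of the lowest-scoring metric (hypothetical helper).

    An iterative improvement loop could use this as a hint for which
    RAG stage to tune next; ties resolve to the first field declared.
    """
    values = vars(scores)
    return min(values, key=values.get)
```

Under these assumptions, a low `context_relevance` would point at the retrieval stage (for example the embedding model or chunking), while a low `groundedness` would point at the generation stage.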