https://papers.cool/arxiv/2307.07164
Authors: Liang Wang; Nan Yang; Furu Wei
Summary: Large language models (LLMs) have demonstrated their ability to learn in-context, allowing them to perform various tasks based on a few input-output examples. However, the effectiveness of in-context learning is heavily reliant on the quality of the selected examples. In this paper, we propose a novel framework to iteratively train dense retrievers that can identify high-quality in-context examples for LLMs. Our framework initially trains a reward model based on LLM feedback to evaluate the quality of candidate examples, followed by knowledge distillation to train a bi-encoder based dense retriever. Our experiments on a suite of 30 tasks demonstrate that our framework significantly enhances in-context learning performance. Furthermore, we show the generalization ability of our framework to unseen tasks during training. An in-depth analysis reveals that our model improves performance by retrieving examples with similar patterns, and the gains are consistent across LLMs of varying sizes. The code and data are available at https://github.com/microsoft/LMOps/tree/main/llm_retriever.
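The bi-encoder mentioned in the abstract embeds the query and each candidate example independently, so candidate embeddings can be precomputed once and scored with a single dot product. Below is a minimal sketch of that retrieval step in PyTorch; the checkpoint name, mean pooling, and cosine scoring are assumptions for illustration, not necessarily the paper's exact configuration.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

# Assumed checkpoint for illustration; any sentence encoder with the
# same interface would work here.
ENCODER = "intfloat/e5-base-v2"

tokenizer = AutoTokenizer.from_pretrained(ENCODER)
encoder = AutoModel.from_pretrained(ENCODER)

@torch.no_grad()
def embed(texts):
    """Mean-pool the last hidden states into one L2-normalized vector per text."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state            # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)           # (B, T, 1)
    pooled = (hidden * mask).sum(1) / mask.sum(1)          # (B, H)
    return F.normalize(pooled, dim=-1)

def retrieve_top_k(query, candidate_pool, k=8):
    """Score candidates by dot product of normalized embeddings (cosine)."""
    q = embed([query])                                     # (1, H)
    c = embed(candidate_pool)                              # (N, H)
    scores = (q @ c.T).squeeze(0)                          # (N,)
    top = torch.topk(scores, k=min(k, len(candidate_pool)))
    return [candidate_pool[i] for i in top.indices.tolist()]
```

Because the two encoders are decoupled, the candidate side can be indexed offline; only the query is embedded at inference time, which is what makes this architecture practical as a retriever.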
Q: What problem does this paper attempt to solve?
A: The paper addresses the problem of effectively retrieving high-quality in-context examples for large language models (LLMs) so as to strengthen their in-context learning (ICL) performance. In in-context learning, an LLM can perform a wide variety of tasks given only a few input-output examples, without any update or fine-tuning of its parameters. The effectiveness of in-context learning, however, depends heavily on the quality of the provided examples: if the examples are not representative of the target task, the model may fail to learn it. Existing methods either focus on relatively small language models or do not exploit the fine-grained feedback from LLMs in a principled way. The paper therefore proposes a new framework, LLM-R (LLM Retriever), for retrieving high-quality in-context examples for large language models.
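To make the in-context learning setup concrete, the sketch below assembles retrieved input-output pairs into a few-shot prompt for the LLM; the template and helper name are illustrative, not the paper's exact prompt format.

```python
def build_icl_prompt(examples, test_input):
    """Format retrieved (input, output) pairs as few-shot demonstrations,
    then append the test input. The LLM's parameters are never updated;
    it conditions on the demonstrations alone.
    """
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in examples]
    blocks.append(f"Input: {test_input}\nOutput:")
    return "\n\n".join(blocks)

# Example: sentiment classification with two retrieved demonstrations.
demos = [
    ("The movie was a delight from start to finish.", "positive"),
    ("I regretted buying a ticket.", "negative"),
]
print(build_icl_prompt(demos, "An unforgettable performance."))
```

Everything in the retrieval pipeline ultimately feeds this step: the better the retrieved demonstrations match the target task, the more useful the conditioning signal.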
Q: What related research exists?
A: Related research concentrates mainly in the following areas:
These lines of work provide the theoretical grounding and methodology for the LLM-R framework, in particular on how to use LLM feedback to train a dense retriever and how iterative training can improve retrieval quality.
Q: How does the paper solve this problem?
A: The paper proposes a novel framework named LLM-R (LLM Retriever), which iteratively trains a dense retriever to identify high-quality in-context examples. As summarized in the abstract, each round proceeds as follows: a reward model is first trained on LLM feedback to evaluate the quality of candidate examples, and this reward model is then distilled, via knowledge distillation, into a bi-encoder based dense retriever; repeating the procedure iteratively improves the retriever.
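Below is a minimal sketch of what such a distillation objective can look like, assuming each query comes with a shared list of candidates scored by both the reward model (teacher) and the retriever (student); the temperature values and the added InfoNCE term are illustrative choices, not necessarily the paper's exact hyperparameters.

```python
import torch
import torch.nn.functional as F

def distillation_loss(retriever_scores, reward_scores, tau_student=0.05, tau_teacher=0.2):
    """KL(teacher || student) over a shared candidate list.

    retriever_scores: (B, N) bi-encoder similarities (student).
    reward_scores:    (B, N) reward-model scores derived from LLM feedback (teacher).
    """
    student_logp = F.log_softmax(retriever_scores / tau_student, dim=-1)
    teacher_p = F.softmax(reward_scores / tau_teacher, dim=-1).detach()
    return F.kl_div(student_logp, teacher_p, reduction="batchmean")

def contrastive_loss(retriever_scores, positive_idx, tau=0.05):
    """InfoNCE: treat the highest-reward candidate as the positive,
    the rest of the list as in-batch negatives."""
    return F.cross_entropy(retriever_scores / tau, positive_idx)

# Toy usage: 2 queries, 4 candidates each.
student = torch.randn(2, 4, requires_grad=True)   # retriever similarities
teacher = torch.randn(2, 4)                        # reward-model scores
loss = distillation_loss(student, teacher) + contrastive_loss(student, teacher.argmax(dim=-1))
loss.backward()
```

The design intuition is that the reward model can afford full cross-attention between query and candidate (it is only used at training time), while the distilled bi-encoder keeps retrieval cheap at inference time.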