In-Context Retrieval-Augmented Language Models

https://papers.cool/arxiv/2302.00083

Authors: Ori Ram, Yoav Levine, Itay Dalmedigos, Dor Muhlgay, Amnon Shashua, Kevin Leyton-Brown, Yoav Shoham

Summary: Retrieval-Augmented Language Modeling (RALM) methods, which condition a language model (LM) on relevant documents from a grounding corpus during generation, have been shown to significantly improve language modeling performance. In addition, they can mitigate the problem of factually inaccurate text generation and provide a natural source attribution mechanism. Existing RALM approaches focus on modifying the LM architecture in order to facilitate the incorporation of external information, significantly complicating deployment. This paper considers a simple alternative, which we dub In-Context RALM: leaving the LM architecture unchanged and prepending grounding documents to the input, without any further training of the LM. We show that In-Context RALM built on off-the-shelf general-purpose retrievers provides surprisingly large LM gains across model sizes and diverse corpora. We also demonstrate that the document retrieval and ranking mechanism can be specialized to the RALM setting to further boost performance. We conclude that In-Context RALM has considerable potential to increase the prevalence of LM grounding, particularly in settings where a pretrained LM must be used without modification or even via API access.


Q: What problem does this paper aim to solve?

A: This paper proposes a method called In-Context Retrieval-Augmented Language Modeling (In-Context RALM), which aims to address several key problems that conventional language models (LMs) face when generating text:

  1. Limited access to external knowledge: conventional LMs typically cannot directly consult external knowledge sources during generation, which limits their effectiveness in scenarios that require up-to-date or domain-specific knowledge.
  2. Factual accuracy: LMs may generate factually incorrect or inaccurate text, especially in uncommon domains or over private data.
  3. Lack of source attribution: text generated by conventional LMs usually comes with no explicit information source, making it hard for users to verify its accuracy and reliability.

To address these problems, In-Context RALM augments the model by retrieving relevant documents during generation and conditioning the LM on them. The method requires neither changes to the LM architecture nor any additional training, which simplifies deployment while improving the accuracy and reliability of the model across a range of text generation tasks.
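The mechanism described above can be sketched in a few lines. This is a minimal toy illustration, not the paper's implementation: the scoring function below is a made-up lexical-overlap stand-in for the off-the-shelf retrievers (e.g. BM25 or dense retrievers) the paper evaluates, and the helper names are hypothetical.

```python
def score(query, doc):
    # Toy lexical-overlap score standing in for an off-the-shelf retriever.
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / max(len(d), 1) ** 0.5

def retrieve(query, corpus, k=1):
    # Return the top-k corpus documents under the toy score.
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]

def build_ralm_input(prefix, corpus):
    # Core of In-Context RALM: prepend the top retrieved document to the
    # prefix, then feed [document; prefix] to a frozen, unmodified LM.
    top_doc = retrieve(prefix, corpus, k=1)[0]
    return top_doc + "\n\n" + prefix

corpus = [
    "The Eiffel Tower is located in Paris, France.",
    "Photosynthesis converts light energy into chemical energy.",
]
prompt = build_ralm_input("The Eiffel Tower is located in", corpus)
```

Because the LM only sees a longer input string, the same construction works with any pretrained model, including ones accessible solely via an API.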


Q: What related work is there?

A: Related work falls into two main categories: nearest neighbor language models (kNN-LM) and retrieve-and-read models:

  1. Nearest neighbor language models (kNN-LM)
  2. Retrieve-and-read models
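For contrast with the in-context approach, the kNN-LM family (Khandelwal et al., 2020) changes the output distribution rather than the input: the LM's next-token distribution is interpolated with a distribution built from nearest-neighbor lookups in a datastore. A sketch of that interpolation, with made-up toy distributions:

```python
def knn_lm_interpolate(p_lm, p_knn, lam=0.25):
    # kNN-LM mixture: p(w | x) = lam * p_kNN(w | x) + (1 - lam) * p_LM(w | x)
    return {w: lam * p_knn.get(w, 0.0) + (1 - lam) * p_lm.get(w, 0.0)
            for w in set(p_lm) | set(p_knn)}

p_lm = {"paris": 0.6, "london": 0.3, "rome": 0.1}   # base LM distribution (toy)
p_knn = {"paris": 1.0}  # all retrieved neighbors continue with "paris" (toy)
mixed = knn_lm_interpolate(p_lm, p_knn, lam=0.25)   # mixed["paris"] == 0.70
```

Note that this requires access to the model's output probabilities at every decoding step, which is exactly the kind of intervention In-Context RALM avoids.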

This paper's work differs from the studies above in that it involves no further training of the language model. Instead it adopts a simple document-reading mechanism, prepending the selected documents directly to the input prefix, an approach dubbed In-Context RALM. In addition, the paper focuses on how to choose suitable documents so as to improve the language model's performance.


Q: How does the paper solve this problem?

A: The paper proposes a framework called In-Context Retrieval-Augmented Language Modeling (In-Context RALM) to address the problems above. Its core idea is to augment the language model (LM), without changing its architecture, by prepending relevant documents to the input text. Specifically, In-Context RALM's solution involves the following key steps: