NoteLLM: A Retrievable Large Language Model for Note Recommendation

https://papers.cool/arxiv/2403.01744

Authors: Chao Zhang, Shiwei Wu, Haoxin Zhang, Tong Xu, Yan Gao, Yao Hu, Enhong Chen

Summary: People enjoy sharing "notes" about their experiences within online communities. Therefore, recommending notes aligned with user interests has become a crucial task. Existing online methods simply feed notes into BERT-based models to generate note embeddings for assessing similarity. However, they may underutilize some important cues, e.g., hashtags or categories, which represent the key concepts of notes. Indeed, learning to generate hashtags/categories can potentially enhance note embeddings, since both tasks compress key note information into limited content. Besides, Large Language Models (LLMs) have significantly outperformed BERT in understanding natural language, so it is promising to introduce LLMs into note recommendation. In this paper, we propose a novel unified framework called NoteLLM, which leverages LLMs to address the item-to-item (I2I) note recommendation task. Specifically, we utilize a Note Compression Prompt to compress a note into a single special token, and further learn the embeddings of potentially related notes via a contrastive learning approach. Moreover, we use NoteLLM to summarize notes and generate hashtags/categories automatically through instruction tuning. Extensive validation in real scenarios demonstrates the effectiveness of our proposed method compared with the online baseline and shows major improvements in the recommendation system of Xiaohongshu.
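The note-compression idea can be sketched as a prompt template: the note's fields are serialized into the prompt, and a trailing special token (written here as [EMB]) marks the position whose final hidden state serves as the note embedding. The template wording and field names below are illustrative assumptions, not the paper's exact prompt:

```python
def build_compression_prompt(title, content, hashtags=None):
    """Assemble a note-compression prompt (illustrative sketch).

    The note's fields are serialized into the prompt body, and a trailing
    special token [EMB] is appended; an LLM's hidden state at that token
    position can then be used as the note's embedding. The exact wording
    is a hypothetical stand-in for the paper's Note Compression Prompt.
    """
    note = {"title": title, "content": content}
    if hashtags:
        note["hashtags"] = hashtags
    body = "; ".join(f"{k}: {v}" for k, v in note.items())
    return f"Note: {{{body}}}. Compress the note into one word: [EMB]"
```

In practice, [EMB] would be registered as an additional special token in the LLM's tokenizer so it maps to a single token ID.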



Q: What problem does this paper try to solve?

A: This paper proposes NoteLLM, a retrievable Large Language Model designed to recommend notes that match user interests in online communities. Specifically, it tackles the following key problems:

  1. Limitations of existing methods: current online methods simply feed notes into BERT-based models to generate note embeddings and recommend based on embedding similarity. These methods may underuse important cues, such as hashtags or categories, which represent a note's core concepts.
  2. Improving note embedding quality: learning to generate hashtags/categories can potentially strengthen note embeddings, since both tasks compress a note's key information into limited content.
  3. Leveraging the strengths of LLMs: LLMs significantly outperform BERT at natural language understanding but are still rarely applied to note recommendation. The paper introduces LLMs into note recommendation to improve system performance.
  4. A multi-task learning framework: the paper proposes a unified multi-task framework that jointly handles item-to-item (I2I) note recommendation and hashtag/category generation to enhance note embeddings.
  5. Real-world validation: extensive experiments in real scenarios demonstrate the effectiveness of the proposed method over the online baseline, with significant improvements in Xiaohongshu's recommendation system.

In summary, the NoteLLM framework combines the strong language understanding of LLMs with a multi-task learning strategy to improve the accuracy and user experience of note recommendation.
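The contrastive side of the framework described above can be sketched as a standard in-batch InfoNCE loss over pairs of related note embeddings. The NumPy formulation and temperature value are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def info_nce_loss(query_emb, key_emb, temperature=0.07):
    """In-batch contrastive (InfoNCE) loss over note embeddings (sketch).

    query_emb, key_emb: (batch, dim) arrays where row i of each matrix
    embeds two related notes (a positive pair); all other rows in the
    batch act as in-batch negatives. The temperature is an illustrative
    default, not a value taken from the paper.
    """
    # L2-normalize so the dot product is cosine similarity.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    k = key_emb / np.linalg.norm(key_emb, axis=1, keepdims=True)
    logits = q @ k.T / temperature               # (batch, batch) similarities
    # Cross-entropy with positives on the diagonal.
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.diag(log_probs).mean()
```

A multi-task setup would add a generation loss (hashtag/category prediction via instruction tuning) to this contrastive term with a weighting coefficient.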

Q: What related work is there?

A: The paper discusses several research areas and specific works related to NoteLLM, mainly including:

  1. I2I recommendation (item-to-item recommendation)
  2. Applications of LLMs in recommender systems