LLM Security & Privacy

What? Papers and resources related to the security and privacy of LLMs.

Why? I am reading, skimming, and organizing these papers for my research in this nascent field anyway. So why not share them? I hope it helps anyone looking for quick references or getting into the game.

When? Updated whenever my willpower reaches a certain threshold (aka pretty frequently).

Where? GitHub and Notion. Notion is more up-to-date; I periodically transfer the updates to GitHub.

Who? Me and you (see Contribution below).

Overall Legend

Symbol	Description
⭐	I personally like this paper! (not a measure of any paper’s quality; see interpretation at the end)
💽	Dataset, benchmark, or framework
📍	Position paper
🔭	Survey paper
👁️	Vision-language models
💸	Experiment with closed-source models

Vulnerabilities

Prompt Injection

Ignore the previous instructions…

Ignore Previous Prompt: Attack Techniques For Language Models (2022) [Paper] ⭐ 💸
Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection (2023) [Paper] ⭐ 💸
Demystifying RCE Vulnerabilities in LLM-Integrated Apps (2023) [Paper]
Prompt Packer: Deceiving LLMs through Compositional Instruction with Hidden Attacks (2023) [Paper] 💸
Prompt Injection attack against LLM-integrated Applications (2023) [Paper] 💸
Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game (2023) [Paper] 💽 💸
Assessing Prompt Injection Risks in 200+ Custom GPTs (2023) [Paper] 💸
A Security Risk Taxonomy for Large Language Models (2023) [Paper] 🔭