January 24, 2026

Unintended Memorization of Sensitive Information in Fine-Tuned Language Models

Authors: Marton Szep, Jorge Marin Ruiz, Georgios Kaissis, Paulina Seidl, Rüdiger von Eisenhart-Rothe, Florian Hinterwimmer, Daniel Rueckert (Technical University of Munich, Imperial College London)
arXiv: 2601.17480v1
Published: January 24, 2026
Keywords: PII memorization, LLM privacy, differential privacy, machine unlearning, fine-tuning


This paper investigates a critical privacy vulnerability: LLMs can memorize and leak personally identifiable information (PII) that appears only in the training inputs, never in the training targets. Even when the PII is irrelevant to the downstream task, a fine-tuned model can be prompted into reproducing names, addresses, and other sensitive data, as illustrated in the sketch below.
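
To make the setup concrete, here is a minimal illustrative sketch (not the paper's code): a fine-tuning record where the PII lives only in the input, and a greedy-decoding probe that checks whether a fine-tuned model reproduces it. The model checkpoint name, the record fields, and the probe prefix are all hypothetical placeholders.

```python
# Illustrative sketch of input-only PII memorization (hypothetical names/data).
from transformers import AutoModelForCausalLM, AutoTokenizer

# A fine-tuning record for a summarization-style task: the patient's name and
# address are task-irrelevant and appear only in the input, never in the target.
record = {
    "input": "Patient John Q. Example, 14 Elm Street, presents with knee pain "
             "after a fall. History of prior meniscus surgery.",
    "target": "Knee pain following a fall; prior meniscus surgery.",
}

# After fine-tuning on many such records, an extraction-style probe completes a
# prefix that overlaps the memorized input.
probe = "Patient John Q. Example,"

model_name = "my-org/clinical-summarizer-ft"  # hypothetical fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer(probe, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=20, do_sample=False)
completion = tokenizer.decode(output[0], skip_special_tokens=True)

# If the completion reproduces "14 Elm Street", the model has memorized PII
# that was never part of any training target.
print(completion)
```

The point of the sketch is the asymmetry: nothing in the training objective ever asks the model to produce the address, yet memorization of the input can still surface it under an adversarial prompt.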