Research · Apr 22, 2026

Apple researchers find vision-language models leak task-irrelevant information through logits

A new study systematically examines what information can be extracted from model internals, revealing that even constrained representations like top-k logits retain sensitive data not visible in standard outputs.

TL;DR
  • Apple researchers published a study examining information leakage from vision-language model internals, comparing what data survives compression through different representational bottlenecks
  • The work demonstrates that top-k logits—typically considered less informative than full residual stream projections—can still leak task-irrelevant information from image queries
  • The research uses vision-language models as a testbed to systematically probe information retention across different model layers and compression methods

Apple Machine Learning Research has published a study examining how much information can be extracted from the internal representations of vision-language models, even when those representations appear constrained. The work, authored by Fedzechkina, Gualdoni, Ramos, and Williamson, systematically compares information retention across different compression levels within model architectures.

The researchers focused on two natural information bottlenecks: low-dimensional projections derived from the residual stream via tuned-lens techniques, and the final-layer top-k logits that ultimately determine model outputs. Their key finding is that top-k logits, though simpler and more accessible than raw residual-stream activations, still retain task-irrelevant information from image-based queries, sometimes leaking as much sensitive data as direct projections of the full residual stream.
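
To make the probing setup concrete, the sketch below shows one way such leakage could be measured: compress a model's next-token logits to their top-k values, then train a simple classifier to recover an attribute the query never asks about. This is a minimal illustration, not the paper's method; the VLM call, the dataset variables, the value of k, and the choice of a logistic-regression probe are all assumptions introduced here.

import torch
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

K = 50  # number of top logits retained per query (assumed value)

def topk_logit_features(logits, k=K):
    # Keep only the k largest next-token logits, discarding vocabulary
    # indices, to mimic a constrained, API-like view of the output distribution.
    values, _ = torch.topk(logits, k, dim=-1)
    return values

# Hypothetical data: (image, question, sensitive_label) triples, where the
# label is an attribute irrelevant to the question (e.g. a background object).
# `dataset` and `vlm` are stand-ins, not objects from the paper.
features, labels = [], []
for image, question, sensitive_label in dataset:
    with torch.no_grad():
        logits = vlm(image, question)  # assumed: final-position logits, shape [vocab]
    features.append(topk_logit_features(logits).cpu().numpy())
    labels.append(sensitive_label)

# A probe that scores well above chance indicates the top-k bottleneck
# still carries the task-irrelevant attribute.
X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.2)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probe accuracy on task-irrelevant attribute:", probe.score(X_test, y_test))

The same probing recipe applies to the other bottleneck the study examines: substitute a tuned-lens projection of the residual stream for the top-k feature vector and compare probe accuracies across the two representations.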

The study treats vision-language models as a testbed to understand the broader problem of unintentional or malicious information leakage. Model users and owners may assume that certain outputs or internal states are inaccessible or contain only task-relevant information, but this work suggests those assumptions merit scrutiny. Even bottlenecks designed to compress and filter information can become vectors for privacy violations.

Sources
  1. Apple Machine Learning Research — What Do Your Logits Know?
