Research · Apr 23, 2026

Apple presents parallel RNN training, improved state space models, and unified vision models at ICLR 2026

The company is showcasing five research contributions at the conference in Rio de Janeiro, including a framework for parallelized RNN training that achieves 665× speedup and enables competitive 7-billion-parameter language models.



TL;DR
  • Apple will present research at ICLR 2026 on parallel RNN training (ParaRNN), state space models augmented with external tools, unified image understanding and generation (Manzano), 3D scene generation from photos, and protein folding approaches.
  • The ParaRNN paper, accepted as an oral presentation at ICLR, describes a framework that achieves a 665× speedup over sequential RNN training and enables 7-billion-parameter classical RNNs with language modeling performance competitive with transformers.
  • State space model research shows SSMs fail on complex long-form generation due to bounded memory but can achieve length generalization when given external tool access for arithmetic, reasoning, and coding tasks.
  • Manzano, a unified multimodal model, uses a hybrid vision tokenizer with separate adapters for image understanding and generation to reduce performance trade-offs between the two capabilities.

Apple researchers are presenting five major research contributions at the Fourteenth International Conference on Learning Representations (ICLR 2026) in Rio de Janeiro this week, advancing work in efficient sequence modeling, multimodal vision-language systems, and 3D generation.

The centerpiece of Apple's presentation is ParaRNN, a framework for parallelizing training of recurrent neural networks. Historically, RNNs have been efficient for inference but difficult to scale due to the sequential nature of their computation. Apple's new approach achieves a 665× speedup over traditional sequential RNN training, making it feasible to train classical RNNs with up to 7 billion parameters. Benchmarking shows these large-scale RNNs achieve language modeling performance competitive with transformers of comparable size, potentially opening new architectural choices for practitioners building models under computational constraints. The ParaRNN codebase has been released as open source.
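The general idea of recasting a sequential recurrence as a parallelizable computation can be illustrated with the classic linear case, h_t = a_t·h_{t−1} + b_t, which admits an associative-scan formulation. The sketch below is a toy illustration of that principle, not Apple's ParaRNN algorithm, which targets nonlinear RNNs:

```python
import numpy as np

def sequential_scan(a, b, h0=0.0):
    """Reference: h_t = a_t * h_{t-1} + b_t, computed one step at a time."""
    h, out = h0, []
    for at, bt in zip(a, b):
        h = at * h + bt
        out.append(h)
    return np.array(out)

def parallel_scan(a, b):
    """Same recurrence via an inclusive scan over the associative operator
    (a1, b1) o (a2, b2) = (a1*a2, a2*b1 + b2), in O(log T) doubling rounds."""
    a, b = a.astype(float).copy(), b.astype(float).copy()
    T, step = len(a), 1
    while step < T:
        # shift by `step`, padding with the operator's identity (1, 0)
        a_shift = np.concatenate([np.ones(step), a[:-step]])
        b_shift = np.concatenate([np.zeros(step), b[:-step]])
        a, b = a * a_shift, a * b_shift + b
        step *= 2
    return b  # with h0 = 0, b now holds h_1..h_T
```

With T timesteps, the sequential loop requires T dependent steps, while the scan needs only about log₂ T rounds of elementwise work, which is what makes this style of reformulation pay off on parallel hardware.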

In parallel work on state space models (SSMs), Apple researchers identify and address a fundamental limitation: SSMs excel at long-context inference thanks to their fixed-size memory and linear computational scaling, but that same fixed memory prevents them from solving problems whose state requirements exceed it, even with chain-of-thought generation. The research demonstrates that providing SSMs with external tool access, such as memory tools or code execution, lets them generalize to arbitrary problem length and complexity on arithmetic, reasoning, and coding tasks.
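The tool-augmentation setup can be pictured as a generate-call-resume loop: the model either emits a final answer or a tool invocation, and the tool's result is appended to the context before generation continues. The harness below is a hypothetical sketch; the `CALL` syntax, tool registry, and `model_step` stand-in are illustrative, not from the paper:

```python
def run_tool_call(call):
    """Minimal tool registry: exact arithmetic offloaded to Python integers."""
    name, *args = call.split()
    if name == "add":
        return str(int(args[0]) + int(args[1]))
    raise ValueError(f"unknown tool: {name}")

def generate_with_tools(model_step, prompt, max_steps=32):
    """model_step is a stand-in for the model's next-action function:
    it reads the transcript and returns either 'CALL <tool> <args>' or an answer."""
    transcript = prompt
    for _ in range(max_steps):
        action = model_step(transcript)
        if action.startswith("CALL "):
            result = run_tool_call(action[5:])
            transcript += f"\n[tool] {result}"  # feed result back into context
        else:
            return action  # model produced a final answer
    return transcript  # budget exhausted
```

The key point mirrored here is that the model's bounded state never has to hold the full intermediate computation; the tool output re-enters the context, so problem size is decoupled from memory capacity.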

Apple's Manzano model tackles a design trade-off in unified vision-language systems. Many existing multimodal models must choose between strong image understanding or strong generation. Manzano uses a hybrid tokenizer with separate lightweight adapters that feed a shared semantic space: one adapter produces continuous embeddings for understanding, the other produces discrete tokens for generation. A unified autoregressive language model predicts both text and image tokens, with an auxiliary diffusion decoder converting image tokens to pixels. This design achieves state-of-the-art results among unified models while remaining competitive with specialist systems, particularly on text-heavy evaluation.
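The hybrid-tokenizer split described above can be sketched as two lightweight heads over one shared encoder: a continuous projection for understanding and a nearest-codebook quantizer producing discrete tokens for generation. All names and shapes below are illustrative stand-ins, not Manzano's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

class HybridVisionTokenizer:
    """Toy sketch: one shared backbone, two adapters feeding a shared space."""

    def __init__(self, dim=16, codebook_size=8):
        self.proj_understand = rng.normal(size=(dim, dim))     # continuous adapter
        self.codebook = rng.normal(size=(codebook_size, dim))  # for discrete tokens

    def encode_shared(self, image_feats):
        # placeholder for a shared vision backbone (e.g., a ViT)
        return image_feats

    def understand(self, image_feats):
        # continuous embeddings consumed by the LM's understanding pathway
        return self.encode_shared(image_feats) @ self.proj_understand

    def generate_tokens(self, image_feats):
        # nearest-codebook quantization -> discrete image token ids,
        # which an autoregressive LM can predict alongside text tokens
        z = self.encode_shared(image_feats)
        dists = ((z[:, None, :] - self.codebook[None, :, :]) ** 2).sum(-1)
        return dists.argmin(axis=1)
```

The design choice this mirrors is that the two pathways share an encoder and semantic space, so improving one adapter does not force a representational compromise on the other; in the full system, a diffusion decoder would map the discrete tokens back to pixels.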

Apple is also demonstrating local LLM inference on Apple silicon using the MLX framework and techniques for fast 3D scene synthesis from single images. The company is sponsoring affinity group events supporting underrepresented groups in machine learning research.

Sources
  1. Apple Machine Learning Research, "Apple Machine Learning Research at ICLR 2026"

Stories may contain errors. Dispatch is assembled with AI assistance and curated by human editors; despite the trust-score filter, mistakes happen. We correct publicly — every article links to its revision history. Nothing here is financial, legal, or medical advice. Verify before relying on any claim.

© 2026 Dispatch. No ads. No sponsorships. No paid placement. Reader-supported via Ko-fi.

Built by a person who cares about honest AI news.