
MIT's CompreSSM Uses Control Theory to Compress AI Models During Training, Achieving 4x Speedups

A new technique from MIT CSAIL applies Hankel singular values to identify and remove unnecessary model components while training is still underway.

Daniel Park, AI Correspondent

Researchers at MIT CSAIL, in collaboration with the Max Planck Institute, ETH Zurich, and Liquid AI, have developed a technique called CompreSSM that can compress AI models during training rather than after — a shift that could significantly reduce the time and compute required to produce efficient models.

How It Works

CompreSSM borrows a concept from control theory called Hankel singular values. In control systems engineering, these values measure how much each internal state of a dynamical system contributes to its overall input-output behavior. The MIT team applied this mathematical framework to AI models, using Hankel singular values to identify which components of a model contribute meaningfully to its outputs and which can be safely removed.
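To make the idea concrete, here is a minimal sketch of how Hankel singular values can be computed for a small linear state-space system. It is a textbook control-theory calculation, not the CompreSSM code. The function name hankel_singular_values and the example numbers are hypothetical, and the sketch assumes a discrete-time system with a stable, diagonal state matrix, which makes the two Gramians available in closed form.

```python
import numpy as np


def hankel_singular_values(a_diag, B, C):
    """Hankel singular values of x[k+1] = A x[k] + B u[k], y[k] = C x[k].

    Assumption for this sketch: A = diag(a_diag) with every |a_i| < 1.
    B has shape (n, m) and C has shape (p, n).
    """
    a = np.asarray(a_diag, dtype=float)
    B = np.asarray(B, dtype=float)
    C = np.asarray(C, dtype=float)
    if np.any(np.abs(a) >= 1.0):
        raise ValueError("all |a_i| must be < 1 for the Gramians to exist")

    # For diagonal A, the infinite sums that define the Gramians are
    # geometric series, so each entry is divided by (1 - a_i * a_j).
    denom = 1.0 - np.outer(a, a)
    P = (B @ B.T) / denom  # controllability Gramian
    Q = (C.T @ C) / denom  # observability Gramian

    # Hankel singular values are the square roots of the eigenvalues of P Q.
    eig = np.linalg.eigvals(P @ Q).real
    return np.sort(np.sqrt(np.clip(eig, 0.0, None)))[::-1]


# Hypothetical 3-state example: the third state is only weakly read out by C.
hsv = hankel_singular_values(
    [0.9, 0.5, 0.1],
    [[1.0], [1.0], [1.0]],
    [[1.0, 0.5, 0.01]],
)
print(hsv)
```

The values come back sorted from largest to smallest. States associated with very small values have little effect on the input-output behavior, which is the property a compression method can exploit.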

The key innovation is that this analysis happens during training, not after. Traditional model compression techniques train a full-sized model first and then prune or distill it into a smaller version. CompreSSM integrates compression into the training process itself, which means smaller, faster models can be produced without first investing the full compute budget of a large model.
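One way to picture the decision being made during training is as a rank-selection step over the current Hankel singular values. The sketch below is an illustration under stated assumptions, not the procedure from the paper: it uses the classical balanced-truncation bound, under which the worst-case error of a reduced model is at most twice the sum of the discarded Hankel singular values. The function name choose_state_dim, the tolerance, and the sample values are all hypothetical.

```python
import numpy as np


def choose_state_dim(hsv, tol):
    """Smallest state dimension r such that 2 * sum(hsv[r:]) <= tol.

    hsv: Hankel singular values in any order.
    tol: acceptable value of the classical balanced-truncation error bound.
    """
    if tol < 0:
        raise ValueError("tol must be non-negative")
    s = np.sort(np.asarray(hsv, dtype=float))[::-1]  # largest first

    # tail[r] is the sum of the values that would be discarded if r are kept.
    tail = np.concatenate([np.cumsum(s[::-1])[::-1], [0.0]])

    # tail is non-increasing, so the first index meeting the bound is minimal.
    ok = np.nonzero(2.0 * tail <= tol)[0]
    return int(ok[0])


# Hypothetical values: two dominant states and two nearly irrelevant ones.
example_hsv = [4.0, 2.0, 0.1, 0.01]
print(choose_state_dim(example_hsv, tol=0.3))
```

In an in-training setting, a check like this would be rerun periodically on the current weights, so the model can shrink as soon as the small values separate from the large ones instead of waiting for training to finish.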

Results

On the Mamba architecture (a state-space model alternative to transformers), CompreSSM achieved:

  • Approximately 4x training speedup
  • Compression from 128 dimensions to roughly 12 dimensions while maintaining functional equivalence
  • On image classification tasks: near-equivalent accuracy at 1.5x faster training
  • 40x faster than Hankel nuclear norm regularization, the previous best approach

Lead author Makram Chahine and co-author Daniela Rus (director of MIT CSAIL) published the findings on April 9, 2026.

Why It Matters

The cost of training frontier AI models has been rising sharply, with the largest training runs now costing hundreds of millions of dollars. Any technique that reduces training time translates directly into cost savings, and CompreSSM's 4x speedup, if it generalizes to larger models and different architectures, could meaningfully change the economics of AI development.

The approach is also significant because it applies to state-space models like Mamba, which are emerging as potential alternatives to the transformer architecture that dominates current AI systems. As the field explores post-transformer architectures, efficient training techniques specific to these new designs become increasingly valuable.

Limitations and Next Steps

CompreSSM has been demonstrated primarily on state-space models, and its applicability to transformer-based architectures — which power the majority of current frontier models — remains to be shown. The technique's effectiveness at the scale of frontier models (hundreds of billions of parameters) is also untested, though the mathematical principles are architecture-agnostic.

The research represents part of a broader effort at MIT and its collaborators to make AI development more computationally efficient — an increasingly important goal as the gap between the compute haves and have-nots continues to widen.

© 2026 Lanceum. All rights reserved.
