Multiverse Computing, a Spanish firm, announced on Thursday that it has raised a massive €189 million (about $215 million) Series B round thanks to a technology it calls “CompactifAI.”
According to the company, CompactifAI is a quantum-computing-inspired compression technique that can reduce the size of LLMs by up to 95% without degrading model performance.
In particular, Multiverse offers compressed versions of popular open-source LLMs, mostly smaller models, including Mistral Small 3.1, Llama 3.3 70B, Llama 3.1 8B, and Llama 4 Scout. The company says it is working on more open-source and reasoning models and plans to release a compressed version of DeepSeek R1 soon. OpenAI's models and other proprietary models are not supported.
Its so-called “slim” variants can be licensed for on-premises use or purchased through Amazon Web Services. According to the company, its models cut inference costs by 50% to 80% and run 4x to 12x faster than their uncompressed counterparts. For example, Multiverse says Llama 4 Scout Slim costs 10 cents per million tokens on AWS, versus 14 cents for Llama 4 Scout.
Some of its models, Multiverse says, can be made small and energy-efficient enough to run on PCs, phones, cars, drones, and even the Raspberry Pi, the mini PC favored by do-it-yourselfers. (Suddenly, we’re picturing those elaborate Raspberry Pi Christmas light houses upgraded with interactive, LLM-powered talking Santas.)
There is real technical depth behind Multiverse. Its co-founder and CTO, Román Orús, a professor at the Donostia International Physics Center in San Sebastián, Spain, is known for his pioneering work on tensor networks, which are unrelated to Google’s Tensor-branded AI initiatives.
Tensor networks are quantum-like computing tools that operate on standard computers. These days, compressing deep learning models is one of their main applications.
Enrique Lizaso Olmos, Multiverse’s CEO and another co-founder, holds several degrees in mathematics and has taught at the university level. He spent most of his career in banking, most notably as deputy CEO of Unnim Bank.
Bullhound Capital, which has supported businesses like Spotify, Revolut, DeliveryHero, Avito, and Discord, led the Series B. The round also included participation from Toshiba, CDP Venture Capital, Santander Climate VC, HP Tech Ventures, SETT, Forgepoint Capital International, and Capital Riesgo de Euskadi – Grupo SPR.
According to Multiverse, it has 100 clients worldwide, including the Bank of Canada, Bosch, and Iberdrola, along with 160 patents. With this round, its total funding to date exceeds $250 million.
What makes Multiverse Computing notable is that leading open-source large language models, such as Llama 4 Scout, Llama 3.3 70B, Llama 3.1 8B, and Mistral Small 3.1, compress exceptionally well under CompactifAI; DeepSeek R1 and other reasoning models are expected shortly. Unlike proprietary models from OpenAI and comparable vendors, which are not supported, these models reach up to 95% compression with only a 2–3% accuracy decrease. The company also claims 4–12x faster inference, 50–80% lower inference costs, 84% better energy efficiency, and a 25% increase in inference speed alongside halved training times.
In contrast to traditional techniques that prune neurons or parameters, CompactifAI targets the model’s correlation space, replacing trainable weight matrices with Matrix Product Operators (MPOs) via successive Singular Value Decompositions (SVDs). This yields an exponential reduction in memory requirements while keeping computational complexity polynomial, producing models that are significantly smaller, faster, and more energy-efficient.
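Multiverse has not published CompactifAI’s implementation, but the core building block it describes, truncating a weight matrix via SVD so it factors into much smaller pieces, can be sketched in a few lines of NumPy. Everything here (the function name `svd_compress`, the matrix sizes, the rank) is illustrative, not Multiverse’s actual code:

```python
import numpy as np

def svd_compress(weight, rank):
    """Truncated SVD: keep only the top-`rank` singular values,
    splitting one dense matrix into two thin factors."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # shape (m, rank)
    b = vt[:rank, :]             # shape (rank, n)
    return a, b

rng = np.random.default_rng(0)
# A toy "weight matrix" with strong low-rank structure plus small noise.
low_rank = rng.standard_normal((512, 16)) @ rng.standard_normal((16, 512))
w = low_rank + 0.01 * rng.standard_normal((512, 512))

a, b = svd_compress(w, rank=16)
original_params = w.size              # 512 * 512 = 262,144
compressed_params = a.size + b.size   # 2 * 512 * 16 = 16,384
rel_error = np.linalg.norm(w - a @ b) / np.linalg.norm(w)

print(f"params: {original_params} -> {compressed_params} "
      f"({100 * (1 - compressed_params / original_params):.0f}% smaller)")
print(f"relative reconstruction error: {rel_error:.4f}")
```

On this toy matrix the factorization stores about 94% fewer parameters with a tiny reconstruction error, which mirrors the kind of trade-off Multiverse claims. Real LLM weights are not this cleanly low-rank, which is presumably why CompactifAI chains successive SVDs into MPO tensor networks rather than applying a single truncation.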
The practical impact, per the company, is substantial: models compressed with CompactifAI achieve 4–12x faster inference, cut inference costs by 50–80%, and improve energy efficiency by 84%, all with only a 2–3% loss in accuracy, while training times are roughly halved. Llama 4 Scout Slim, for instance, costs 10 cents per million tokens on AWS, versus 14 cents for the original.
CompactifAI models are available via cloud deployment on Amazon Web Services, on-premises licensing for businesses that need data sovereignty, and edge deployment for low-resource devices. More than 100 clients across ten industries already use Multiverse’s technology, including the European Tax Agency, Bosch, the Bank of Canada, BBVA, and Iberdrola. The company holds 160 patents in AI and quantum technologies and has been named a Gartner “Cool Vendor” for quantum software in financial services.
Multiverse Computing’s competitors include Classiq, SandboxAQ, QpiAI, Terra Quantum, 1QBit, Zapata AI, CogniFrame, Quantum Mads, and Quantum Motion. While those companies focus on different facets of quantum-AI integration, Multiverse stands out for its quantum-inspired compression of large language models: CompactifAI delivers up to 95% compression with only 2–3% accuracy loss, whereas traditional compression techniques typically incur 20–30% accuracy loss at 50–60% compression rates.
CompactifAI’s cost and efficiency advantages could democratize access to AI over the coming decade by enabling deployment in resource-constrained environments and making sophisticated language models affordable for smaller organizations. The AI inference market is projected to grow from $106 billion in 2025 to $255 billion by 2030, and Multiverse is well positioned to profit from this rapidly expanding industry.
Beyond affordability and ease of use, CompactifAI significantly lowers AI’s energy footprint, addressing pressing environmental issues and supporting international sustainability objectives. In contexts with constrained computational resources, such as remote operations, mobile devices, IoT applications, and autonomous cars, the technology’s scalability and flexibility enable real-time decision-making and resource optimization.
Multiverse’s success demonstrates the practical value of quantum-inspired algorithms running on classical computers, which could help bridge the gap until fault-tolerant quantum computers become widely available.
With a fast-growing customer base and $215 million in new funding, Multiverse Computing is spearheading a shift that could radically transform how AI models are deployed, making them more accessible, economical, and efficient across sectors and applications worldwide.