
A relatively unknown AI startup just made one of the boldest claims the industry has seen this year. If the numbers hold up, it could reshape how enterprises think about video AI.
Bellevue-based startup Perceptron has launched Perceptron Mk1, a new multimodal AI model designed specifically for video analysis and “physical reasoning,” with pricing the company says is 80–90% cheaper than competing offerings from OpenAI, Anthropic, and Google.
That alone would be enough to grab attention.
But what’s making people in the AI industry pay closer attention is not just the pricing; it’s the type of AI Perceptron is trying to build.
Unlike most mainstream models, which are still largely optimized around text and static images, Mk1 is focused on understanding the physical world across time. The model is designed to process live or recorded video streams, track objects across frames, reason about movement and physics, and generate structured outputs like timestamps, counts, spatial annotations, and event detections.
In practical terms, this means the system can do things like:
- Detect suspicious behaviour in surveillance footage
- Automatically clip highlights from sports broadcasts
- Monitor manufacturing defects in real time
- Track robotics movement
- Analyse safety compliance on construction sites
And it can reportedly do all of this at a fraction of the cost of frontier multimodal models.
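The structured outputs mentioned above (timestamps, counts, spatial annotations, event detections) might look something like the following sketch. The schema and field names here are purely illustrative assumptions, not Perceptron's actual API format:

```python
import json

# Hypothetical event-detection payload from a video-analysis model.
# The schema is illustrative only, not Perceptron's actual output format.
raw = json.dumps({
    "stream_id": "cam-07",
    "events": [
        {"type": "person_detected", "timestamp": 12.4,
         "bbox": [220, 135, 310, 400]},  # x1, y1, x2, y2 in pixels
        {"type": "person_detected", "timestamp": 13.1,
         "bbox": [230, 140, 318, 405]},
        {"type": "zone_violation", "timestamp": 47.8,
         "bbox": [50, 60, 120, 200]},
    ],
})

payload = json.loads(raw)

# Downstream code can aggregate detections into simple metrics.
counts = {}
for event in payload["events"]:
    counts[event["type"]] = counts.get(event["type"], 0) + 1

print(counts)  # {'person_detected': 2, 'zone_violation': 1}
```

The appeal of outputs like this is that they plug straight into existing monitoring pipelines: counts feed dashboards, timestamps drive clip extraction, and bounding boxes drive overlays.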
Perceptron says Mk1 costs around $0.15 per million input tokens and $1.50 per million output tokens, dramatically below the pricing tiers associated with leading high-end video-capable AI systems.
That pricing strategy is important because video AI has historically been extremely expensive to scale.
Analysing text is relatively lightweight compared to analysing continuous video streams, especially when reasoning across time and motion. Most enterprises simply couldn’t justify running advanced AI analysis continuously across warehouses, factories, retail cameras, or industrial systems.
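To make the economics concrete, here is a rough back-of-envelope estimate using the listed Mk1 rates ($0.15 per million input tokens, $1.50 per million output tokens). The tokens-per-hour figures are assumptions made up for illustration; real usage depends on resolution, frame sampling, and prompt design:

```python
# Back-of-envelope cost estimate at the listed Mk1 rates.
INPUT_RATE = 0.15 / 1_000_000   # dollars per input token
OUTPUT_RATE = 1.50 / 1_000_000  # dollars per output token

# Assumed figures for illustration only -- not published numbers.
TOKENS_PER_HOUR_VIDEO = 2_000_000  # input tokens per hour of footage
OUTPUT_TOKENS_PER_HOUR = 50_000    # structured-output tokens per hour

def hourly_cost(cameras: int) -> float:
    """Estimated dollars per hour to analyse `cameras` continuous feeds."""
    per_camera = (TOKENS_PER_HOUR_VIDEO * INPUT_RATE
                  + OUTPUT_TOKENS_PER_HOUR * OUTPUT_RATE)
    return cameras * per_camera

# Under these assumptions, 100 cameras running continuously:
print(round(hourly_cost(100), 2))  # 37.5
```

Even if the assumed token counts are off by an order of magnitude, the exercise shows why an 80–90% price cut matters: continuous video analysis is a volume business, and small per-token differences compound across thousands of camera-hours.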
Perceptron is betting that lowering those costs changes the equation entirely.
And the company appears confident enough to directly benchmark itself against some of the biggest names in AI.
According to benchmark results released alongside the launch, Mk1 outperformed or matched models from OpenAI, Google, and Anthropic across several video and spatial reasoning tests, including EmbSpatialBench and VSI-Bench.
One particularly striking result involved referring-expression comprehension (essentially, how well the model understands instructions about specific objects inside scenes), where Perceptron claimed Mk1 dramatically outperformed competing frontier systems.
Of course, benchmarks should always be treated carefully.
Many of the numbers come directly from Perceptron itself, and the model has not yet undergone the same level of independent testing as larger rivals. Still, even cautious observers appear intrigued by the architecture and economics behind the launch.
And that architecture is part of what makes Mk1 interesting.
Most vision-language models today still process video somewhat inefficiently, often treating footage as disconnected image frames. Perceptron says Mk1 was designed specifically around temporal continuity, allowing it to maintain object identity and reason across motion over time rather than just analysing isolated snapshots.
That’s a much harder problem.
It moves AI closer to understanding the real world the way humans do: through sequence, motion, and physical interaction.
Perceptron calls this “physical AI,” a term increasingly used to describe models capable of reasoning about real-world environments rather than purely digital information.
And that may be where the industry is heading next.
For years, the AI race centred around language: chatbots, search, coding, and reasoning. But the next major frontier increasingly looks physical: robotics, industrial automation, autonomous systems, video understanding, and spatial intelligence.
That’s a market potentially worth trillions.
Because once AI systems can reliably interpret the physical world, they become useful far beyond office productivity. They become useful in factories, logistics networks, hospitals, transportation systems, warehouses, and security operations.
And whoever solves that affordably gains a huge advantage.
The founding story behind Perceptron also explains some of its ambitions.
The company was founded by former Meta FAIR researchers Armen Aghajanyan and Akshat Shrivastava, both of whom previously worked on multimodal architectures and efficiency research at Meta.
That pedigree matters because efficiency is quickly becoming one of the most important battlegrounds in AI.
The first phase of the AI boom rewarded whoever could build the most capable models.
The next phase may reward whoever can deliver those capabilities cheaply enough for mass deployment.
And that’s exactly the opening Perceptron is targeting.
The company is also pursuing a dual strategy: Mk1 remains proprietary and API-based for enterprises, while smaller models in its “Isaac” series are being released as open-weight alternatives optimized for edge devices and lower-latency applications.
That split reflects another growing trend in AI: companies increasingly balancing open ecosystems with premium closed models.
Still, challenges remain.
Video AI introduces major concerns around surveillance, privacy, governance, and reliability. A model capable of understanding live camera feeds at scale can be enormously useful — but also deeply invasive if deployed irresponsibly.
And like all AI systems, the real test won’t be demos or benchmarks.
It will be deployment in messy real-world environments.
For now, though, Perceptron has accomplished something important:
It forced the industry to pay attention.
Because if a relatively small startup can deliver frontier-level video reasoning at radically lower cost, it signals something much bigger than one product launch.
It signals that the next major AI war may not be fought over chatbots at all.
It may be fought over who can teach machines to understand the real world fastest and cheapest.