“I watched the whole nightmare through binoculars — it was a truly terrifying sight.”
This quote is from Roman, a Russian citizen overlooking Belaya air base. He is describing the swarms of Ukrainian drones that bombed Russian military assets over the weekend. The Ukrainians smuggled cheap drones costing under $10,000, used ten-year-old open-source software, slapped on some AI models, and launched them from semi-trucks directly at Russian aerial power. It is estimated that they were able to destroy roughly 41 bombers—severely damaging Russia’s long-range nuclear arsenal.
A live view from a drone bombing the plane. Image from X.
In The Leverage, I’ll frequently talk about how “AI makes intelligence cheap.” I recognize that is fluffy and abstract, so let me be clear—cheap intelligence means cheap, targeted explosions. The drones used an onboard vision model to target aircraft without human operators or GPS.
War has a way of making hazy ideas into cold reality. Giving machines the ability to “see” with vision models and “think” with reasoning models like those utilized by ChatGPT means computation can happen anywhere. We can now stick eyes and a brain in places no human could go, at marginal cost. It’s the surveillance state mechanized at the scale of Nvidia’s GPUs. This change has been theorized for years, and now, finally, it is here. It is a trite thing to say in technology, but this sincerely changes everything.
In the next 5 years, computer vision will become as important as LLMs are today. Today, I’d like to cover:
1) Computer vision system component costs have dropped 90% in the last 10 years. Hardware is cheap, models are open-source, and every smartphone user has a good enough camera to run vision models.
2) Five markets are rigggght on the brink of explosive growth.
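To put the 90% figure in yearly terms, here is a quick back-of-the-envelope sketch. It assumes the decline was spread evenly across the decade, which is a simplification on my part and not something the underlying data guarantees. The helper name `annual_rate` is made up for illustration.

```python
def annual_rate(total_change: float, years: float) -> float:
    """Constant yearly rate that compounds to `total_change` over `years`.

    `total_change` is fractional: -0.90 means a 90% drop, 2.0 means tripling.
    """
    return (1 + total_change) ** (1 / years) - 1


if __name__ == "__main__":
    # Assumption: the 90% drop happened at a steady pace over 10 years.
    rate = annual_rate(-0.90, 10)
    print(f"Implied yearly price change: {rate:.1%}")
```

Under that even-pace assumption, a 90% drop over ten years works out to prices falling by roughly a fifth every single year.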
Everything is cheaper
Every computer vision system consists of three components: input hardware, the vision model, and the output system where some action is triggered by the results of the vision model. So in the attack on the Russian air bases, that would be the drone’s camera, the onboard vision model, and the drone’s frame and payload.
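For the technically inclined, the three-part structure can be sketched in a few lines of Python. This is a toy illustration, not how any real system is built: the "camera" is a list of tiny grayscale grids, the "model" is a made-up brightness check standing in for a real vision model, and the "output" is just a callback. Every name here (`Detection`, `run_pipeline`, `brightness_model`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Input hardware stand-in: a grayscale frame as rows of 0-255 pixel values.
Frame = List[List[int]]


@dataclass
class Detection:
    label: str
    confidence: float  # 0.0 to 1.0


def brightness_model(frame: Frame) -> Detection:
    """Toy 'vision model': confidence is the mean pixel brightness.

    A real system would run a trained neural network here.
    """
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return Detection(label="bright_object", confidence=mean / 255)


def run_pipeline(
    frames: Iterable[Frame],
    model: Callable[[Frame], Detection],
    act: Callable[[Detection], None],
    threshold: float = 0.5,
) -> int:
    """Feed each frame to the model and trigger the output on confident hits.

    Returns how many times the output action fired.
    """
    triggered = 0
    for frame in frames:
        detection = model(frame)
        if detection.confidence >= threshold:
            act(detection)  # output system: alert, log, actuator, etc.
            triggered += 1
    return triggered


if __name__ == "__main__":
    dark = [[0, 0], [0, 0]]
    bright = [[255, 255], [255, 255]]
    hits = run_pipeline([dark, bright], brightness_model, print)
    print(f"Actions triggered: {hits}")
```

The point of the sketch is that each piece is swappable. A cheaper camera, a better model, or a different action on the other end can each be changed without touching the other two, which is why falling prices in any one component ripple through the whole system.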
It’s remarkable how cheap input hardware has gotten. When I worked on a computer vision project in 2013, cameras and chips were too expensive. Now, HUGE reductions in cost and improved picture quality have changed that.
For a proxy, here is a dataset from the EU on consumer cameras and video equipment costs. Prices dropped steeply until Covid, when tariffs and supply chain shocks kept prices steady.
Much of this cost reduction is due to smartphone supply chains. SE Asia is home to thousands of factories, all devoted to making ever tinier electronics for iPhones. These components can be repurposed for other electronics. We’ve seen a similar reduction in semiconductor prices.
Here, this price reduction is accompanied by Moore’s Law, meaning these chips got cheaper and more powerful.
Mix these two together and you get a drone that you can control with your phone, that runs AI models on-device for a couple hundred bucks. At home, you can get a Ring camera (that outsources its computer vision to the cloud) for $85.
But it isn’t just input devices! The models have gotten dramatically better. ImageNet, the famous test for image recognition, has been effectively solved with results above 90% accuracy since 2022.
Now that the techniques are refined, research labs are focusing on scaling: the number of large vision models has tripled every year since 2022.
The same scaling laws that have powered the LLM revolution apply to vision models—the bigger the model, the better the performance. If there isn’t data, companies like Scale ($25 billion valuation) have an army of Filipinos to label data sets for model training.
Exponential decreases in cost and steady gains in performance make it stupid easy to run a vision model on any camera. This opens up a massive opportunity to change the way we work and interact with the world. For paying subscribers, here are five of the markets I’m excited about and interesting startups serving them.
Where millions lie in wait