DeepSeek V4 - almost on the frontier, a fraction of the price

Why are we talking about DeepSeek V4 now?

Less than six months after V3.2, DeepSeek has returned with the V4 series, a clear signal that the iteration speed of open-source models is accelerating. More important, it dropped a bombshell on price. While frontier models like GPT-5.5 and Claude Opus 4.7 charge $25-30 per million output tokens, DeepSeek V4 Pro's output costs just $3.48, and Flash a mere $0.28. This isn't just "slightly cheaper": Pro is roughly 7 to 9 times cheaper and Flash roughly 90 to 107 times cheaper, an order-of-magnitude gap or more that forces the entire industry to rethink the cost structure of "paying for intelligence."

What exactly is it, and how capable is it?

Architecturally, the V4 series follows a "brute force" approach. Both Pro and Flash are Mixture-of-Experts (MoE) models with massive total parameter counts (Pro reaches 1.6 trillion), but only a subset is activated during inference: 49 billion for Pro, or about 3% of its total, and 13 billion for Flash. This sparse activation is key to maintaining high performance while controlling inference costs. Both models support an ultra-long context of 1 million tokens and use the permissive MIT open-source license, meaning anyone can freely use, modify, or commercially exploit them, which lays the foundation for ecosystem growth.

In Simon Willison's hands-on test (generating an SVG of a "pelican riding a bicycle"), V4 Flash performed quite well, with a reasonably structured bicycle and a recognizable pelican. Ironically, the more expensive V4 Pro produced a pelican with significant distortions (an oversized body, only one wing). It is a single example, but it suggests that a higher price does not guarantee better output: the pricier Pro isn't always superior to the cheaper Flash at following complex instructions or generating fine details. For developers, the lesson is to run cost-performance tests on their own tasks rather than blindly choosing the most expensive model.

Trend Insights: What does this reveal?
The release of DeepSeek V4 signals that high cost-performance open-source models are becoming a distinct and powerful segment of the market. It reveals several deeper trends:

First, the MoE architecture has become an industry standard for balancing performance and cost. Whether it's open-source DeepSeek and Mixtral or closed-source Gemini, all are adopting MoE. This suggests that future competition among large models will increasingly focus on how to design more efficient expert routing and activation strategies.

Second, the "good enough" strategy of open-source models is eroding the moats of closed-source models. For most enterprise applications (e.g., customer service, document processing, code assistance), V4 Flash's performance is likely more than sufficient, and its cost advantage is overwhelming. This will force companies like OpenAI and Anthropic to compete not only on the absolute ceiling of intelligence but also on cost-effectiveness.

Third, Chinese AI labs are demonstrating strong engineering and cost-control capabilities in the open-source model arena. With successive competitive open-source releases, DeepSeek, Zhipu (GLM), and Moonshot (Kimi) are securing increasingly important positions in the global open-source ecosystem. This is not just a technology race but also a competition for ecosystems and standards.

Practical Value: What does this mean for me?

For developers and enterprise tech decision-makers, the arrival of V4 means:

1. Re-evaluate your AI cost model. If your applications rely heavily on GPT-4-class models, migrating to V4 Pro or Flash could cut output-token costs by roughly an order of magnitude. It's time to do the math.
2. Open-source models become a serious alternative. The MIT license keeps licensing risk low. You can deploy the model on-premises, with full control over data privacy and costs. This is crucial for sensitive industries like finance and healthcare.
3. A "hybrid calling" strategy becomes more attractive.
You can use Flash for simple, high-concurrency requests, Pro for more complex tasks, and even call top-tier closed-source models for critical steps. This flexible architecture maximizes cost-effectiveness.
4. Pay attention to its ecosystem and toolchain. DeepSeek's API pricing is already low, and because the model weights are open, community tools and cloud services for fine-tuning, quantization, and deployment will likely emerge, potentially lowering your total cost of ownership further.

Counterintuitive/Unexpected Points

One point that is easy to overlook: is such low pricing sustainable? It might be an aggressive strategy by DeepSeek to quickly capture market share and developer mindshare. While enjoying the low prices, users should also consider the supplier's long-term stability.

Another surprise is that, as the SVG test showed, the most expensive model isn't always the best. This challenges the simple "you get what you pay for" notion and underlines the importance of A/B testing and evaluation in real-world scenarios.
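
To make the "do the math" and "hybrid calling" points concrete, here is a minimal Python sketch that compares the output-token cost of sending everything to a frontier model against routing by task complexity. Only the three output prices come from the figures quoted in this article (the frontier price uses the low end of the $25-30 range). The tier names, the `complexity` labels, the routing table, and the workload mix are hypothetical and purely illustrative. Input-token prices are left out because the article does not quote them.

```python
from dataclasses import dataclass

# Output prices in USD per million tokens, as quoted in the article.
# "frontier" uses the low end of the quoted $25-30 range (an assumption).
OUTPUT_PRICE_PER_MTOK = {
    "frontier": 25.00,
    "v4-pro": 3.48,
    "v4-flash": 0.28,
}


def output_cost(tier: str, output_tokens: int) -> float:
    """Return the output-token cost in USD for the given tier."""
    return OUTPUT_PRICE_PER_MTOK[tier] * output_tokens / 1_000_000


@dataclass(frozen=True)
class Request:
    output_tokens: int
    complexity: str  # hypothetical label: "simple", "complex", or "critical"


# Hypothetical routing policy: the cheapest tier assumed adequate per label.
ROUTES = {"simple": "v4-flash", "complex": "v4-pro", "critical": "frontier"}


def route(request: Request) -> str:
    """Pick a tier for one request based on its complexity label."""
    return ROUTES[request.complexity]


def workload_cost(requests, force_tier=None) -> float:
    """Total output cost, routed per request or forced onto a single tier."""
    return sum(
        output_cost(force_tier or route(r), r.output_tokens) for r in requests
    )


if __name__ == "__main__":
    # Made-up workload: mostly simple requests, a few critical ones.
    workload = (
        [Request(500, "simple")] * 8000
        + [Request(2000, "complex")] * 1500
        + [Request(4000, "critical")] * 500
    )
    print(f"all frontier: ${workload_cost(workload, 'frontier'):.2f}")
    print(f"hybrid:       ${workload_cost(workload):.2f}")
```

With this made-up mix of 9 million output tokens, the all-frontier bill comes to $225.00 and the hybrid bill to $61.56, most of which is still the frontier share. The real saving depends on how many requests can safely be routed down a tier, which is exactly what task-specific A/B testing is meant to establish.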