Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7

Simon Willison · Toolchain · Beginner · Impact: 7/10

In Simon Willison's well-known 'pelican riding a bicycle' benchmark, a smaller, locally run Alibaba Qwen3.6 model outperformed the much larger, cloud-based Claude Opus 4.7 at creative SVG generation, pointing to the potential of open-source models for specific tasks.

Key Points

  • Simon Willison's 'pelican riding a bicycle' is a popular, informal test for AI models' visual understanding and generation capabilities.
  • A locally-run, 20.9GB quantized Qwen3.6-35B-A3B model on a MacBook outperformed Anthropic's latest cloud-based giant, Claude Opus 4.7, in generating an SVG of a pelican on a bicycle.
  • In a follow-up 'flamingo riding a unicycle' test, the Qwen model again showed superior creativity and detail (e.g. adding sunglasses, a bowtie), while Opus's output was comparatively bland.
  • This result challenges the assumption that 'bigger models and the cloud are always stronger', highlighting the competitiveness of open-source, locally-deployable models on specific creative tasks.

Analysis

The Origin: Why Does a "Silly" Test Spark Discussion Again?

Analysis generated by BitByAI · Read original English article

Originally from Simon Willison
