Simon Willison · Advanced
Toolchain · ANALYSIS · IMPACT 7/10

Have your agent record video demos of its work with shot-scraper video

Simon Willison introduces shot-scraper video, a command that lets AI agents record web application demos via YAML scripts, signaling a shift in AI development toolchains from 'generating code' to 'generating verifiable deliverables.'

KEY POINTS
  • The new shot-scraper video command uses YAML scripts to define action sequences and automatically records demo videos of web application interactions.
  • The core value of this tool is enabling AI agents to autonomously generate visual, verifiable demonstrations of their work.
  • The example YAML storyboard was entirely generated by GPT-5.5, showcasing AI's ability to use its own toolchain in a closed loop.
  • This reveals a trend: developer tools are embedding 'self-documentation' and 'self-demo' capabilities, becoming 'skill modules' for AI agents.

ANALYSIS

Simon Willison recently released version 1.10 of his shot-scraper tool, which adds a new command, shot-scraper video. The command reads a YAML storyboard file, drives a browser through the steps it defines, and records the whole session as a video. On the surface this is a simple automated-recording tool, but the context he gives for it, 'having coding agents produce demos of their work,' sends a strong signal: the AI agent workflow is extending from 'writing code' to 'delivering verifiable product demos.'
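
To make the idea of a declarative storyboard concrete, here is a minimal sketch in Python. The step names and fields (`open`, `click`, `wait_for_text`, `type`, `selector`, and so on) are invented for illustration and are not the real shot-scraper video YAML schema; the sketch only shows how a list of declarative steps can be walked in order and turned into actions.

```python
# Hypothetical storyboard: these step names and fields are made up for this
# example and are NOT the actual shot-scraper video YAML format.
STORYBOARD = [
    {"action": "open", "url": "http://localhost:8000/"},
    {"action": "click", "selector": "#new-item"},
    {"action": "wait_for_text", "text": "Item created"},
    {"action": "type", "selector": "#title", "text": "Hello"},
]


def describe(step):
    """Turn one declarative step into a human-readable instruction."""
    action = step["action"]
    if action == "open":
        return f"open {step['url']}"
    if action == "click":
        return f"click {step['selector']}"
    if action == "wait_for_text":
        return f"wait until the page shows {step['text']!r}"
    if action == "type":
        return f"type {step['text']!r} into {step['selector']}"
    raise ValueError(f"unknown action: {action!r}")


def plan(storyboard):
    """Walk the storyboard in order and describe every step."""
    return [describe(step) for step in storyboard]


for line in plan(STORYBOARD):
    print(line)
```

A real runner would replace `describe` with calls into a browser automation library; the point here is only that the storyboard stays plain data that a person or an agent can write and review.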

The Cause: Why do AI agents need to 'record videos'?

In AI-assisted programming, and agent-based programming in particular, one pain point keeps recurring: once an AI has written the code, how can it show that the feature actually works and is usable? Unit tests verify logic, but they cannot show how a feature is used or what the experience is like. Users and teams need a demo. Recording one by hand is slow, and shot-scraper video aims to let the AI agent do it itself.

Deconstruction: What does it change?

The key innovation is not 'using Playwright for screen recording' (Playwright can already do that) but packaging the whole process into an interface that is easy for AI agents to use:

  1. YAML Storyboards as 'Executable Requirement Documents': It uses declarative YAML to define interaction scenarios (e.g., 'click button A, wait for text B to appear, enter content C'). This is closer to how humans describe requirements than raw code scripts, and simultaneously becomes a precise, machine-readable 'test case' or 'demo plan.'
  2. Command-Line Help as a Skill File: Simon specifically notes that an AI agent (like GPT-5.5) can learn how to use the tool just by reading the output of shot-scraper video --help. This is essentially embedding the tool's 'user manual' (like a SKILL.md) directly into the --help output, creating a powerful self-descriptive pattern. It means tool developers only need to write good --help text for AI agents to use their tools out-of-the-box.
  3. A Closed-Loop Agent Workflow: In the example, a GPT-5.5 agent reads the changes in a code branch, then autonomously invokes the shot-scraper video command, writes the YAML storyboard, starts the development server, and finally generates the demo video. This completes the full loop of 'understand requirements -> write code -> verify functionality -> generate demo,' transforming the agent's role from a 'coder' to a 'full-stack engineer and product demo specialist.'
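
The 'help text as skill file' pattern from point 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular agent works: `read_help` and `build_prompt` are hypothetical helper names, and the demonstration runs against the Python interpreter's own --help so that it works without shot-scraper installed.

```python
import subprocess
import sys


def read_help(argv, timeout=30):
    """Run the given command with --help appended and return what it prints."""
    result = subprocess.run(
        [*argv, "--help"],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )
    # Some tools print help to stderr, so keep both streams.
    return result.stdout + result.stderr


def build_prompt(tool, help_text, task):
    """Combine the tool's help text and a task into one prompt string."""
    return (
        f"You can use the command-line tool `{tool}`.\n"
        f"Its --help output is:\n\n{help_text}\n\n"
        f"Task: {task}\n"
    )


# Demonstrated against the Python interpreter itself; passing
# ["shot-scraper", "video"] instead would fetch the help text discussed above.
help_text = read_help([sys.executable])
prompt = build_prompt("python", help_text, "Find the flag that runs a module as a script.")
print(prompt[:300])
```

The agent never needs a separate manual: whatever the tool author wrote in --help becomes the context the agent works from.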

Trend Insight: The 'Agenticization' and 'Demoization' of Dev Tools

This release points to a deeper trend: the core interface of future developer tools will no longer be GUIs or lengthy docs designed for human programmers, but 'skill descriptions' designed for AI agents and reachable through simple command-line interactions. shot-scraper video is just the tip of the iceberg; Simon mentions other tools, such as showboat and rodney, that follow the same pattern. Tools are becoming self-describing: they know how to 'introduce themselves' to AI agents. Meanwhile, the range of tool outputs is expanding from 'code' and 'logs' to richer verification materials, such as 'video demos' and 'interactive reports,' aimed at end users or decision-makers. This will make AI-driven development more transparent and trustworthy, and better able to demonstrate its business value.

Practical Value & Takeaways for Developers

For our audience of AI practitioners and developers, this means:

  1. A New Dimension for Evaluating AI Toolchains: When choosing or designing tools, consider whether they have the potential for 'self-demonstration' and whether they provide sufficiently clear --help or API documentation for AI agents to understand. This will become a key metric for a tool's 'integrability.'
  2. Optimize Your Agent Workflows: If you're building an AI coding agent, consider incorporating 'demo generation' as a standard output step. This not only better verifies the agent's work, but the generated demo video itself becomes a highly persuasive report or product document.
  3. Learn this 'Self-Descriptive' Design Philosophy: Whether writing CLI tools or designing APIs, think about how to make documentation and help information more structured and machine-parseable. This can greatly enhance the longevity and utility of the tools you develop within the AI ecosystem.
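
As one way to apply the third takeaway, the sketch below uses Python's standard argparse module to put a worked example and a format note directly into a tool's --help output. The tool name `demo-tool`, its arguments, and the storyboard format it describes are all hypothetical and are not taken from shot-scraper.

```python
import argparse


def build_parser():
    """Build a parser whose --help doubles as a short manual for agents."""
    parser = argparse.ArgumentParser(
        prog="demo-tool",  # hypothetical tool name
        description="Record a short demo of a web page from a storyboard file.",
        epilog=(
            "example:\n"
            "  demo-tool storyboard.yml --output demo.mp4\n"
            "\n"
            "storyboard format:\n"
            "  a YAML list of steps, each with an 'action' key\n"
        ),
        # Keep the line breaks in description and epilog exactly as written.
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument("storyboard", help="path to the storyboard file")
    parser.add_argument(
        "--output",
        default="demo.mp4",
        help="where to write the video (default: demo.mp4)",
    )
    return parser


print(build_parser().format_help())
```

Putting a copy-pasteable example and the input format in the epilog gives an agent what it needs in a single --help call, with no extra documentation lookup.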

A Counter-Intuitive Perspective

Many people might dismiss 'AI agents recording videos' as a flashy gimmick. But the problem Simon is really addressing is 'black box trust' in AI-generated code. A video demo is a more intuitive 'certificate of trust' than a test report, and it is closer to the form of the final product. It turns the AI's work from 'invisible logic' into 'visible experience,' which matters for teams that want to adopt AI coding and bring AI agents into serious software development processes. This is more than automation: it brings explainability and deliverability into the development toolchain itself.

Originally from Simon Willison · Analyzed by BitByAI