Codex CLI 0.128.0 adds /goal

Why It Matters: Why a CLI Tool Update is Worth Discussing

On the surface, this seems like a minor version bump for a command-line tool. But if you're tracking the engineering of AI agents, this is a significant signal. OpenAI's Codex CLI, a tool that lets developers use AI coding capabilities from the terminal, has added a /goal command in version 0.128.0. This isn't just a new parameter; it introduces a new paradigm for making AI agents "work persistently until completion." As AI applications shift from "chatting" to "doing," this ability for an agent to autonomously loop and drive a task forward represents the cutting edge of current engineering exploration.

Deconstruction: What Exactly Does /goal Do?

In simple terms, previously, if you asked Codex to write a piece of code, it would finish and stop. Now, you can use a command like /goal "fix all unit test failures" to give it an objective. After that, the agent enters a loop: it performs an action (e.g., modifying a piece of code), then automatically assesses whether the current state is closer to the goal. If the goal isn't met, it decides what to do next and continues executing. This loop persists until the agent itself determines the goal is achieved, or your configured "token budget" is exhausted.

The core implementation is clever: it primarily works by automatically injecting two prompt templates (continuation.md and budget_limit.md) into the context at the end of each conversational turn. This is fundamentally an act of prompt engineering: using meticulously designed instructions to "teach" the model how to be self-driven.

Trend Insight: The Evolution of Agents from "Tools" to "Collaborators"

This feature reveals a deeper trend: AI agents are evolving from passive "instruction-execution tools" to active "task collaborators."

1. Persistent Task State: Traditional AI interaction is "stateless": a single question and answer.
The /goal command introduces a persistent "task objective," giving the agent a "mission" and an endpoint.

2. Self-Assessment and Planning: The agent must constantly evaluate "How far am I from the goal?" and plan the next step. This is a crucial step toward true autonomy, similar to how humans continuously reflect and adjust when working on complex projects.

3. Resource Awareness: The inclusion of a token budget means agents are being designed to work within resource constraints, aligning more closely with real-world software development scenarios.

The "Ralph loop" mentioned by Simon Willison (a concept popularized by developer Geoffrey Huntley) is the name for exactly this kind of persistent execution loop. Now that mainstream tools are building in this pattern, it indicates the concept is moving from experimental idea to standardized functionality.

Practical Value: What Does This Mean for Developers?

For developers using AI-assisted programming, this means you can delegate more complex, ambiguous, and long-running tasks to the AI. For instance, instead of telling the AI step-by-step to "split this function" or "add an interface to that class," you can set a goal like "refactor this module for better testability." The agent will then explore the codebase, attempt modifications, run tests, and iterate on its own. This can significantly boost efficiency when dealing with messy, sprawling tasks. Of course, it also sets new requirements for developers: you need to learn how to set clear, evaluable "goals" and configure budgets reasonably to prevent the AI from falling into meaningless loops or incurring excessive costs.

Counterintuitive/Unexpected: A Simple Feature with Complex Trade-offs

An easily overlooked point is that this feature is largely driven by prompts, not by complex underlying architectural changes.
This demonstrates that at the current stage, sophisticated agent behaviors can be achieved on existing models through clever prompt engineering. However, this also raises concerns: automatic loops could lead to runaway costs. Therefore, the "budget limit" is not an optional extra but a core safety valve. This reminds us that while embracing the automation capabilities of agents, cost control and task scoping will become new core skills for developers. In the future, we will likely see more best practices emerge on "how to set effective goals for AI agents."
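
To make the loop-plus-budget idea concrete, here is a minimal Python sketch of the pattern described above. It is not Codex CLI's actual implementation: the prompt strings are hypothetical stand-ins for continuation.md and budget_limit.md (whose real contents are not reproduced here), and run_goal, TurnResult, and make_toy_agent are names invented for this illustration. The "agent" is a toy function that pretends to fix one failing test per turn, so the example runs without any model or API.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical stand-ins for the two injected templates (assumed wording).
CONTINUATION_PROMPT = (
    "Goal: {goal}\n"
    "If the goal is not met yet, take the next step. "
    "If it is met, say so."
)
BUDGET_LIMIT_PROMPT = (
    "The token budget is exhausted. Stop and summarize progress toward: {goal}"
)


@dataclass
class TurnResult:
    text: str         # what the agent reported this turn
    tokens_used: int  # tokens consumed by this turn
    goal_met: bool    # the agent's own assessment of the goal


# An agent is modeled as any callable that takes a prompt and returns a TurnResult.
Agent = Callable[[str], TurnResult]


def run_goal(agent: Agent, goal: str, token_budget: int) -> Tuple[str, int, List[str]]:
    """Loop until the agent reports the goal as met or the budget runs out."""
    spent = 0
    log: List[str] = []
    while True:
        # The budget acts as the safety valve: check it before every turn.
        if spent >= token_budget:
            log.append(BUDGET_LIMIT_PROMPT.format(goal=goal))
            return "budget_exhausted", spent, log
        # Re-inject the goal at each turn so the agent keeps driving forward.
        result = agent(CONTINUATION_PROMPT.format(goal=goal))
        spent += result.tokens_used
        log.append(result.text)
        if result.goal_met:
            return "goal_met", spent, log


def make_toy_agent(failing_tests: int, tokens_per_turn: int = 100) -> Agent:
    """Toy agent (illustration only): 'fixes' one failing test per turn."""
    remaining = failing_tests

    def agent(prompt: str) -> TurnResult:
        nonlocal remaining
        if remaining > 0:
            remaining -= 1
        return TurnResult(
            text=f"{remaining} failing tests left",
            tokens_used=tokens_per_turn,
            goal_met=(remaining == 0),
        )

    return agent


if __name__ == "__main__":
    goal = "fix all unit test failures"
    print(run_goal(make_toy_agent(3), goal, token_budget=1000)[:2])
    print(run_goal(make_toy_agent(3), goal, token_budget=200)[:2])
```

With a generous budget the toy agent finishes in three turns and the loop ends on its own assessment; with a budget of 200 tokens the loop is cut off after two turns. The second case is the one that matters for cost control: the loop never relies on the agent to decide when to stop spending.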