sqlite AGENTS.md
The SQLite project's AGENTS.md file explicitly rejects AI-generated code while creating a dedicated channel for AI-reported bugs, revealing a pragmatic strategy for open-source communities to handle the AI wave.
The rise of AI-assisted security research is putting unprecedented pressure on foundational open-source projects like curl with a flood of high-quality vulnerability reports, revealing the double-edged sword of AI in security.
A critical security flaw in Microsoft Copilot Cowork allows attackers to use prompt injection to trick the AI agent into exfiltrating sensitive files like OneDrive data using the user's own permissions.
The UK's NHS closed its open-source repositories due to security vulnerabilities, prompting a public rebuke from the Government Digital Service and sparking a deeper debate on open-source strategy in the AI era.
Mozilla leveraged the Claude Mythos preview and advanced harnessing techniques to find and fix 423 Firefox security vulnerabilities in one month (a 20x increase over their average), marking a qualitative shift in AI security auditing from noise generation to high-value signal production.
The UK's AI Security Institute found that GPT-5.5's capabilities for finding vulnerabilities are comparable to those of the leading Claude Mythos model, but its general availability marks a new phase in AI-driven cybersecurity offense and defense.
Mozilla's CTO reports that using Anthropic's Claude AI, Firefox identified and fixed 271 vulnerabilities in an assessment, marking a shift where AI moves from an 'assistant' to a 'lead' role in security defense.
The system prompt update for Claude Opus 4.7 reveals how AI assistants are evolving from passive responders into proactive tool users and deep task executors, governed by more responsible safety frameworks.
OpenAI launches GPT-5.4-Cyber, a model fine-tuned for defensive cybersecurity, and its "Trusted Access" program, signaling that leading AI companies are making cybersecurity a key battleground while seeking a new balance between safety and openness.
AI security reviews reveal that system security is evolving into an economic game: defenders must spend more computational resources (tokens) than attackers to ensure safety, which unexpectedly boosts the value of open-source projects.
Reward hacking, which arises from flaws in reward functions, poses challenges in reinforcement learning, particularly for language models, and calls for further research and mitigation strategies.
This article explores the phenomenon of extrinsic hallucinations in large language models, analyzing their causes and detection methods, and proposes effective strategies to reduce hallucinations while emphasizing the risks of knowledge updates.
This article explores adversarial attacks on large language models (LLMs), including types of attacks, threat models, and their impact on the safety of generated text, revealing significant challenges in AI safety.
Anthropic's 'Project Glasswing', leveraging its latest AI model Mythos Preview, has helped partners discover over ten thousand high or critical-severity vulnerabilities in one month, shifting the core bottleneck in software security from 'finding vulnerabilities' to 'verifying and patching them'.