Tag: AI Research (5 articles)

Introducing talkie: a 13B vintage language model from 1930

A 13B-parameter model trained exclusively on pre-1931 text is used to explore AI's reasoning, creativity, and 're-discovery' abilities within fixed knowledge boundaries, sparking new discussions on data copyright and model purity.

Simon Willison · Apr 28, 2026

Deep Neural Nets: 33 years ago and 33 years from now

Andrej Karpathy reproduces LeCun et al.'s 1989 paper on handwritten zip code recognition in PyTorch, using it to examine how deep learning has progressed over 33 years.

karpathy.github.io · Apr 5, 2026

Gemma 4: Byte for byte, the most capable open models

Google DeepMind's Gemma 4 models improve parameter efficiency and support multimodal inputs, marking a significant advance in research on small yet effective models.

Simon Willison · Apr 3, 2026

microgpt

Andrej Karpathy's microgpt project shows how to implement a simplified GPT model from scratch in just 200 lines of Python, reflecting a trend towards minimalism in AI development.

Andrej Karpathy · Feb 12, 2026

Adversarial Attacks on LLMs

This article explores adversarial attacks on large language models (LLMs), covering attack types, threat models, and the impact of such attacks on the safety of generated text, and highlights significant open challenges in AI safety.

Lilian Weng · Oct 25, 2023