Deep Neural Nets: 33 years ago and 33 years from now
Karpathy reproduces LeCun's 1989 handwritten zip code recognition paper in PyTorch, revealing the nature of progress in deep learning over 33 years.
Gemma 4 introduces enhanced multimodal capabilities, supporting image, text, and audio inputs, significantly improving model intelligence and deployment flexibility across devices.
Ulysses Sequence Parallelism addresses the challenges of training large language models with long sequences, significantly enhancing the capability to process million-token contexts.
The application of diffusion models in video generation reveals challenges in temporal consistency and data requirements.
High-quality human data is crucial for training modern deep learning models; this article explores the factors that influence data quality and methods for improving it.
This article explores adversarial attacks on large language models (LLMs), including types of attacks, threat models, and their impact on the safety of generated text, revealing significant challenges in AI safety.
Karpathy reproduces LeCun's 1989 paper, showing how deep learning has evolved since then and pointing to potential future directions.