Tag: Multimodal Models (10 articles)

Gemma 4 VLA Demo on Jetson Orin Nano Super

An end-to-end multimodal agent demo running on the NVIDIA Jetson Orin Nano Super. The model decides on its own when to use the camera and answers questions with visual context, a sign that powerful AI capabilities are reaching edge devices.

Hugging Face Blog · Apr 22, 2026

Training and Finetuning Multimodal Embedding & Reranker Models with Sentence Transformers

Hugging Face releases a new tutorial showing that a multimodal embedding model fine-tuned for a specific domain (such as visual document retrieval) can far outperform general-purpose large models, including models with 4x as many parameters.

Hugging Face Blog · Apr 16, 2026

Multimodal Embedding & Reranker Models with Sentence Transformers

Gemma 4: Byte for byte, the most capable open models

Welcome Gemma 4: Frontier multimodal intelligence on device

Granite 4.0 3B Vision: Compact Multimodal Intelligence for Enterprise Documents

Holotron-12B - High Throughput Computer Use Agent

Parsing the Unreadable: How LlamaParse Handles Legal Discovery Documents

Run Highly Efficient Multimodal Agentic AI with NVIDIA Nemotron 3 Nano Omni Using vLLM

Welcome NVIDIA Cosmos 3: The First Open Omni-model for Physical AI Reasoning and Action