Vision-Language Models Tutorial

How AI Is Transforming Localization for Live Content

AI/ML made swift subtitling de rigueur on YouTube and for videoconferencing several years ago. But now LLMs are changing the game for real-time translation and localization and language-mapped ...

Nvidia's Alpamayo AI platform for autonomous cars will debut in the new Mercedes-Benz CLA

Nvidia describes Alpamayo as an open portfolio of reasoning vision-language-action (VLA) models, simulation tools, and datasets designed to power robots, industrial automation, and Level 4 autonomous ...

16h

Nvidia’s Cosmos Reason 2 aims to bring reasoning VLMs into the physical world

Nvidia's roadmap plans to bring agentic AI from the digital space to the physical world with the release of new physical ...

1don MSN

Nvidia launches Alpamayo, open AI models that allow autonomous vehicles to think like a human

Nvidia unveiled Alpamayo at CES 2026, which includes a reasoning vision language action model that allows an autonomous ...

IEEE

Vision-Language Models for 3D Scene Understanding: Applications and Developments

Abstract: This article focuses on the applications and advances of Visual Language Modeling (VLM) in 3D scene understanding. The article details several mainstream visual language models and analyzes ...

IEEE

LightSTATE: A Generalized Framework for Real-Time Human Activity Detection Using Edge-Based Video Processing and Vision Language Models

Abstract: Human activity detection plays a vital role in applications such as healthcare monitoring, smart environments, and security surveillance. However, traditional methods often rely on ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results