Multimodal Vision Module for Robotics Platforms

STMicroelectronics and Leopard Imaging introduce a Jetson-compatible module combining imaging, depth sensing, and motion perception to support robotics vision and physical AI development.

STMicroelectronics and Leopard Imaging have released a multimodal vision module designed for humanoid and advanced robotics, integrating imaging, depth sensing, and motion tracking within a compact, power-constrained architecture suitable for edge AI systems.

Integrated Vision for Robotics Systems
The module addresses a key requirement in robotics vision: the ability to combine multiple sensing modalities into a unified, synchronized data pipeline. By integrating 2D imaging, 3D depth sensing, and inertial motion tracking, the system enables robots to perceive and interpret their environment with greater contextual awareness.
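
To make the idea of a unified pipeline concrete, the sketch below shows one way synchronized samples from the three modalities might be bundled on the host. The field names, array shapes, and nearest-timestamp pairing are illustrative assumptions, not the module's actual data format.

```python
from dataclasses import dataclass

import numpy as np

@dataclass
class MultimodalFrame:
    """One synchronized capture across the three modalities.
    Field names and array shapes are assumptions for illustration."""
    timestamp_ns: int        # shared capture timestamp, nanoseconds
    rgb_ir: np.ndarray       # 2D image from the RGB-IR sensor
    depth_zones: np.ndarray  # (42, 54) per-zone distances, metres
    accel: np.ndarray        # (3,) accelerometer sample, m/s^2
    gyro: np.ndarray         # (3,) gyroscope sample, rad/s

def nearest(samples: list, t_ns: int) -> "MultimodalFrame":
    """Naive synchronization: pick the sample closest in time to t_ns."""
    return min(samples, key=lambda s: abs(s.timestamp_ns - t_ns))
```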

This approach supports applications in manufacturing, automotive production, logistics, warehousing, and service robotics, where machines must operate in dynamic, unstructured environments. The design aligns with a broader shift across automotive and industrial robotics, where sensor fusion is critical for real-time decision-making.

The module is built to meet size, weight, and power (SWaP) constraints typical of humanoid robots, where distributed sensing must coexist with onboard processing and actuation.

Native Integration with NVIDIA Robotics Ecosystem

A central feature of the module is its compatibility with NVIDIA’s robotics stack. The integration of the NVIDIA Holoscan Sensor Bridge enables multi-gigabit data transfer over Ethernet, allowing real-time ingestion of high-bandwidth sensor data into NVIDIA Jetson platforms.
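
As a rough illustration of how such data might flow into an application on Jetson, here is a minimal Holoscan pipeline skeleton in Python. The SensorBridgeSourceOp is a hypothetical stand-in that emits synthetic frames; the actual Sensor Bridge ships its own operators, and only the core Application/Operator pattern below reflects the Holoscan SDK.

```python
import numpy as np
from holoscan.conditions import CountCondition
from holoscan.core import Application, Operator, OperatorSpec

class SensorBridgeSourceOp(Operator):
    """Hypothetical stand-in for a Sensor Bridge source operator;
    it emits synthetic 42 x 54 depth frames instead of real sensor data."""
    def setup(self, spec: OperatorSpec):
        spec.output("out")

    def compute(self, op_input, op_output, context):
        op_output.emit(np.random.rand(42, 54).astype(np.float32), "out")

class PerceptionSinkOp(Operator):
    """Downstream consumer; a real pipeline would run inference here."""
    def setup(self, spec: OperatorSpec):
        spec.input("in")

    def compute(self, op_input, op_output, context):
        frame = op_input.receive("in")
        print(f"received depth frame with shape {frame.shape}")

class VisionPipeline(Application):
    def compose(self):
        src = SensorBridgeSourceOp(self, CountCondition(self, count=10),
                                   name="bridge")
        sink = PerceptionSinkOp(self, name="sink")
        self.add_flow(src, sink, {("out", "in")})

if __name__ == "__main__":
    VisionPipeline().run()
```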

This architecture simplifies system design by providing plug-and-play connectivity and reducing the need for custom sensor interfacing. The module is also fully supported by the NVIDIA Isaac platform, which includes AI models, simulation tools, and development libraries.

The availability of APIs, build systems, and pre-configured AI algorithms allows developers to accelerate deployment cycles. Simulation tools, including domain randomization and sensor modeling, are designed to reduce the gap between virtual training environments and real-world robot behavior, a known bottleneck in robotics development.
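
Domain randomization itself is straightforward to picture: each simulated episode re-samples scene parameters so a policy never overfits to a single rendering of the world. The framework-agnostic sketch below illustrates the idea; the parameter names and ranges are invented for illustration and are not taken from Isaac.

```python
import random

def randomize_scene() -> dict:
    """Sample a fresh set of scene parameters for one training episode.
    All names and ranges here are illustrative, not Isaac APIs."""
    return {
        "light_intensity": random.uniform(200.0, 1200.0),     # lux
        "light_temperature": random.uniform(3000.0, 6500.0),  # kelvin
        "floor_texture": random.choice(["concrete", "tile", "rubber"]),
        "camera_noise_sigma": random.uniform(0.0, 0.02),      # sensor model
        "depth_dropout": random.uniform(0.0, 0.05),           # missing zones
    }

for episode in range(3):
    params = randomize_scene()
    print(f"episode {episode}: {params}")
    # A real loop would reset the simulator with these parameters,
    # run a rollout, and collect training data.
```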

Sensor Fusion Architecture and Technical Specifications
The module combines three primary sensing technologies, each addressing a specific aspect of robotic perception:

Vision-Based Sensing - The system integrates the VB1940 RGB-IR image sensor, featuring 5.1-megapixel resolution and support for both rolling and global shutter modes. Global shutter capture avoids motion distortion when imaging fast-moving objects, while rolling shutter preserves image quality under varying lighting conditions.
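
A simple host-side heuristic shows why the dual-mode capability matters: estimate scene motion between frames and switch shutter modes accordingly. This is an illustrative sketch, not the sensor's driver API, and the threshold value is an assumption.

```python
import numpy as np

def pick_shutter_mode(prev: np.ndarray, curr: np.ndarray,
                      motion_threshold: float = 12.0) -> str:
    """Crude motion estimate from mean absolute frame difference.
    Above the (illustrative) threshold, prefer global shutter to avoid
    rolling-shutter skew on fast-moving objects; otherwise use rolling
    shutter for its light-gathering advantage."""
    diff = curr.astype(np.int16) - prev.astype(np.int16)
    motion = float(np.mean(np.abs(diff)))
    return "global" if motion > motion_threshold else "rolling"
```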

An additional sensor variant, V943 from the BrightSense family, is available for industrial and mass-market applications in monochrome or RGB-IR configurations.

Motion Sensing - Motion tracking is handled by the LSM6DSV16X, a 6-axis inertial measurement unit (IMU) with an embedded machine-learning core (MLC) for edge AI processing directly on the sensor, along with sensor-fusion low-power (SFLP) capabilities, illustrated in the sketch below.
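
What SFLP computes on-chip can be pictured with the classic complementary filter: gyroscope integration tracks fast motion while the accelerometer's gravity estimate corrects long-term drift. This one-axis, host-side version is only an illustration of the principle; the IMU's own fusion runs in hardware and outputs an orientation estimate directly.

```python
import numpy as np

def complementary_filter(pitch_prev_deg: float, accel: np.ndarray,
                         gyro_rate_dps: float, dt: float,
                         alpha: float = 0.98) -> float:
    """One axis of accelerometer/gyroscope fusion (the kind of computation
    SFLP offloads to the IMU itself). accel is a 3-vector in m/s^2,
    gyro_rate_dps is the pitch rate in deg/s."""
    pitch_accel = np.degrees(np.arctan2(accel[0], accel[2]))  # tilt from gravity
    pitch_gyro = pitch_prev_deg + gyro_rate_dps * dt          # integrate rate
    return alpha * pitch_gyro + (1.0 - alpha) * pitch_accel
```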

The inclusion of Qvar electrostatic sensing enables user-interface detection, supporting interaction-based robotics applications.
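
As an illustration of how such a signal could drive interaction, the sketch below flags a touch when a Qvar-style charge-variation stream stays above a threshold for a few consecutive samples. The threshold and sample counts are invented for the example, not datasheet values.

```python
def detect_touch(samples, threshold: float = 150.0, min_run: int = 3):
    """Naive touch detector over a charge-variation stream: confirm a
    touch when |signal| exceeds the threshold for min_run consecutive
    samples, which suppresses single-sample noise spikes."""
    run = 0
    for i, value in enumerate(samples):
        run = run + 1 if abs(value) > threshold else 0
        if run >= min_run:
            return i  # index where the touch is confirmed
    return None       # no touch detected
```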

3D Depth Sensing - Depth perception is provided by the VL53L9CX direct Time-of-Flight (dToF) LiDAR module. The sensor delivers ranging up to 9 meters and operates with a resolution of 54 × 42 zones (approximately 2,300 zones).

With a 55° × 42° field of view and angular resolution of approximately 1°, the system supports detection of small objects and accurate spatial mapping. The module operates at up to 100 frames per second, enabling real-time 3D scene reconstruction.
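
The stated geometry is enough to sketch how a depth frame becomes a point cloud: each of the 54 × 42 zones corresponds to a ray through the 55° × 42° field of view, and the measured distance places a point along that ray. The back-projection below assumes distances are measured along each zone's central ray, which may differ from the sensor's actual calibration model. At 100 frames per second, a full cloud of roughly 2,300 points is available every 10 ms.

```python
import numpy as np

def zones_to_points(depth_m: np.ndarray,
                    fov_deg: tuple = (55.0, 42.0)) -> np.ndarray:
    """Back-project a (42, 54) grid of per-zone distances (metres) into
    an (N, 3) point cloud. Each zone's ray is assumed to pass through
    the zone centre of the stated field of view."""
    rows, cols = depth_m.shape
    # Angle of each zone centre, spread evenly across the field of view.
    az = np.radians((np.arange(cols) + 0.5) / cols * fov_deg[0] - fov_deg[0] / 2)
    el = np.radians((np.arange(rows) + 0.5) / rows * fov_deg[1] - fov_deg[1] / 2)
    az, el = np.meshgrid(az, el)                   # both (rows, cols)
    x = depth_m * np.cos(el) * np.sin(az)          # right
    y = depth_m * np.sin(el)                       # up
    z = depth_m * np.cos(el) * np.cos(az)          # forward
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)
```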

Enabling Physical AI and Robotics Deployment
The integration of synchronized sensing modalities with high-bandwidth data transfer enables robots to process environmental data in real time. This capability is essential for physical AI systems, where perception, reasoning, and action must occur within tight latency constraints.

By standardizing sensor integration within the NVIDIA ecosystem, the module reduces development complexity and supports scalable deployment across robotics platforms. The combination of hardware, software support, and simulation tools provides a framework for faster iteration and validation in robotics vision systems.

This development reflects a broader industry shift toward tightly integrated sensor and compute platforms, where interoperability and data consistency are critical to advancing autonomous and semi-autonomous machines.

Edited by industrial journalist Sucithra Mani, with AI assistance.

www.st.com
