NVIDIA Develops Open Data Architecture for Physical AI Systems

Erkan Teskancan

    ## NVIDIA Develops Open Data Architecture for Physical AI Systems

    NVIDIA, in collaboration with Microsoft Azure and Nebius, is introducing a scalable data factory solution for physical AI systems. This architecture aims to support the development of robotics, visual AI agents, and autonomous vehicles.

    ### Objectives of the Collaboration
    This collaboration targets a fundamental problem in applying AI to the physical world: the availability of large, high-quality training datasets. Systems such as autonomous vehicles and robotic platforms require diverse datasets that include rare and safety-critical scenarios. Collecting such data in real environments is resource-intensive and often impractical, so manual data collection is being replaced with automated, compute-driven data generation.

    ### Technical Solution and Division of Responsibilities
    The solution is an open architecture based on the NVIDIA Physical AI Data Factory Blueprint. Data operations are organized into three modular stages: orchestration, augmentation, and evaluation.

    • NVIDIA provides data processing, synthetic data generation, and validation software, including Cosmos-based tools. These systems expand limited datasets with simulation and generative models.
    • Microsoft Azure integrates the blueprint architecture into its cloud-based digital infrastructure, supporting enterprise-scale machine learning workflows with services such as IoT operations, data platforms, and real-time analytics.
    • Nebius provides infrastructure integrations such as GPU-accelerated computing (RTX PRO 6000 Blackwell-class systems), object storage, and serverless execution.

    The architecture operates with workflows that automatically process raw datasets, generate synthetic data variations including long-tail scenarios, and validate outputs with inferential evaluation models. This reduces manual intervention and standardizes data quality.
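    The workflow described above (process raw data, generate synthetic variations, validate the outputs) can be illustrated with a minimal sketch. This is not the Blueprint's API: the names `Sample`, `curate`, `augment`, `evaluate`, and `run_pipeline` are hypothetical, the "generation" step only derives labeled variants, and the scoring function is supplied by the caller as a stand-in for an evaluation model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Sample:
    scenario: str            # hypothetical scenario label, e.g. "pedestrian_crossing"
    synthetic: bool = False  # True for generated variants
    score: float = 0.0       # filled in by the evaluation stage


def curate(raw: Iterable[str]) -> List[Sample]:
    """Process raw entries: normalize, drop empties and duplicates, keep order."""
    seen = set()
    out = []
    for entry in raw:
        name = entry.strip().lower()
        if name and name not in seen:
            seen.add(name)
            out.append(Sample(scenario=name))
    return out


def augment(samples: List[Sample], variations: List[str]) -> List[Sample]:
    """Stand-in for synthetic generation: one labeled variant per (sample, variation)."""
    out = list(samples)
    for s in samples:
        for v in variations:
            out.append(Sample(scenario=f"{s.scenario}+{v}", synthetic=True))
    return out


def evaluate(samples: List[Sample], scorer: Callable[[Sample], float],
             threshold: float) -> List[Sample]:
    """Stand-in for model-based validation: score every sample, keep those that pass."""
    kept = []
    for s in samples:
        s.score = scorer(s)
        if s.score >= threshold:
            kept.append(s)
    return kept


def run_pipeline(raw, variations, scorer, threshold):
    """Chain the three steps in the order the article describes."""
    return evaluate(augment(curate(raw), variations), scorer, threshold)
```

    In this sketch a caller might pass `variations=["night", "rain"]` and a scorer that rejects implausible variants. In a real deployment each function would be replaced by a call into the corresponding data-processing, generation, or evaluation service.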

    ### Orchestration with NVIDIA OSMO
    The NVIDIA OSMO layer manages distributed workloads, automating resource allocation and pipeline execution. Its integration with AI coding agents enables industrial-scale automation of data workflows.
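    To make "automating resource allocation and pipeline execution" concrete, the following sketch orders pipeline tasks by their dependencies and packs them into execution waves that fit a fixed GPU budget. It is a generic illustration using Python's standard `graphlib` module, not OSMO's interface; the function name `schedule`, the task names, and the GPU counts are all assumptions.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def schedule(tasks: Dict[str, int], deps: Dict[str, Set[str]],
             total_gpus: int) -> List[List[str]]:
    """Group tasks into waves that respect dependencies and a GPU budget.

    tasks: task name -> GPUs required (hypothetical values)
    deps:  task name -> names of tasks that must finish first
    """
    for name, need in tasks.items():
        if need > total_gpus:
            raise ValueError(f"{name} needs {need} GPUs, only {total_gpus} available")

    sorter = TopologicalSorter({name: deps.get(name, set()) for name in tasks})
    sorter.prepare()  # raises graphlib.CycleError if the dependencies form a cycle

    waves: List[List[str]] = []
    pending: List[str] = []  # tasks that are ready but did not fit in an earlier wave
    while sorter.is_active() or pending:
        pending.extend(sorted(sorter.get_ready()))
        free = total_gpus
        wave, deferred = [], []
        for name in pending:
            if tasks[name] <= free:
                wave.append(name)
                free -= tasks[name]
            else:
                deferred.append(name)
        pending = deferred
        waves.append(wave)
        for name in wave:
            sorter.done(name)  # completing a task unlocks its dependents
    return waves
```

    A production orchestrator would also handle retries, heterogeneous hardware, and data locality. The sketch only shows the core idea of letting the dependency graph and the resource budget decide what runs when.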

    ### Implementation and Setup
    The Blueprint is deployed primarily in cloud environments, allowing developers to configure data pipelines without building custom infrastructure. On Azure, it is offered as part of the open physical AI toolchain and integrates with enterprise IT systems. Nebius makes the architecture available in its AI cloud and provides managed inference and data services for production-level pipelines.

    ### Early Adopters and Use Cases
    Leading organizations such as FieldAI, Hexagon Robotics, Linker Vision, and Teradyne Robotics are testing the system in perception, mobility, and reinforcement learning workflows. Video analytics and robotic applications are also supported on Nebius infrastructure.

    ### Industrial Use Cases
    • Training for edge-case scenarios in autonomous driving systems
    • Industrial and service robots using reinforcement learning
    • Visual AI agents used in surveillance and analytics

    Key use cases include simulation-based training, validation of perception models under variable conditions, and continuous dataset expansion for adaptive systems.

    ### Results and Expected Impact
    This collaboration enables a shift from data collection to data generation. Training datasets are systematically created using computational resources. This reduces reliance on real-world data while increasing the coverage of rare scenarios.
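    One simple way to reason about "coverage of rare scenarios" is to count how many examples of each scenario a dataset contains and compute how many generated samples would bring every scenario up to a minimum. The sketch below is illustrative only: the function name `generation_plan`, the scenario labels, and the threshold are hypothetical and do not come from the Blueprint.

```python
from collections import Counter
from typing import Dict, Iterable


def generation_plan(labels: Iterable[str], min_per_scenario: int) -> Dict[str, int]:
    """Return how many synthetic samples each under-represented scenario needs."""
    counts = Counter(labels)
    return {
        scenario: min_per_scenario - n
        for scenario, n in counts.items()
        if n < min_per_scenario
    }


# Hypothetical real-world dataset: common conditions dominate, rare ones are scarce.
labels = ["clear_day"] * 5 + ["night_rain"] * 1 + ["road_debris"] * 2
plan = generation_plan(labels, min_per_scenario=3)
```

    Here `plan` maps each scarce scenario to the number of samples a generation step would need to produce, while scenarios already at or above the minimum are omitted.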

    Operationally, the framework is intended to improve the scalability, repeatability, and verifiability of AI training pipelines. By combining cloud infrastructure with automated workflows, it is expected to shorten model iteration cycles and deliver more consistent performance across complex environments.
     