
Cosmos Predict2 Container
Run inference and post-training on Cosmos-Predict2 models for future state prediction and visual simulation.
Vendor
NVIDIA
Product details
The Cosmos Predict2 Container runs inference and post-training on Cosmos-Predict2 models. These models belong to the Cosmos World Foundation Models (WFMs) family and specialize in future state prediction; models of this kind are often called world models. Cosmos-Predict2 includes diffusion-based world foundation models for Text2Image and Video2World generation, so users can generate visual simulations from text or video prompts.
Features
- Diffusion-Based Models: Text2Image and Video2World generation are both built on diffusion-based world foundation models.
- Future State Prediction: Predicts novel future frames given initial frames.
- Model Types: Includes Cosmos-Predict2-2B-Text2Image, Cosmos-Predict2-14B-Text2Image, Cosmos-Predict2-2B-Video2World, and Cosmos-Predict2-14B-Video2World.
- System Requirements: Requires NVIDIA GPUs with Ampere architecture or newer, Linux OS, CUDA version 12.4 or later, and Python version 3.10 or later.
- Comprehensive Inputs and Outputs: Supports multiple video modalities such as RGB, Depth, Segmentation, and more. Outputs include video and text.
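
The system requirements listed above can be checked before pulling the container. The following is a minimal sketch, not part of the product: the function names are hypothetical, and it assumes that "Ampere architecture or newer" corresponds to a CUDA compute capability of 8.0 or higher. The CUDA version and compute capability are passed in as plain values, since how they are obtained (for example from the driver tools or a deep learning framework) depends on the environment.

```python
import platform
import sys

# Minimums taken from the System Requirements item above.
MIN_PYTHON = (3, 10)
MIN_CUDA = (12, 4)
# Assumption: Ampere GPUs report compute capability 8.x; newer ones report higher.
MIN_COMPUTE_CAPABILITY = (8, 0)


def parse_version(text):
    """Turn '12.4' or '12.4.1' into a tuple of ints, e.g. (12, 4) or (12, 4, 1)."""
    return tuple(int(part) for part in text.strip().split("."))


def check_requirements(os_name, python_version, cuda_version, compute_capability):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if os_name != "Linux":
        problems.append(f"OS is {os_name}, expected Linux")
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append(f"Python {python_version[0]}.{python_version[1]} is older than 3.10")
    if parse_version(cuda_version)[:2] < MIN_CUDA:
        problems.append(f"CUDA {cuda_version} is older than 12.4")
    if tuple(compute_capability) < MIN_COMPUTE_CAPABILITY:
        problems.append(f"compute capability {compute_capability} is older than Ampere (8.0)")
    return problems


if __name__ == "__main__":
    # The CUDA version and compute capability below are placeholder values;
    # replace them with the values reported on the target machine.
    issues = check_requirements(
        os_name=platform.system(),
        python_version=sys.version_info,
        cuda_version="12.4",
        compute_capability=(8, 0),
    )
    print("All checks passed" if not issues else "\n".join(issues))
```

Keeping the check as a pure function of its inputs makes it usable on machines without a GPU, for example in a CI step that validates a deployment manifest.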
Benefits
- Enhanced Visual Simulations: Generate high-quality visual simulations based on text or video prompts.
- Versatile Use Cases: Suitable for data generation, policy evaluation, data augmentation, and data curation.
- High Performance: Runs on NVIDIA GPUs (Ampere architecture or newer) with CUDA for efficient processing.
- Customizable Models: Customize models in post-training to fit specific needs.
- Modern Platform Support: Targets current hardware and software environments (Ampere or newer NVIDIA GPUs, Linux, CUDA 12.4 or later, Python 3.10 or later).