
NVIDIA TensorRT
NVIDIA® TensorRT™ is an ecosystem of tools for developers to achieve high-performance deep learning inference. TensorRT includes inference compilers, runtimes, and model optimizations that deliver low latency and high throughput for production applications.
Vendor
NVIDIA
Company Website

Product details
The TensorRT ecosystem includes the TensorRT compiler, TensorRT-LLM, TensorRT Model Optimizer, TensorRT for RTX, and TensorRT Cloud.
Features
- High Performance: Speeds up inference by up to 36X compared to CPU-only platforms, using the NVIDIA CUDA parallel programming model.
- Model Optimization: Includes quantization, layer and tensor fusion, and kernel tuning techniques. Supports FP8, FP4, INT8, and INT4 precisions, plus advanced techniques such as AWQ.
- Large Language Model Inference: TensorRT-LLM accelerates and optimizes inference performance of large language models with a simplified Python API.
- Cloud Compilation: TensorRT Cloud generates hyper-optimized engines for given constraints and KPIs, automatically determining the best engine configuration.
- Framework Integrations: Direct integration with PyTorch and Hugging Face for faster inference. ONNX parser for importing models from popular frameworks.
- Deployment and Scaling: TensorRT-optimized models are deployed, run, and scaled with NVIDIA Dynamo Triton inference-serving software.
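The quantization mentioned above trades a small amount of numeric precision for large gains in memory bandwidth and throughput. A minimal, self-contained sketch of symmetric INT8 quantization in plain Python illustrates the idea; this follows the standard max-calibration scheme and is not TensorRT's actual calibrator API:

```python
# Illustrative sketch of symmetric INT8 quantization (max calibration).
# Plain Python for clarity; TensorRT performs this during engine building.

def int8_scale(values):
    """Max calibration: map the largest |value| onto the INT8 limit 127."""
    amax = max(abs(v) for v in values)
    return amax / 127.0 if amax else 1.0

def quantize(values, scale):
    """Real -> INT8: divide by scale, round, clamp to [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]

def dequantize(qvalues, scale):
    """INT8 -> approximate real values."""
    return [q * scale for q in qvalues]

weights = [0.02, -1.27, 0.5, 0.9981, -0.33]
scale = int8_scale(weights)
q = quantize(weights, scale)
recovered = dequantize(q, scale)
# Each recovered weight lies within one quantization step (scale) of the original.
assert all(abs(a - b) <= scale for a, b in zip(weights, recovered))
```

Each FP32 weight shrinks to one byte, which is where the bandwidth and latency savings come from; TensorRT additionally uses calibration data or quantization-aware training to pick scales that minimize accuracy loss.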
Benefits
- Optimized Performance: Delivers low latency and high throughput for production applications.
- Efficiency: Reduces memory bandwidth and latency, essential for real-time services and embedded applications.
- Versatility: Suitable for a wide range of applications, including intelligent video analytics, speech AI, recommender systems, and AI-based cybersecurity.
- Scalability: Supports deployment across edge devices, laptops, desktops, and data centers.
- Developer-Friendly: Provides a unified path to deploy AI models with rich APIs and reusable code.