NVIDIA cuDNN

The NVIDIA CUDA® Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, attention, matmul, pooling, and normalization.

Vendor

NVIDIA

Company Website

Product details

In addition to the core primitives summarized above, such as forward and backward convolution, attention, matrix multiplication (matmul), pooling, and normalization, cuDNN is designed to deliver high performance on compute-bound operations and offers heuristics for choosing the right kernel for a given problem size.
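The heuristics mentioned above map a problem description (shapes, filter size, data type) to a kernel choice. As an illustration only, here is a toy dispatcher in plain Python: the decision rules are invented for this sketch, and while the returned names correspond to real cuDNN forward-convolution algorithm families (WINOGRAD, FFT, IMPLICIT_GEMM in `cudnnConvolutionFwdAlgo_t`), cuDNN's actual heuristics (e.g. `cudnnGetConvolutionForwardAlgorithm_v7` in the C API) weigh many more factors.

```python
def choose_conv_algorithm(filter_hw, image_hw):
    """Toy heuristic: pick a convolution algorithm family from problem size.

    The rules below are invented for illustration; real cuDNN heuristics
    consider data type, layout, batch size, hardware generation, and more.
    """
    fh, fw = filter_hw
    ih, iw = image_hw
    if (fh, fw) == (3, 3):
        return "WINOGRAD"        # Winograd transforms excel at small 3x3 filters
    if fh >= 7 and ih >= 64:
        return "FFT"             # FFT-based convolution amortizes well for large filters
    return "IMPLICIT_GEMM"       # general-purpose fallback

print(choose_conv_algorithm((3, 3), (224, 224)))    # WINOGRAD
print(choose_conv_algorithm((11, 11), (224, 224)))  # FFT
print(choose_conv_algorithm((1, 1), (56, 56)))      # IMPLICIT_GEMM
```

The point of the sketch is the shape of the mechanism, not the rules: callers describe the problem, and the library picks among several pre-tuned implementations on their behalf.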

Features

  • Accelerated Learning: Provides kernels targeting Tensor Cores to deliver the best available performance on compute-bound operations.
  • Expressive Op Graph API: Users can define computations as a graph of operations on tensors, with both a direct C API and an open-source C++ frontend for convenience.
  • Fusion Support: Supports fusion of compute-bound and memory-bound operations, with common generic fusion patterns implemented by runtime kernel generation and specialized fusion patterns optimized with pre-written kernels.
  • Deep Neural Networks: Accelerates widely used deep learning frameworks, reducing training time for applications such as autonomous vehicles and intelligent voice assistants.
  • Optimizations: Includes optimizations for important specialized patterns like fused attention and heuristics for predicting the best implementation for a given problem size.
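The fusion support described above can be sketched in a small, library-agnostic way (plain Python, not the cuDNN API): fusing two memory-bound elementwise operations into one pass removes a full round trip through memory for the intermediate result, which is the core benefit runtime kernel generation targets on a GPU.

```python
def scale_then_bias_unfused(xs, scale, bias):
    """Two separate elementwise passes: the intermediate list is fully
    written out and then read back (the memory-bound pattern)."""
    ys = [x * scale for x in xs]       # pass 1 over the data
    return [y + bias for y in ys]      # pass 2 over the data

def scale_then_bias_fused(xs, scale, bias):
    """One fused pass: the intermediate value never touches memory,
    analogous to a fused GPU kernel keeping it in registers."""
    return [x * scale + bias for x in xs]

data = [1.0, 2.0, 3.0]
assert scale_then_bias_unfused(data, 2.0, 1.0) == scale_then_bias_fused(data, 2.0, 1.0)
print(scale_then_bias_fused(data, 2.0, 1.0))  # [3.0, 5.0, 7.0]
```

Both versions compute the same result; the fused form simply does half the passes over the data, which is why cuDNN distinguishes generic runtime-generated fusions from specialized pre-written ones such as fused attention.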

Benefits

  • High Performance: Delivers high performance on compute-bound operations, reducing training time for deep learning models.
  • Efficiency: Optimizes memory-bound operations and supports fusion of operations to improve overall efficiency.
  • Versatility: Suitable for a wide range of applications, including computer vision, conversational AI, and recommendation systems.
  • Scalability: Supports high-performance, low-latency inference for deep neural networks in the cloud, on embedded devices, and in self-driving cars.
  • Developer-Friendly: Provides a comprehensive set of tools and APIs for developers to build and optimize deep learning models.
Find more products by category
Development Software