
Riva Speech Skills is a scalable Conversational AI service platform with pre-trained models for real-time performance.
Vendor
NVIDIA
NVIDIA Riva is a GPU-accelerated SDK for building speech AI applications that are customized for your use case and deliver real-time performance. Riva offers pre-trained speech models in NVIDIA NGC that can be fine-tuned with NVIDIA NeMo on a custom data set, which speeds up the development of domain-specific models. Supported NeMo models can be exported, optimized, and deployed as a speech service, on premises or in the cloud, with a single command using Helm charts. Riva’s high-performance inference is powered by NVIDIA TensorRT optimizations and served by the NVIDIA Triton Inference Server. Riva services are exposed as gRPC-based microservices for both low-latency streaming and high-throughput offline use cases. Riva is fully containerized and can scale to hundreds or thousands of parallel streams.
Features
- Pre-trained Models: Pre-trained speech models from NVIDIA NGC that can be fine-tuned with NVIDIA NeMo on a custom data set.
- High-Performance Inference: Powered by NVIDIA TensorRT optimizations and served by the NVIDIA Triton Inference Server.
- gRPC-based Microservices: Services are exposed over gRPC for low-latency streaming and high-throughput offline use cases.
- Scalability: Fully containerized and can scale to hundreds or thousands of parallel streams.
- Easy Deployment: Supported NeMo models can be exported, optimized, and deployed on premises or in the cloud with a single command using Helm charts.
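
To illustrate the difference between the low-latency streaming and high-throughput offline modes mentioned above, here is a minimal Python sketch of how a client might prepare audio for each. It does not call any Riva API: the helper `pcm_chunks`, the 16 kHz 16-bit mono format, and the 100 ms chunk size are all assumptions chosen for illustration.

```python
from typing import Iterator


def pcm_chunks(audio: bytes, sample_rate_hz: int = 16000,
               chunk_ms: int = 100, bytes_per_sample: int = 2) -> Iterator[bytes]:
    """Yield fixed-duration chunks of mono PCM audio (hypothetical helper)."""
    # Bytes per chunk = samples per chunk * bytes per sample.
    chunk_bytes = sample_rate_hz * chunk_ms // 1000 * bytes_per_sample
    if chunk_bytes <= 0:
        raise ValueError("chunk size must be positive")
    for start in range(0, len(audio), chunk_bytes):
        yield audio[start:start + chunk_bytes]


# One second of silent 16 kHz, 16-bit mono audio (placeholder data).
one_second = bytes(16000 * 2)

# Streaming mode: many small requests, each sendable as soon as it is captured.
streaming_requests = list(pcm_chunks(one_second))

# Offline mode: the whole recording is submitted as a single request.
offline_request = one_second
```

In a real client, each streaming chunk would be wrapped in a gRPC request message and sent while the audio is still being recorded, whereas the offline payload would be sent once the recording is complete.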
Benefits
- Real-Time Performance: GPU-accelerated, TensorRT-optimized inference delivers real-time performance for speech AI applications.
- Customizable: Pre-trained models can be fine-tuned on your own data to build domain-specific models for your use case.
- Scalable: Containerized services scale out to meet the demands of large deployments.
- Efficient: Single-command deployment with Helm charts simplifies rolling out and managing speech services.