IBM Watson Speech to Text
Watson Speech to Text is an API that transcribes speech to text in a variety of languages. It’s available as SaaS or for self-hosting.
Vendor
IBM
Product details
What is IBM Watson Speech to Text?
IBM Watson® Speech to Text technology enables fast and accurate speech transcription in multiple languages for a variety of use cases, including but not limited to customer self-service, agent assistance and speech analytics. Get started fast with our advanced machine learning models out of the box, or customize them for your use case.
Features
- **Automatic speech recognition:** Enable your voice applications using neural technologies for speech recognition powered by IBM Watson.
- **Model training options:** Improve speech recognition accuracy for your use case with language and acoustic training options.
- **Optimized for customer care:** Activate your voice application with speech models tuned for the customer care domain.
- **Pre-trained speech models:** Get started fast with our advanced machine learning models out of the box.
- **Fine-tuning features:** Improve speech recognition accuracy for extracting phrases, words, letters, numbers or lists.
- **Low latency transcription:** Use our models optimized for low latency in real-time speech applications.
- **Audio diagnostics before transcription:** Analyze and correct weak audio signals before transcription begins.
- **Interim transcription before final results:** Improve application response times by using speech transcription as it is generated and throughout the finalization process.
- **Smart formatting:** Transcribe dates, times, numbers, currency values, email and website addresses in your final transcripts by converting them into conventional forms.
- **Speaker diarization:** Recognize who said what in a multi-participant voice exchange. This feature is currently optimized for two-way call center conversations but can detect up to six different speakers.
- **Word spotting and filtering:** Filter for specific words or inappropriate content by using our keyword spotting and profanity filtering features (US English only).
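
Several of the features above are switched on per request. The following is a minimal sketch, not IBM's reference code, of how a client might assemble a recognition request URL with smart formatting, speaker diarization, keyword spotting and profanity filtering. It assumes the service's HTTP interface exposes a `/v1/recognize` endpoint with query parameters named `model`, `smart_formatting`, `speaker_labels`, `keywords`, `keywords_threshold` and `profanity_filter`; the helper `build_recognize_url`, the base URL and the default model name are illustrative assumptions, so check them against the current API reference before use.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_recognize_url(base_url, *, model="en-US_Telephony",
                        smart_formatting=False, speaker_labels=False,
                        keywords=None, keywords_threshold=0.5,
                        profanity_filter=True):
    """Build a recognition URL (hypothetical helper; parameter names are assumed)."""
    params = {"model": model}
    if smart_formatting:
        # Smart formatting: dates, times, numbers, etc. in conventional form.
        params["smart_formatting"] = "true"
    if speaker_labels:
        # Speaker diarization: label which speaker said which words.
        params["speaker_labels"] = "true"
    if keywords:
        # Keyword spotting: a comma-separated list plus a confidence threshold.
        params["keywords"] = ",".join(keywords)
        params["keywords_threshold"] = str(keywords_threshold)
    if not profanity_filter:
        # Assumed to be on by default, so only send it when disabling.
        params["profanity_filter"] = "false"
    return f"{base_url.rstrip('/')}/v1/recognize?{urlencode(params)}"


# Placeholder base URL; a real one comes from your service credentials.
url = build_recognize_url(
    "https://example.invalid/instance",
    smart_formatting=True,
    speaker_labels=True,
    keywords=["refund", "invoice"],
)
print(parse_qs(urlparse(url).query))
```

The audio itself would be sent in the request body with an appropriate content type and credentials, which this sketch leaves out. Keeping the feature flags in one helper makes it easy to see which capabilities a given transcription call relies on.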