IBM Watson Speech to Text

Watson Speech to Text is an API that transcribes speech to text in a variety of languages. It’s available as SaaS or for self-hosting.

Vendor

IBM

Product details

What is IBM Watson Speech to Text?

IBM Watson® Speech to Text technology enables fast and accurate speech transcription in multiple languages for a variety of use cases, including but not limited to customer self-service, agent assistance and speech analytics. Get started fast with our advanced machine learning models out-of-the-box or customize them for your use case.

Features

  • **Automatic speech recognition:** Enable your voice applications using neural technologies for speech recognition powered by IBM Watson.
  • **Model training options:** Improve speech recognition accuracy for your use case with language and acoustic training options.
  • **Optimized for customer care:** Activate your voice application with speech models tuned for the customer care domain.
  • **Pre-trained speech models:** Get started with advanced machine learning models out of the box, without training your own.
  • **Fine-tuning features:** Improve speech recognition accuracy for extracting phrases, words, letters, numbers or lists.
  • **Low latency transcription:** Use our models optimized for low latency in real-time speech applications.
  • **Audio diagnostics before transcription:** Analyze and correct weak audio signals before transcription begins.
  • **Interim transcription before final results:** Improve application response times by using speech transcription as it is generated and throughout the finalization process.
  • **Smart formatting:** Transcribe dates, times, numbers, currency values, email and website addresses in your final transcripts by converting them into conventional forms.
  • **Speaker diarization:** Recognize who said what in a multi-participant voice exchange. Currently optimized for two-way call center conversations but can detect up to 6 different speakers.
  • **Word spotting and filtering:** Filter for specific words or inappropriate content by using our keyword spotting and profanity filtering features. (US English only)
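
Several of the features above are typically switched on per request. The sketch below shows one way such a request URL might be assembled in Python. It rests on assumptions not stated on this page: that the service exposes a `/v1/recognize` endpoint, and that `model`, `smart_formatting`, `speaker_labels` and `low_latency` are accepted as query parameters. The base URL and the helper name `build_recognize_url` are hypothetical, and parameter availability may vary by model and language, so check the current API reference before relying on any of it.

```python
from urllib.parse import urlencode


def build_recognize_url(service_url, model, **features):
    """Build a recognize URL with feature flags as query parameters.

    Hypothetical helper: the endpoint path and parameter names are
    assumptions about the public API, not taken from this page.
    """
    params = {"model": model}
    for name, value in features.items():
        if value is None:
            continue  # skip options the caller left unset
        if isinstance(value, bool):
            value = "true" if value else "false"  # lowercase booleans
        elif isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)  # comma-separated lists
        params[name] = value
    return f"{service_url.rstrip('/')}/v1/recognize?{urlencode(params)}"


# Placeholder base URL; a real one comes from your service credentials.
url = build_recognize_url(
    "https://stt.example.com/instances/my-instance",
    model="en-US_Telephony",
    smart_formatting=True,
    speaker_labels=True,
    low_latency=True,
)
print(url)
```

The helper only builds the URL and sends nothing, so it can be run without credentials. An actual call would also need authentication and an audio payload with a matching content type.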