February 19, 2024

Crowdlinker’s Top 8 AI Transcription Tools Recommendations

Introduction

As discussed in our blog post, Navigating the Maze: A Strategic Approach to Selecting the Ideal AI Transcription Tool, there are several criteria to consider when choosing an AI transcription tool. How much weight each criterion carries depends on your use cases and priorities.
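One way to make the weighting idea concrete is a simple weighted score. The sketch below is only an illustration: the criteria names, the weights, the 0 to 5 rating scale and the two candidate tools are all hypothetical and are not taken from our analysis.

```python
# Hypothetical criteria weights; adjust them to reflect your own priorities.
WEIGHTS = {"accuracy": 0.4, "price": 0.2, "speed": 0.2, "data_residency": 0.2}


def weighted_score(ratings, weights=WEIGHTS):
    """Combine per-criterion ratings (0 to 5) into a single weighted score."""
    total_weight = sum(weights.values())
    # A criterion missing from the ratings counts as 0.
    return sum(w * ratings.get(name, 0.0) for name, w in weights.items()) / total_weight


# Illustrative ratings for two made-up tools; these are not measurements.
candidates = {
    "tool_a": {"accuracy": 4, "price": 3, "speed": 5, "data_residency": 0},
    "tool_b": {"accuracy": 3, "price": 4, "speed": 3, "data_residency": 5},
}

# Rank candidates from highest to lowest weighted score.
ranked = sorted(candidates, key=lambda name: weighted_score(candidates[name]), reverse=True)
print(ranked)
```

With these made-up numbers, tool_b ranks first because the weights reward data residency enough to offset its lower accuracy and speed ratings. Changing the weights changes the ranking, which is the point of the exercise.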

The Analysis

At a high level, all of these tools and companies offer the same capabilities: transcription that is customizable, scalable, highly accurate, built on cutting-edge natural language processing technology, and deployable either on the vendor's servers or self-hosted. But when you look under the hood, many criteria help differentiate them and bring you closer to the right decision. To help you choose, we decided to make public our analysis of the following tools: OpenAI Whisper, Azure Speech to Text, Amazon Web Services (AWS) Transcribe, Deepgram, Google Cloud Universal Speech to Text, IBM Watson, Speechmatics and Assembly AI.






OpenAI Whisper
Pricing: $0.006/min (One-time)
Automatic Data Redaction: No
Fine Tuning: Yes
Commitment Required: No
HIPAA Compliance: Must sign BAA
File Size Limit: 25 MB
Speed (For 11 min Audio File): ~35 seconds
Real-Time Transcription: Yes
Multilingual Support: Yes
Can Be Deployed In Canada (Data Residency): You will have to contact OpenAI for this information. OpenAI says it supports Canada, but does not mention whether processing happens only in Canada.




Azure Speech to Text
Pricing: $1/hr or $0.0166/min (Real-time); $0.36/hr or $0.006/min (Batch)
Automatic Data Redaction: Yes, but text-only through the Azure Language service, which incurs extra charges. PII limitations apply; see Azure's documentation.
Fine Tuning: Yes, through Custom Speech Model
Commitment Required: Will help reduce price, but not required
HIPAA Compliance: Must sign BAA
File Size Limit: 1 GB
Speed (For 11 min Audio File): 3 min
Real-Time Transcription: Yes
Multilingual Support: Yes
Can Be Deployed In Canada (Data Residency): Yes



AWS Transcribe
Pricing: $0.006/min (One-time)
Automatic Data Redaction: No
Fine Tuning: Yes
Commitment Required: Will help reduce price, but not required
HIPAA Compliance: Must sign BAA
File Size Limit: 2 GB
Speed (For 11 min Audio File): 1 min
Real-Time Transcription: Yes
Multilingual Support: Yes
Can Be Deployed In Canada (Data Residency): Yes





Deepgram
Pricing: $0.006/min (One-time)
Automatic Data Redaction: No
Fine Tuning: Yes
Commitment Required: $10,000/year
HIPAA Compliance: Must sign BAA
File Size Limit: 2 GB
Speed (For 11 min Audio File): 10-15 seconds
Real-Time Transcription: Yes
Multilingual Support: Yes
Can Be Deployed In Canada (Data Residency): The model will have to be uploaded to a server. This can lead to higher monthly DevOps costs.




Google Cloud Universal Speech to Text
Pricing: $0.006/min (One-time)
Automatic Data Redaction: No
Fine Tuning: Yes
Commitment Required: No
HIPAA Compliance: Must sign BAA
File Size Limit: 10 MB
Speed (For 11 min Audio File): Not Available
Real-Time Transcription: Yes
Multilingual Support: Yes
Can Be Deployed In Canada (Data Residency): Yes




IBM Watson
Pricing: $0.006/min (One-time)
Automatic Data Redaction: No
Fine Tuning: Yes
Commitment Required: Requires signing up on a premium plan
HIPAA Compliance: Must sign BAA and be on a Premium Plan
File Size Limit: 100 MB
Speed (For 11 min Audio File): 7 minutes
Real-Time Transcription: Yes
Multilingual Support: Yes (in beta mode)
Can Be Deployed In Canada (Data Residency): No





Speechmatics
Pricing: $0.006/min (One-time)
Automatic Data Redaction: No
Fine Tuning: Yes
Commitment Required: No
HIPAA Compliance: Must sign BAA
File Size Limit: 1 GB
Speed (For 11 min Audio File): 1.25 min
Real-Time Transcription: Yes
Multilingual Support: Yes
Can Be Deployed In Canada (Data Residency): The model will have to be uploaded to a server. This can lead to higher monthly DevOps costs.




Assembly AI
Pricing: $0.006/min (One-time)
Automatic Data Redaction: No
Fine Tuning: Yes
Commitment Required: 2,000 to 3,000 hours a month. This amounts to $20k - $36k per year.
HIPAA Compliance: Must sign BAA
File Size Limit: 1 GB
Speed (For 11 min Audio File): 2 min
Real-Time Transcription: Yes
Multilingual Support: Yes
Can Be Deployed In Canada (Data Residency): No, but on the roadmap for Q1 2024

Disclaimer: Please note that this analysis was completed in December 2023. Some of these parameters may have evolved since then.
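To compare the tools against your own requirements, the file size limits and processing times listed above can be encoded and filtered. The following is a minimal sketch: the `Tool` class and the `shortlist` and `speedup` helpers are hypothetical names, 1 GB is treated as 1,000 MB, Deepgram uses the upper end of its listed 10-15 second range, and all figures carry the December 2023 caveat.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tool:
    name: str
    file_limit_mb: int                    # 1 GB treated as 1,000 MB (assumption)
    seconds_for_11_min: Optional[float]   # None where the listing says "Not Available"


# Values copied from the listing above.
TOOLS = [
    Tool("OpenAI Whisper", 25, 35),       # listed as ~35 seconds
    Tool("Azure Speech to Text", 1000, 180),
    Tool("AWS Transcribe", 2000, 60),
    Tool("Deepgram", 2000, 15),           # upper end of the listed 10-15 seconds
    Tool("Google Cloud Universal", 10, None),
    Tool("IBM Watson", 100, 420),
    Tool("Speechmatics", 1000, 75),
    Tool("Assembly AI", 1000, 120),
]

AUDIO_SECONDS = 11 * 60  # length of the test file


def shortlist(min_file_mb, max_seconds):
    """Names of tools that accept files of min_file_mb and finished within max_seconds."""
    return [
        t.name
        for t in TOOLS
        if t.file_limit_mb >= min_file_mb
        and t.seconds_for_11_min is not None
        and t.seconds_for_11_min <= max_seconds
    ]


def speedup(tool):
    """How many times faster than real time the tool processed the 11-minute file."""
    if tool.seconds_for_11_min is None:
        return None
    return AUDIO_SECONDS / tool.seconds_for_11_min


print(shortlist(min_file_mb=500, max_seconds=90))
```

For example, requiring support for 500 MB files and a processing time of at most 90 seconds leaves AWS Transcribe, Deepgram and Speechmatics. Other criteria, such as data residency or commitment requirements, could be added as further fields in the same way.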

Models Specific to Technical Fields

Several tools, such as Deepgram, AWS Transcribe and Google Cloud Universal Speech to Text, offer specialized models for technical fields. These models may cost more, but the higher accuracy they provide is crucial for industries like healthcare and law. When considering a specialized model, always check its error rate and hallucination rate against your specific use case.
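Error rates can be checked on your own audio by comparing a tool's output with a human-verified transcript. A common metric for this is word error rate (WER): the number of word substitutions, deletions and insertions divided by the number of words in the reference. The sketch below assumes WER is the metric of interest, which our analysis does not specify, and uses a made-up sentence pair. It does not measure hallucination rates.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between the transcripts, divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from the hypothesis
            insertion = curr[j - 1] + 1  # extra word in the hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one wrong word out of six.
reference = "the patient was given ten milligrams"
hypothesis = "the patient was given two milligrams"
print(round(word_error_rate(reference, hypothesis), 3))
```

A single misheard word in a six-word sentence already gives a WER of about 17%, and in a clinical or legal transcript that one word can change the meaning, which is why specialized models can matter in these fields.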
