Common accuracy issues and fixes
Names and technical terms are misspelled
Cause: The transcription model does not know your company-specific terms, product names, or people’s names.
Fix: Add terms to your custom vocabulary:
- Go to Settings > Transcription > Custom vocabulary.
- Add names, acronyms, product names, and technical jargon.
- Future transcriptions will prioritize these terms.
Wrong language detected
Cause: Automatic language detection misidentified the spoken language, often when meetings include code-switching or technical English mixed with another language.
Fix: Set a default language:
- Go to Settings > Transcription > Default language.
- Select the primary language of your meetings.
- Mavio will still detect language switches but will default to your selection when uncertain.
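The fallback behavior described above can be pictured as a simple confidence check. The sketch below is only an illustration: the function name, the confidence score, and the 0.6 threshold are assumptions made for this example, not Mavio's actual detection logic.

```python
def choose_language(detected: str, confidence: float, default: str,
                    threshold: float = 0.6) -> str:
    """Keep the detected language only when the detector is confident enough.

    All names and the threshold value are hypothetical.
    """
    return detected if confidence >= threshold else default


# A confident detection of Spanish is kept even though the default is English.
confident_choice = choose_language("es", 0.92, default="en")

# An uncertain detection falls back to the configured default language.
uncertain_choice = choose_language("es", 0.41, default="en")
```

Under this model, setting a default does not block language switches; it only decides what happens in ambiguous segments.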
Speakers talking over each other (crosstalk)
Cause: Overlapping speech is inherently difficult for any transcription system. When two people talk simultaneously, words get lost.
Fix: Crosstalk cannot be repaired automatically after the fact. To reduce it or limit its impact:
- Encourage “raise hand” behavior in meetings
- Use the meeting bot (which captures cleaner audio from the platform) rather than mobile recording
- Edit the transcript manually for critical sections
Strong accents or non-native speakers
Cause: While Mavio supports diverse accents well, very strong accents combined with fast speech can reduce accuracy.
Fix:
- Ensure audio quality is high (headset, quiet environment)
- Add commonly mispronounced terms to custom vocabulary
- Use cloud mode rather than privacy mode for better accent handling
Background noise affecting accuracy
Cause: Cafe noise, air conditioning, traffic, and other ambient sounds compete with speech.
Fix:
- Use a headset or close-talking microphone
- Record in a quiet room when possible
- Use the meeting bot or system audio capture, both of which bypass the microphone entirely
- Enable High quality recording mode for better noise handling
Phone or dial-in audio is garbled
Cause: Phone lines transmit audio sampled at 8 kHz (narrowband), half the sample rate of typical computer-based audio (16 kHz or higher). This reduces the detail available for transcription.
Fix: Ask participants to join via computer audio instead of dialing in. If phone dial-in is unavoidable, expect reduced accuracy; it cannot be fully compensated for.
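The gap follows from the sampling theorem: audio sampled at a given rate can only represent frequencies up to half that rate. The short calculation below works this out for the two rates mentioned above. The function name is illustrative, and the figures are theoretical upper bounds rather than measurements of any particular phone line.

```python
def highest_representable_hz(sample_rate_hz: float) -> float:
    """Upper frequency limit (Nyquist frequency) for a given sample rate."""
    return sample_rate_hz / 2


phone_limit = highest_representable_hz(8_000)      # narrowband phone audio
computer_limit = highest_representable_hz(16_000)  # typical computer audio

print(f"Phone audio keeps frequencies up to {phone_limit:.0f} Hz")
print(f"Computer audio keeps frequencies up to {computer_limit:.0f} Hz")
```

Sounds such as "s" and "f" carry much of their energy above 4 kHz, which is one reason similar-sounding words are harder to tell apart on dial-in audio.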
Factors that affect accuracy
| Factor | Impact | Your control |
|---|---|---|
| Audio quality | High | Use headsets, quiet rooms |
| Speaker clarity | High | Encourage clear speech |
| Background noise | High | Control environment |
| Number of speakers | Medium | Fewer simultaneous speakers helps |
| Speaking speed | Medium | Normal pace is best |
| Technical jargon | Medium | Custom vocabulary |
| Accent strength | Low-Medium | Custom vocabulary helps |
| Language | Low | Most languages well-supported |
| Recording method | Medium | Bot and system audio are cleanest |
Improving accuracy over time
Custom vocabulary
Custom vocabulary is the single most impactful improvement for domain-specific accuracy. Add:
- People’s names (especially unusual spellings)
- Company and product names
- Acronyms and abbreviations
- Industry-specific terms
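To decide which terms are worth adding, it can help to scan an exported transcript for near-miss spellings of words you care about. The sketch below uses Python's standard `difflib` module for that. It is an offline helper written for illustration only: the function name and the 0.8 similarity cutoff are assumptions, and it does not reflect how Mavio applies custom vocabulary during transcription.

```python
import difflib


def suggest_vocabulary_fixes(words, vocabulary, cutoff=0.8):
    """Map transcript words that nearly match a vocabulary term to that term.

    Words that already match a term exactly (ignoring case) are skipped.
    """
    lookup = {term.lower(): term for term in vocabulary}
    fixes = {}
    for word in words:
        key = word.lower()
        if key in lookup:
            continue  # already spelled correctly
        close = difflib.get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
        if close:
            fixes[word] = lookup[close[0]]
    return fixes


transcript_words = ["deploy", "Kubernetis", "meeting"]
fixes = suggest_vocabulary_fixes(transcript_words, ["Kubernetes", "Mavio"])
```

Any term that shows up in the suggestions repeatedly is a good candidate for the custom vocabulary list.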
Speaker corrections
When you correct speaker labels, Mavio improves future voice matching. This indirectly improves transcription because correctly attributed audio segments can be processed with speaker-specific acoustic models.
Transcript edits
Edited transcript segments help Mavio learn which terms are important in your context. While Mavio does not retrain models on individual edits, the system uses edit patterns to inform future vocabulary weighting.
Reprocessing a transcript
If you have added custom vocabulary or believe a transcript could be improved:
- Open the meeting.
- Click the three-dot menu > Reprocess transcript.
- The transcript is regenerated using the latest models and your custom vocabulary.
- Reprocessing takes 2-5 minutes depending on recording length.
Reprocessing replaces the existing transcript. Any manual edits you made will be lost. Export or copy your edits before reprocessing if you want to preserve them.
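If you saved a copy of your edited transcript before reprocessing, comparing it with the regenerated text shows which manual edits need to be re-applied. The sketch below does this with Python's standard `difflib` module. It assumes both transcripts are available as lists of plain-text lines; the function name and the sample lines are made up for illustration.

```python
import difflib


def lost_edits(edited_lines, reprocessed_lines):
    """List lines that differ between the reprocessed and edited transcripts.

    Lines prefixed with "-" come from the reprocessed transcript;
    lines prefixed with "+" come from your saved, edited copy.
    """
    diff = difflib.unified_diff(
        reprocessed_lines, edited_lines,
        fromfile="reprocessed", tofile="edited",
        lineterm="", n=0,
    )
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


edited = ["Alice: Ship it Friday.", "Bob: Agreed."]
reprocessed = ["Alice: Chip it Friday.", "Bob: Agreed."]
changes = lost_edits(edited, reprocessed)
```

Each "-"/"+" pair marks a place where the regenerated transcript differs from your saved version, so you can review and restore only those lines.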
When to expect lower accuracy
Some scenarios inherently produce lower accuracy. This is expected and not a bug:
- Large group brainstorms with frequent crosstalk (many voices overlapping)
- Recordings in noisy outdoor environments (wind, traffic)
- Very long meetings (3+ hours): speaker fatigue leads to less clear speech
- Low-bandwidth phone dial-ins: narrowband audio limits accuracy
- Privacy mode: on-device models are slightly less accurate than cloud models