System-wide app for EQ, deverb, and noise removal for transcription of spoken audio?

Kevin Fortin

Hello, all

Recently I have been looking into online transcription as a way to make some extra pocket money. (It's not a very efficient way.)

It strikes me that it would be useful to have a system-wide app (for Windows in my case) that could help with realtime deverb and noise removal.

Unfortunately, on the transcription site I am familiar with, we are unable to download the source audio to do anything in Audacity, for instance. This is understandable for privacy reasons.

I know FxSound is already well-regarded by some as a system-wide EQ.

Any other suggestions for audio-processing apps that I could try?
 
I think I am on the trail of a solution: using something like Virtual Audio Cable with Minihost Modular, an EQ plugin, and a plugin such as SPL's De-Verb. I'll fiddle with this sometime and see how it goes.
 
Yeah, that’s what I’d recommend. Virtual Audio Cable will let you patch the audio into your DAW and there you can run whatever plugins you want on it.

Beware of latency. The VAC loopback will introduce a small amount of latency, and most of the good plugins that do what you want have a lot of latency.
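
To put rough numbers on that, here is a minimal sketch of the arithmetic, assuming each stage in the chain adds a delay equal to its buffer or lookahead size divided by the sample rate. The stage names and frame counts below are made-up illustrative figures, not measurements of Virtual Audio Cable or of any particular plugin.

```python
def buffer_latency_ms(frames: int, sample_rate_hz: int) -> float:
    """Delay in milliseconds contributed by one buffer of `frames` samples."""
    return 1000.0 * frames / sample_rate_hz


def chain_latency_ms(stage_frames, sample_rate_hz: int) -> float:
    """Total delay of several stages in series (their delays simply add up)."""
    return sum(buffer_latency_ms(f, sample_rate_hz) for f in stage_frames)


# Hypothetical figures, chosen only to illustrate the calculation.
SAMPLE_RATE_HZ = 48_000
stages = {
    "virtual cable buffer": 512,
    "plugin host buffer": 512,
    "dereverb plugin lookahead": 4096,
}

total_ms = chain_latency_ms(stages.values(), SAMPLE_RATE_HZ)

for name, frames in stages.items():
    print(f"{name}: {buffer_latency_ms(frames, SAMPLE_RATE_HZ):.1f} ms")
print(f"total: {total_ms:.1f} ms")
```

For listening-only work like transcription, a delay of this size is mostly noticeable as a lag after pressing play or pause. It matters more if the audio has to stay in sync with video.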
 
Got it working!

I had bought a license for Sonible's frei:raum when it was on sale a couple years ago, and that's the only plugin I need in Minihost. It does a great job of cutting down on the boominess and ambient noise.

This really helps to clarify the vocal content of noisy audio files. I don't need to clean them up for broadcast or export, just make them understandable for transcription.
 
Is it really quicker to transcribe by hand? Or could you just run voice-to-text (without your supervision) and then go through and fix the errors?
 
Good question! I would have to say, it depends. Auto-recognition does seem to be the trend for medical transcription as well as server-side on general transcription sites.

However, some files have enough reverb and ambient noise that human pattern-recognition/sifting still has a role to play. Also, from my own experience and the reports of others, it's often less efficient to correct automatic V2T output than to just type it from scratch into the expected format.
 
iZotope RX?
Thanks for the suggestion. I had pretty good results the other day with the deverb and denoiser from Accusonus (processing in realtime, and hosted in Image Line's Minihost Modular, which had the signal routed to it using VB Audio's Virtual Audio Cable).

But I should mention this transcribing idea has moved to the back burner for now. It's not a very efficient way to make money (at least for me).
 
RX is expensive but it’s the gold standard. Worth the investment, especially if you plan to build your own sample libraries.
 