speech recognition / automated transcribing by Nis Boye on Aug 7, 2008 at 9:03:09 am
I'm looking for some way to automate the transcribing process. Having a transcript of an interview always helps the editing, but it's so time-consuming that I rarely do it.
Avid has a feature that's the opposite of what I'm looking for: "script-sync" syncs your video to already typed scripts (http://www.avid.com/scriptsync/scriptsync.asp).
There are also apps like "Dictate" from www.macspeech.com that have speech recognition, but that seems to work in real time - and would not give me timecodes.
I'm looking for an app that can chew through video and audio files as fast as possible and give me a transcript with timecodes.
Re: speech recognition / automated transcribing by Mark Raudonis on Aug 7, 2008 at 1:16:30 pm
Even though the super-secret government programs can pull a "hot" word off of an intercepted cell phone call, to my knowledge, there is no similar program available commercially. There are plenty of "talk to text" programs out there, but they all rely upon "training" the program with your own voice to achieve acceptable accuracy levels. Punctuation is a problem as well. The "dictator" must insert phrases like "period", "comma", or "new line" to create a readable-looking text.
I've seen these programs used by a "shadow" dictator, meaning a person wearing headphones listens to a taped interview and simultaneously speaks into the transcription program. This works pretty well with someone trained to do it, achieving "near real time" rates.
Bottom line, what you're seeking doesn't exist yet. We spend a lot of money on transcripts every year, and when I find a better way, I'll be the first to use it.
Re: speech recognition / automated transcribing by David Roth Weiss on Aug 7, 2008 at 3:52:47 pm
[Mark Raudonis]"Bottom line, what you're seeking doesn't exist yet."
Actually, it does exist, Mark, and you can even try it out now.
If you go to the Adobe Labs website you can download the beta version of Soundbooth, which has this feature. That being said, though it's probably the most interesting development for doco and reality creators to come down the pike in a long time, and though it holds great promise, let's just say it isn't perfect just yet.
I tested the beta several times and I found its accuracy to be far from desirable. It is in beta, and so hopefully that will change when the release version hits the market later this year.
A reviewer from Macworld had somewhat similar findings and wrote:
"The unedited results are not the kind of thing you’ll want to post as a transcript of your audio, as mistakes abound."
"In its current form, Soundbooth’s transcribing feature is far from accurate."
"Even if the transcription is only 50-percent accurate (and those are the kind of results I saw in initial testing), it’s likely that you’ll find enough of the text you’re after to easily navigate through the file."
Personally, I did not find that the reviewer's point about 50% accuracy held true for me. I was simply unable to make use of the results. If you test it, please let me know what you find.
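For anyone who wants to test the reviewer's navigation idea, here is a minimal Python sketch of what "finding enough of the text" could look like. It assumes a rough transcript is already available as (word, seconds) pairs; that input format, the function names, and the 25 fps default are assumptions made for illustration, not anything Soundbooth is known to export.

```python
def to_timecode(seconds, fps=25):
    """Format seconds as HH:MM:SS:FF (non-drop-frame, integer frame rate assumed)."""
    total_frames = int(round(seconds * fps))
    frames = total_frames % fps
    total_seconds = total_frames // fps
    hours = total_seconds // 3600
    minutes = (total_seconds % 3600) // 60
    secs = total_seconds % 60
    return f"{hours:02d}:{minutes:02d}:{secs:02d}:{frames:02d}"


def find_keyword(words, keyword, fps=25):
    """Return the timecode of every place the keyword appears in a rough transcript.

    words: list of (word, seconds) pairs from a hypothetical recognizer.
    Misrecognized words are simply missed, so hits are a starting point only.
    """
    keyword = keyword.lower()
    return [
        to_timecode(t, fps)
        for w, t in words
        if w.lower().strip(".,?!") == keyword
    ]
```

Even with many wrong words, a search for a distinctive term may land near the right spot in the file; whether that is good enough in practice depends on the recognizer.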
David
David Roth Weiss
Director/Editor
David Weiss Productions, Inc.
Los Angeles
POST-PRODUCTION WITHOUT THE USUAL INSANITY ™
A forum host of Creative COW's Apple Final Cut Pro, Business & Marketing, and Indie Film & Documentary forums.
Re: speech recognition / automated transcribing by Dylan Reeve on Aug 8, 2008 at 2:17:47 am
Considering how many interviews I've edited where I've had to stop and replay bits over and over again to even be sure myself what was said, it's hard to imagine that this sort of thing will ever be more than 'somewhat' accurate. When you throw the wide variety of accents and speech variations into the mix it's even more boggling.
Avid's ScriptSync feature is amazing, but then it's probably a lot easier that way around: you generate a phonetic map of the transcription or script and then match that to the audio. It's great once you have a transcript, but that tough bit is still required.
One thing I know many ScriptSync users are doing is partial transcriptions. They make a transcript of only some of the tape, like a few sentences a minute - the most interesting ones - and then ScriptSync can still match to those. By adding general comments about the subject under discussion, the partial transcript becomes useful without having to be 100% complete.
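To illustrate why matching known text is the easier direction, here is a minimal Python sketch. It is not how ScriptSync works internally (that matches phonetically against the audio itself); instead it assumes a hypothetical, error-prone recognizer has already produced (word, seconds) pairs, and it aligns a typed line to them with difflib from the standard library. The function names and data layout are made up for the example.

```python
from difflib import SequenceMatcher


def align(script_words, recognized):
    """Map script word positions to times wherever a matching word is found.

    recognized: list of (word, seconds) pairs from a hypothetical recognizer.
    Returns {index_in_script: seconds} for matched words only.
    """
    script = [w.lower() for w in script_words]
    rec_words = [w.lower() for w, _ in recognized]
    matcher = SequenceMatcher(a=script, b=rec_words, autojunk=False)
    times = {}
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            times[block.a + k] = recognized[block.b + k][1]
    return times


def line_start_time(line, recognized):
    """Approximate where a typed line begins: the time of its first matched word.

    Returns None if no word of the line was found in the recognized audio.
    """
    times = align(line.split(), recognized)
    if not times:
        return None
    return times[min(times)]
```

Because only some words need to match, a misheard word in the middle of a line does not prevent the line from being located, which is roughly why a partial transcript can still be useful.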
I recall reading a few weeks ago that there was an FCP plugin in the works that would perform similarly to ScriptSync, so perhaps that idea holds some practical hope?
Re: speech recognition / automated transcribing by walter biscardi on Aug 7, 2008 at 1:26:46 pm
[Nis Boye]"I'm looking for an app that can chew through video and audio files as fast as possible and give me transcript with timecodes.
Anybody seen something like that? "
Nothing like that, but we do work with a great transcriptionist who works directly with timecodes from QuickTime movies. She can easily turn around a 30-minute tape in one day that way.
Walter Biscardi, Jr.
Biscardi Creative Media HD and SD Production for Broadcast and Independent Productions.
Re: speech recognition / automated transcribing by Bob Cole on Aug 8, 2008 at 3:31:46 pm
DRW turned me on to Production Transcripts, about $2/minute.
Re: the technique of listening and simultaneously talking into a speech recognition program. I have tried Nuance's Dragon NaturallySpeaking software, version 9. I wasn't thrilled with the results from version 9, but I remain hopeful. David Pogue gave the update very high marks recently.
MacPro 2 x 3GHz dualcore; 10 GB 667MHz
Kona LHe
Sony HDV Z1
Sony HDV M25U
HD-Connect MI
Betacam UVW1800
DVCPro AJ-D650
Re: speech recognition / automated transcribing by Nis Boye on Aug 13, 2008 at 6:56:47 pm
hmmm... I hate it when technology is behind my needs.
I wrote to MacSpeech, who make Dictate, and here is their reply:
"It is something we hope to offer in a future update, but I honestly do not have a time frame on when that feature will be released... ...Just a note, even when we do release that feature, the software would only be able to know your voice in an interview with any degree of accuracy. Unless you train your subjects as well, the accuracy rates will fall a bit with an unknown voice. Just something to keep in mind going forward."
what about from a file by Bob Cole on Aug 13, 2008 at 8:10:01 pm
It would be great if someone could develop software that would transcribe from a QuickTime file. It seems to me that a file would offer a lot more time for software to use contextual cues to improve word recognition. Rather than computing "on the fly" as from a live voice, file-based voice recognition could use as much time as necessary to translate sounds into words.
Just a dream? Or is there someone out there who is working on file-based voice recognition?
Re: what about from a file by David Roth Weiss on Aug 13, 2008 at 8:25:18 pm
Bob, you must have missed my post above in this thread. Adobe Soundbooth has a transcription feature that works with just about any video file, e.g. QuickTime files. But, as I explained, the accuracy is not great...
mea culpa by Bob Cole on Aug 14, 2008 at 1:24:49 am
Thanks David. Actually I did read your post, and I should have mentioned that. Too bad Soundbooth doesn't work well; the concept is terrific. It's worth tracking to see whether it improves.
Bob