Does anyone know whether firms that provide live captioning at events typically also offer WebVTT (.vtt) files afterward, so you can subtitle the event video recordings?
I do video production at a nonprofit that's trying to cut costs at its upcoming annual meeting and hopes to have one firm handle both tasks (we previously used two separate firms).
I'd appreciate any information or pointers on this. And please let me know if this is the wrong forum for this question.
I think it's typical that the live captioners add a modest upcharge to convert their transcriptions to VTT.
If you're really hurting for budget, a trick I use for shorter projects is to upload the video to YouTube as unlisted and let their AI take the first crack at it. From their editing interface (when you have a Creator Studio account) you can correct the mistakes the AI made and then download an SRT or VTT file. The quality of the transcription depends a lot on the clarity of the audio and the absence of strong accents or dialect. One good clear voice at a time, in a clean room, well-recorded, comes out of the box about 90 percent correct. With off-mic speakers, echoey rooms, and people speaking too fast or with strong accents, you get 50 to 60 percent accuracy and have to clean up the rest manually. YouTube's AI doesn't do punctuation or capitalization either, nor will it assign name headers as each new voice chimes in on the track... But hey, it's a start and better than nothing. Those extra features are why live human transcriptionists are still worth what they want to charge.
There are some other free transcription systems out there, but I don't think they do a better job, and they may not come with an easy editing interface like YouTube's.
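If you end up with an SRT file (from YouTube or from a captioner) and need VTT, the conversion is simple enough to do yourself. Here's a minimal Python sketch, assuming a plain, well-formed SRT with no styling tags. The function name `srt_to_vtt` is just something I made up for illustration; it adds the `WEBVTT` header, drops the cue numbers, and swaps the comma in each timestamp for a period.

```python
import re

# Matches an SRT timestamp such as 00:01:02,345 and captures both parts.
_SRT_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT caption text to WebVTT text (hypothetical helper)."""
    # Normalize line endings and drop a byte-order mark if present.
    text = srt_text.replace("\r\n", "\n").replace("\r", "\n").lstrip("\ufeff")
    out = ["WEBVTT", ""]
    # SRT cues are separated by one or more blank lines.
    for block in re.split(r"\n{2,}", text.strip()):
        lines = block.split("\n")
        # SRT cues start with a numeric counter; WebVTT doesn't need it.
        if lines and lines[0].strip().isdigit():
            lines = lines[1:]
        if not lines or "-->" not in lines[0]:
            continue  # skip anything that isn't a timed cue
        # WebVTT uses a period before the milliseconds.
        lines[0] = _SRT_TIME.sub(r"\1.\2", lines[0])
        out.extend(lines)
        out.append("")
    return "\n".join(out)
```

To use it, read the SRT file as UTF-8 text, pass the contents to `srt_to_vtt`, and write the result to a file with a `.vtt` extension. You'd still want to spot-check the output in a player before publishing.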