Hi guys. I sent ten people a karaoke track and asked them to listen to it on headphones and record a video of themselves singing along. I am trying to sync up those recordings so that everyone is singing in unison. The problem is that everyone sings it a little differently: some notes are late, some are early, etc. Even if I make every recording start at the same point, it ends up sounding sloppy.
I think the solution here is the Automatic Speech Alignment feature? I create a multitrack session, pick whichever recording is the most spot-on, and then sync up every other track to it one at a time. The problem with this is that I assume it's not going to time-remap the video to match, right? So I'll have audio where everything is synced up, but nobody's audio will exactly match their mouths anymore (well, except for whoever I used as the master track). Any way around this? I can probably cope by only showing people for a couple of seconds at a time and manually adjusting those video clips to go along with the combined audio. But it would be ideal if Audition somehow let me align audio clips and the corresponding video at the same time.
If everything is recorded on one track, it is more difficult because you have the music (which will be in sync) and the singing (which will not be perfectly in sync) all mixed together. You can use the Center Channel Extractor, with varying degrees of success, to isolate the vocals so that you can then make manual adjustments. Small clips, like you suggested, would be the way to go. It probably won't be perfect, but in the end people expect karaoke to not be perfect.
Video Producer / Digital Marketer / Gear Reviewer / Author
-- http://www.AndyFordVideo.com --
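
For the coarse step mentioned above (getting every recording to start at the same point), here is a rough sketch of how the offset between two recordings could be estimated outside of Audition using cross-correlation. This is not what Audition does internally. The function name `estimate_offset_samples` is made up for illustration, and the sketch assumes both recordings are already loaded as mono NumPy arrays at the same sample rate. It only finds one global shift per recording, so it will not fix the note-by-note early/late problem.

```python
import numpy as np


def estimate_offset_samples(reference, other):
    """Estimate how many samples `other` is delayed relative to `reference`.

    Hypothetical helper for illustration. Both inputs are assumed to be
    1-D mono arrays at the same sample rate. A positive result means
    `other` starts later than `reference`; negative means earlier.
    """
    ref = np.asarray(reference, dtype=float)
    oth = np.asarray(other, dtype=float)

    # Remove any DC offset so it does not bias the correlation peak.
    ref = ref - ref.mean()
    oth = oth - oth.mean()

    # Full cross-correlation: output index i corresponds to lag
    # i - (len(ref) - 1), covering every possible shift.
    corr = np.correlate(oth, ref, mode="full")
    return int(np.argmax(corr)) - (len(ref) - 1)


# Synthetic check: a noise "recording" and a copy delayed by 37 samples.
rng = np.random.default_rng(0)
master = rng.standard_normal(1000)
delayed = np.concatenate([np.zeros(37), master])[:1000]

lag = estimate_offset_samples(master, delayed)
print(lag)
```

Dividing the lag by the sample rate gives the shift in seconds, and because it is a single shift for the whole clip, the same amount can be applied to the matching video without breaking lip sync. Direct correlation like this gets slow on full-length songs; an FFT-based correlation (for example `scipy.signal.correlate` with `method="fft"`) would be the usual choice there.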