What you're talking about is as old as Winsor McCay and his animated dinosaur Gertie.
That is *all* about timing, if the animation backing is pre-recorded. Here in Springfield, the Lincoln Presidential Museum has an interactive show where live actors interact with extensive animation, projected in a room that's like a giant teleprompter head (AKA the Pepper's Ghost illusion). Multiple actors take shifts playing the live host, and they all pantomime and lip synch to the same pre-recorded master audio track. The track is the guide that keeps everything in synch, both virtual and practical.
http://www.lincolnlibraryandmuseum.com/ghosts.htm
In your project you might use this principle by pre-recording the audio and cutting your keyable animation to it, then having the presenter lip synch to it.
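If it helps to see the "cut to the audio" idea in concrete terms, here is a rough sketch of turning cue times on the master audio track into frame numbers for the animation. The cue names, the times, and the 30 fps frame rate are all made up for illustration; a real project would use its own cue list and frame rate (and 29.97 fps drop-frame timecode needs more care than this).

```python
def seconds_to_frame(t_seconds, fps=30):
    """Convert a time on the audio track to the nearest whole frame number."""
    return round(t_seconds * fps)


def cue_sheet(cues, fps=30):
    """Build a list of (frame, label) pairs sorted by frame.

    `cues` maps a label to a time in seconds on the master audio track.
    """
    return sorted((seconds_to_frame(t, fps), label) for label, t in cues.items())


if __name__ == "__main__":
    # Hypothetical cues, not taken from any real show.
    example_cues = {
        "host points left": 2.0,
        "ghost appears": 4.5,
        "host turns to camera": 7.25,
    }
    for frame, label in cue_sheet(example_cues):
        print(f"frame {frame:5d}: {label}")
```

The presenter rehearses to the same audio, so if the animation hits these frames and the presenter hits the same words, the two line up.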
A more modern way to go would be digital puppetry, using MIDI controllers to operate props or characters in real time, but this is horrendously complicated and therefore expensive to do. PBS and Henson's studios are currently doing this sort of thing with 3D CGI characters for a kids' show. That's very cutting-edge.
You could reverse-engineer the whole thing as well, I suppose, by not making the animation until the actor plate has been shot and stabilized. Then you can use tracking software to create motion paths for virtual objects, or just brute-force keyframe the movements and positions to hit specific marks. It depends on your skills and budget.
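For the brute-force keyframe route, the underlying idea is simple: you pin the virtual object to a mark at certain frames and let the software fill in the frames between. Here is a minimal sketch of that in-between step using plain linear interpolation. The function name and the keyframe values are hypothetical, and real compositing packages offer eased or spline curves rather than straight lines; it also assumes the keyframes are sorted and no two share a frame number.

```python
def interpolate_position(keyframes, frame):
    """Return the (x, y) position at `frame` by linear interpolation.

    `keyframes` is a list of (frame, x, y) tuples sorted by frame,
    with no duplicate frame numbers (an assumption of this sketch).
    """
    # Before the first key or after the last one, hold the end position.
    if frame <= keyframes[0][0]:
        return keyframes[0][1:]
    if frame >= keyframes[-1][0]:
        return keyframes[-1][1:]
    # Find the pair of keys that brackets the frame and blend between them.
    for (f0, x0, y0), (f1, x1, y1) in zip(keyframes, keyframes[1:]):
        if f0 <= frame <= f1:
            t = (frame - f0) / (f1 - f0)
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))


if __name__ == "__main__":
    # Hypothetical marks: the object sits at the actor's hand at frame 0
    # and has to land on a table edge by frame 10.
    marks = [(0, 0.0, 0.0), (10, 100.0, 50.0)]
    print(interpolate_position(marks, 5))
```

Tracking software does the same job from the other direction: it gives you a position for every frame of the plate, so there is nothing to fill in.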
One last suggestion: have one or two practical (i.e., real) props or set items in the shot if possible, not just to give the actors more landmarks to base their performance around, but also to add a middle distance of things between the foreground actor and the fake backdrop, thus blending the boundaries to help sell the illusion. For a warehouse scene, a few real boxes stacked here and there on the set, a hand truck, a fake pillar, a real hanging industrial light fixture, anything like that, can help.