Make the letters in a 3D program like Blender or Cinema 4D. Add lighting and shadows that match the footage. Then track the live video and composite the letters onto it. If a person passes in front of the letters, you'll need to rotoscope that person.
In Cinema 4D and Maya you can use images for lighting (Blender supports this as well, through an environment texture on the world).
If you have shots of the same moment from several angles, they will help make the result look hyper-real.
So take the frame you want to place the 3D text in, create a plane or sphere, texture it with the image from that frame, and have it light the 3D object through luminance and reflection. Use ambient occlusion to really pull off the effect: put a flat plane under the text to stand in for the surface it sits on, so you get shadows between the text and the surrounding objects.
Then render it out with straight alpha, pull it into Motion or After Effects, and composite it into the scene.
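If it helps to see what the compositor does with a straight-alpha render, here is a rough sketch of the standard "over" operation in plain Python. The function names `over_straight` and `composite_frame` are made up for illustration, and pixel values are assumed to be floats from 0.0 to 1.0; Motion and After Effects do this for you once the footage is interpreted as straight alpha.

```python
def over_straight(fg_rgb, alpha, bg_rgb):
    """Blend one straight-alpha foreground pixel over an opaque background pixel.

    With straight (unpremultiplied) alpha, the colour channels have NOT been
    multiplied by alpha yet, so the multiplication happens here.
    """
    return tuple(f * alpha + b * (1.0 - alpha) for f, b in zip(fg_rgb, bg_rgb))


def composite_frame(fg, bg):
    """Composite a rendered frame over a video frame of the same size.

    fg: rows of (r, g, b, a) tuples from the 3D render (hypothetical layout)
    bg: rows of (r, g, b) tuples from the live footage
    """
    return [
        [over_straight(px[:3], px[3], bg_px) for px, bg_px in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(fg, bg)
    ]


# Half-transparent red text edge over blue footage.
print(over_straight((1.0, 0.0, 0.0), 0.5, (0.0, 0.0, 1.0)))  # (0.5, 0.0, 0.5)
```

If the render were premultiplied instead, the foreground colour would already include the alpha factor, and multiplying again would darken the edges of the text. That is why the alpha type has to match between the render and the compositor.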