A QT reference movie can hold both video and audio.
Since all the video is (or will be) rendered, it's easy to build a simple 'EDL' for the video part.
The audio, however, must be mixed down to a new .wav, and the QT ref will link to that.
Now, black in the timeline can be a problem. If you need black, make a mixdown from slug and use that.
Also, mixed codecs can cause trouble.
If you still have problems, export to XDCAM MXF as an intermediate.
The beauty is that the transcode to MP4 can start while the export to XDCAM is still running.
(So, if the export to XDCAM is faster than the transcode to the final product, you can start the two one right after the other.)
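To put a number on that timing, here is a rough sketch in Python. The function name `min_head_start` and its parameters are made up for illustration. It assumes both jobs run at a constant speed (expressed as a multiple of real time) and that your transcoder can read the MXF while it is still growing. In practice you'd add some safety margin on top.

```python
def min_head_start(duration_s, export_speed, transcode_speed):
    """Seconds the export should run before the transcode is started.

    duration_s      -- program length in seconds
    export_speed    -- export speed as a multiple of real time (2.0 = 2x)
    transcode_speed -- transcode speed as a multiple of real time

    Hypothetical helper: assumes constant speeds and no buffering effects.
    """
    if duration_s < 0 or export_speed <= 0 or transcode_speed <= 0:
        raise ValueError("duration must be >= 0 and speeds must be > 0")

    # Export is at least as fast as the transcode: the reader can never
    # catch up with the writer, so both can be started back to back.
    if export_speed >= transcode_speed:
        return 0.0

    # Otherwise the transcode would overtake the export. Delay it so that
    # both reach the end of the program at the same moment.
    return duration_s * (1.0 / export_speed - 1.0 / transcode_speed)


if __name__ == "__main__":
    # One-hour program, export at 2x, transcode at 1x: start both at once.
    print(min_head_start(3600, 2.0, 1.0))
    # One-hour program, export at 1x, transcode at 2x: wait before transcoding.
    print(min_head_start(3600, 1.0, 2.0))
```

With a one-hour program, an export at 2x and a transcode at 1x need no delay, while an export at 1x and a transcode at 2x need the transcode held back by 1800 seconds (half an hour).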
Bouke
http://www.videotoolshed.com