Since I don't have Audition yet, I can't tell you the specifics, but I can do this in Soundbooth, so it should be a no-brainer in Audition (or Sound Forge, or just about any decent audio package).
It's called "sampling the noise floor". Find a spot where the talent isn't speaking (just background noise). Select that. There should be a command to sample that noise (noise print in Soundbooth, I think). Once you sample the noise, there should be a command to remove that sample from the whole clip. Done.
What Joseph says works great for a low hum or buzz, but for a crowd sound, I would expect it would not be acceptable. Once you remove the sound of the crowd, most of the sound of the speaker will be gone as well, leaving you with a very metallic, hollow-sounding voice. Of course you can try it, but I wouldn't expect miracles.
In Audition it's under "Effect", then "Noise Reduction". As Joseph says, you first select an area of JUST the noise you want removed and select "Set Noise Print", then go to the "Remove Noise" (or whatever it's called) function.
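For anyone curious what the noise print workflow is doing under the hood, here is a rough Python sketch of plain spectral subtraction. It is not Audition's or Soundbooth's actual algorithm (neither is described in this thread). The function names `noise_print` and `remove_noise`, the frame size, and the `strength` and `floor` parameters are all made up for illustration, and it assumes NumPy and a mono clip stored as a float array.

```python
import numpy as np

def noise_print(noise, frame=512):
    """Average magnitude spectrum of a noise-only selection.

    Assumes the selection is at least one frame long.
    """
    window = np.hanning(frame)
    hop = frame // 2
    frames = [noise[i:i + frame] * window
              for i in range(0, len(noise) - frame + 1, hop)]
    return np.mean([np.abs(np.fft.rfft(f)) for f in frames], axis=0)

def remove_noise(signal, profile, frame=512, strength=1.0, floor=0.05):
    """Subtract the noise print from every overlapping frame of the clip.

    The first and last half frame are only covered by one window,
    so they come out quieter than the rest.
    """
    hop = frame // 2
    window = np.hanning(frame)
    out = np.zeros(len(signal))
    for start in range(0, len(signal) - frame + 1, hop):
        spec = np.fft.rfft(signal[start:start + frame] * window)
        mag = np.abs(spec)
        # Reduce each frequency bin by the noise print, but never
        # below a small fraction of its original magnitude.
        cleaned = np.maximum(mag - strength * profile, floor * mag)
        spec = spec * (cleaned / np.maximum(mag, 1e-12))
        # Overlap-add the processed frame back into the output.
        out[start:start + frame] += np.fft.irfft(spec, n=frame)
    return out
```

You would call `noise_print` on the selection where nobody is talking, then pass the result to `remove_noise` along with the whole clip. Pushing `strength` up or `floor` down removes more noise, but in a simple scheme like this it also tends to produce the metallic, hollow sound mentioned above.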
I think Rob has a good point here. With a low hum or buzz, you're dealing with a fairly limited chunk of the sound spectrum, and with crowd noise, you may be getting parts of the noise which are in the same range as the speaker's voice. The good news is that he has a deep, resonant voice, so he sounds a bit stronger in the bass end than most of the background voices.
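If you want to check the overlap before committing to anything, a quick Python sketch like the one below measures how much of a noise selection's energy falls inside the range where the voice lives. The helper name `band_energy_fraction` and the 100 to 3000 Hz band in the usage note are assumptions for illustration, not measurements from Justin's clip. It assumes NumPy and a non-silent mono selection.

```python
import numpy as np

def band_energy_fraction(samples, sample_rate, low_hz, high_hz):
    """Fraction of the selection's spectral energy between low_hz and high_hz."""
    windowed = samples * np.hanning(len(samples))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    in_band = (freqs >= low_hz) & (freqs <= high_hz)
    return power[in_band].sum() / power.sum()
```

For example, `band_energy_fraction(noise_selection, 44100, 100, 3000)` on a pure 60 Hz hum comes out close to zero, while broadband noise puts a large share of its energy in that band. The higher the number, the more the noise reduction will eat into the voice.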
Try both, since they are very quick to accomplish, and let us know your results.
It's going to be a bit more difficult, but one approach would be to use a combination of expansion and compression. I downloaded the audio and took a quick shot at it.
Here's what I did:
1. Normalized the file to -9 dB (so I had some headroom to work with)
2. Used the Dynamics Processing plugin in Audition 3 to set up an expander and a compressor
a. The expander was set with a ratio of 3:1 for anything above -17 dB.
b. The compressor was set with a ratio of 2:1 for anything below -20 dB.
The initial file (after normalization) and the end result are posted on SoundCloud: http://snd.sc/pYiZp5
The example is pretty "pumpy" so the attack and release characteristics are obviously off, and I'm sure the thresholds for both the compressor and expander aren't dialed in either.
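To make the idea concrete, here is a minimal Python sketch of peak normalization plus a static expander/compressor curve. Big caveat: it pairs the thresholds the conventional way (downward expansion below -20 dB, compression above -17 dB), which is an assumption and may not match how the curve was actually drawn in Audition's Dynamics Processing plugin. The function names are made up, levels are treated as dB relative to full scale, and there is no attack or release smoothing at all.

```python
def normalize(samples, target_db=-9.0):
    """Scale samples so the loudest peak sits at target_db (dB re full scale)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # silent clip, nothing to scale
    gain = 10 ** (target_db / 20) / peak
    return [s * gain for s in samples]

def output_level_db(level_db, exp_threshold=-20.0, exp_ratio=3.0,
                    comp_threshold=-17.0, comp_ratio=2.0):
    """Static gain curve: input level in dB to output level in dB."""
    if level_db < exp_threshold:
        # Downward expansion: each 1 dB below the threshold
        # becomes exp_ratio dB below it, pushing quiet noise down.
        return exp_threshold + (level_db - exp_threshold) * exp_ratio
    if level_db > comp_threshold:
        # Compression: each comp_ratio dB above the threshold
        # becomes 1 dB above it, evening out the loud speech.
        return comp_threshold + (level_db - comp_threshold) / comp_ratio
    return level_db  # between the thresholds the level is unchanged
```

With these numbers, background at -30 dB comes out at -50 dB, while a -9 dB peak comes out at -13 dB. A real processor would apply the curve to a smoothed level envelope, and the speed of that smoothing is where the attack and release settings (and the pumping) come in.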
@Jesse, the process you described seems like it would work well for getting the speaker's "silence" quieter, but it still leaves all the noise in with the voice of the speaker. As I understand Justin's problem, he wants to separate the speaker's voice from the noise. Expansion and compression would not actually accomplish this.
@Rob and Joseph, I agree that a crowd noise would need more finesse and TLC to get it to sound good, but I'm not so sure the crowd noise itself would be the problem. In my experience, ANY noise that is too loud leaves the voice sounding metallic and thin. Even room noise and hum.
The key is not the spectrum so much as the pattern. Noise reduction works not by removing only selected frequencies (anyone could easily select a frequency and reduce it or delete it), but by analyzing a frequency pattern and removing only what matches that pattern. That is why a voice can be preserved when removing a generic room hiss.
When applied to crowd noise, depending on the crowd size, the frequencies will likely be all over the spectrum. However, it may still be possible to isolate and remove the pattern (or in this case the muddle). Again, it will take some work, but play around with it and let us know what happens!
@Rob, I knew a large family of Neidigs in Indiana. Do you have family there?
I don't know of any relatives in Indiana, though the Neidig name is rare enough that it's certainly possible there's a connection somewhere. My Neidig relatives were in New Jersey and Pennsylvania (after coming from Germany, that is). I grew up in Kentucky, then lived in the Chicago area through high school and college, so we've got Indiana almost surrounded, but I don't know of a direct connection there.