A "blur map" is a greyscale image whose brightness varies with the distance of each object in the shot from the camera. These are usually generated by a 3D app, such as Cinema 4D, so that you can add camera blur to computer-generated footage. Using one image to control the intensity of a filter applied to another is sometimes called a "compound effect."
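To make the idea concrete, here is a minimal sketch in plain Python of a map-driven blur. It is not how any particular application implements it. The function name `compound_blur`, the `max_radius` parameter, and the convention that brighter map pixels mean more blur are all assumptions for illustration; images are plain 2D lists of greyscale values.

```python
def compound_blur(image, blur_map, max_radius=3):
    """Box-blur each pixel of `image` by a radius read from `blur_map`.

    Both arguments are equal-sized 2D lists of greyscale values (0..255).
    Assumption: brighter map pixels mean a larger blur radius.
    """
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Scale the map brightness (0..255) to a radius of 0..max_radius.
            radius = round(blur_map[y][x] / 255 * max_radius)
            # Average the surrounding pixels, clamped at the image edges.
            total = 0
            count = 0
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += image[ny][nx]
                    count += 1
            row.append(total / count)
        result.append(row)
    return result
```

Where the map is black the pixel passes through untouched, and where it is white the pixel is averaged over the largest neighbourhood, so a single filter gives a different amount of blur at every point in the frame.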
Since you are working with real-world footage rather than CG renders, you will need to do some rotoscoping (as already suggested). It is possible to build a depth mask by hand (a custom version of the greyscale image that a 3D app can generate), but in most cases it isn't worth the trouble. Separating the elements with duplicated footage and animated masks, then blurring each layer by the desired amount, can often do the trick. It might also be less work (though it will still be a lot of work!). Good luck!
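For anyone who thinks better in code, the layered approach can be sketched roughly like this in plain Python. The names `box_blur` and `composite_layers` are made up for illustration, masks are assumed to be 2D lists of values from 0 to 1 (one per rotoscoped element, for a single frame), and a real comp would also feather the mask edges.

```python
def box_blur(image, radius):
    """Uniform box blur of a 2D list of greyscale values."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            total = 0
            count = 0
            # Average the neighbourhood, clamped at the image edges.
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += image[ny][nx]
                    count += 1
            row.append(total / count)
        result.append(row)
    return result


def composite_layers(frame, layers):
    """Blur one copy of `frame` per layer and stack them back to front.

    `layers` is a list of (mask, radius) pairs, farthest element first.
    Each mask is a 2D list of 0..1 values marking where that layer shows.
    """
    out = [list(row) for row in frame]
    for mask, radius in layers:
        blurred = box_blur(frame, radius)  # the duplicated, blurred footage
        for y in range(len(frame)):
            for x in range(len(frame[0])):
                m = mask[y][x]
                # Inside the mask use the blurred copy, elsewhere keep what is below.
                out[y][x] = m * blurred[y][x] + (1 - m) * out[y][x]
    return out
```

Each (mask, radius) pair stands in for one duplicated layer with its own animated mask and blur amount; for moving footage the masks would change on every frame, which is where the rotoscoping work comes in.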
Ben Unguren
Motion Graphics & Editing
http://www.mostlydocumentary.com