Basic Soundtrack Requirements for Foreign Distribution
The minimal number of audio tracks to prepare your film for international distribution is two: the DIALOGUE and the M&E (music & effects). The DIALOGUE track is used in ADR (automated dialogue replacement) for voice substitution / translation into a foreign language. The DIALOGUE track should not contain any Foley, incidental sounds, room-tone or effects.
Ideally, in scenes where more than one actor is speaking at the same time, a dedicated DIALOGUE track for each actor should be considered. If you choose this technique, make it a pre-production requirement.
You as the filmmaker may not have control over the ADR or choice of voice actors of the foreign distribution copy. Keep your DIALOGUE track(s) as clean as possible. The M&E track needs to be fully balanced and mastered: it is the canvas that all dialogue fits into.
Make sure that the script is re-edited to be identical to the dialogue spoken in the finished movie. Lip-sync is very important, but also realize that a word-for-word language translation is not consistently possible.
Working with Mono, Stereo, Surround Audio

The recommended number of audio tracks for editing is five: DIALOGUE, FOLEY, MUSIC, SFX, BACKGROUND.
Multiple tracks in each category offer even more control during the final post-production soundtrack mix.
Most film soundtracks we do here contain 6 to 12 audio tracks rendered in mono and stereo depending on the effect they are supposed to have.
A special note about SFX & BACKGROUND... these are non-academic terms (substitute your own as you like).
BACKGROUND has always been associated with ROOM-TONE; it can also serve as a Foley track.
The key factor is its low volume level in the soundtrack. SFX is a catch-all category.
It can contain anything from the whisper of a reverb tail to a fully sequenced sound design element.
A single stereo soundtrack is the basic standard file format.
Monaural is essentially the same audio on both L&R channels.
Monaural has its own aesthetic and technical qualities.
Stereo audio will fulfill most film festival playback requirements, as well as all television, computers and DVD players.
During the final render of your movie, the stereo-audio can/should be encoded into AC3/Pro-Logic format.
The AC3 advantage is the ability to produce a surround-like playback on a surround listening system: primarily the center-channel is activated and some/limited low frequency is fed to the sub-woofer.
AC3 will also play back as normal stereo on a non-surround system. AC3 can also carry a full 5.1 mix (similar to DTS).
Producing a full 5.1+ soundtrack can be easy and/or elaborate, from a mix-down perspective. The easy way is to 'expand' the stereo into a 5.1 mix: low frequencies are funneled to the sub-woofer, the center channel is a mid-high frequency combination of the L&R channels, and the rear speakers have a slight reverb applied, lower volume and low-mid frequencies.
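The 'easy way' described above can be roughed out in a few lines of Python. This is only a sketch under stated assumptions: the function names, the 120Hz and 2kHz cutoffs, the -6dB rear gain and the simple one-pole filter are all illustrative choices of mine, not settings from any particular tool, and the 'slight reverb' on the rear speakers is left out entirely.

```python
import math

def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """Very gentle (6 dB/octave) low-pass filter; a stand-in for a real crossover."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    state = 0.0
    out = []
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out

def expand_stereo_to_51(left, right, sample_rate=48000,
                        lfe_cutoff_hz=120.0, rear_cutoff_hz=2000.0,
                        rear_gain_db=-6.0):
    """Derive six channels from a stereo pair. All settings are assumptions."""
    # Sum L and R to mono, then split it into lows (sub-woofer) and the rest (center).
    mono = [0.5 * (l + r) for l, r in zip(left, right)]
    lfe = one_pole_lowpass(mono, lfe_cutoff_hz, sample_rate)
    center = [m - low for m, low in zip(mono, lfe)]  # mid-high content only

    # Rear speakers: low-mid frequencies at a lower volume (no reverb in this sketch).
    rear_gain = 10.0 ** (rear_gain_db / 20.0)
    rear_left = [rear_gain * x
                 for x in one_pole_lowpass(left, rear_cutoff_hz, sample_rate)]
    rear_right = [rear_gain * x
                  for x in one_pole_lowpass(right, rear_cutoff_hz, sample_rate)]

    return {"L": list(left), "R": list(right), "C": center,
            "LFE": lfe, "Ls": rear_left, "Rs": rear_right}
```

Samples are assumed to be floats where 1.0 is full scale. A real up-mix would use steeper filters and proper channel balancing; the point here is only to show where each part of the stereo signal ends up.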
Producing a 5.1+ soundtrack that takes full advantage of the surround sound environment should always be a production goal for your movie. True, a surround soundtrack requires a larger number of audio source tracks, additional weeks of post-production commitment, and a reliable surround mixing/playback environment. The advantage is its aesthetic quality to the audience.
My best advice is to keep your audio source tracks as separated as possible; you will need this larger audio diversity for a 5.1 mix-down. The surround soundtrack can always be created after the stereo version is done. Do a surround soundtrack of the movie trailer to compare listening experiences and work-flow.
Audio Sweetening / Mix / Mastering

Sweetening is correcting and improving soundtrack fundamentals: volume matching combined audio tracks, removing unwanted sounds, frequency equalization, eliminating sibilance, background noise reduction, compression, controlling Dialogue and M&E playback dynamics.
Creating a soundtrack is similar to mixing a music track (song). All the audio parts are blended according to their contribution to the song. Low frequencies are enhanced and generally considered omni-directional, therefore they 'sit in the center' of the mix. Rhythm and drums are panned for left & right channels to add clarity to their parts. The lead instrument (dialogue) takes 'center stage' most of the time. Effects, such as reverb, are typically placed in the right, left and/or rear channels to provide an even wider audio field.
Whereas Sweetening is most concerned with the quality of the individual sounds, Mixing is the seamless blending, panning and placement of those sounds into a single audio event. Whether you are working in stereo or surround (let us include mono, also), the goal of a mix is to produce an output file (or set of files) that has the right amount of volume, dynamics, emotion and energy to make a compelling listening experience.
Mastering is the final, finishing step of the audio production chain. It provides two vital operations that the mix needs: 1) an overall, subtle equalization, compression and loudness 'flavoring'; 2) controlling the relative loudness between songs/scenes. In film, Audio Mastering would be comparable to Color-Timing (tint, gamma, saturation, white/black balance).
These terms are used interchangeably according to where and when you were born. The audio processing chain remains the same. Each part, Sweetening, Mix, Mastering, is more of an art than a technical formula. In Mastering, for example, a soft 1dB to 2dB compression will make the overall audio sound louder; a slight boost to the high frequencies starting at 8kHz will give the overall sound an 'airy' appeal. Such subtle changes have a dramatic listening effect. This is the art of sound design.
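To put a number on 'a soft 1dB to 2dB compression', here is a minimal sketch of a static, hard-knee compressor curve in Python. The function name, the -12dB threshold and the 1.5:1 ratio are my own illustrative assumptions; real mastering compressors add attack/release timing and soft knees that are not modeled here.

```python
def compressor_gain_db(level_db, threshold_db=-12.0, ratio=1.5):
    """Gain change (in dB) a static hard-knee compressor applies at a given level.

    Below the threshold nothing happens. Above it, every `ratio` dB of input
    is allowed to rise by only 1 dB, so the returned gain is negative.
    """
    if level_db <= threshold_db:
        return 0.0
    over = level_db - threshold_db
    return over / ratio - over

# Show the gain reduction at a few input levels (assumed settings).
for level in (-18.0, -9.0, -6.0):
    print(level, round(compressor_gain_db(level), 2))
```

With these assumed settings the loudest passages are pulled down by a dB or two. The whole program can then be raised by a similar amount (make-up gain) without clipping, which is why gentle compression tends to make the overall audio sound louder.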
Setting Correct Volume Levels

Question: in a scene one actor whispers into the ear of another... what is the audience supposed to hear? The answer is not in the script... it is in the dynamics of the scene.
Question: what is the difference between watching a movie on television and in a theater? Answer: the theater does not have a volume knob. (Actually it does, but you can't touch it...)
An audience is just as annoyed by actors who are too quiet to understand as by an ear-shattering soundtrack that is loud from start to finish.
There needs to be a balance between 'loudness' and 'drama'. The solution is controlling the frequency dynamics as well as the volume arrangement of the soundtrack components.
Volume Dynamics:
Volume is more a factor of human perception than mathematical/technical measurement. Depending on which measure you adhere to, doubling the relative volume of a given sound is an increase of 6dB (a doubling of signal amplitude) or about 10dB (a commonly cited figure for a doubling of perceived loudness).
On a digital peak meter, 0dB (full scale) is the loudest possible volume. Anything above 0dB is distortion. For practical purposes the absolute, drop-dead peak volume of your soundtrack is -0.3dB to -0.1dB. For a maximum, sustained, loud volume -6dB to -3dB is recommended. That is a volume difference of up to about 2x between 'loud' and 'absolute peak', also called 'headroom'.
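The decibel figures above can be turned into plain ratios with a little arithmetic. The sketch below uses the standard 20*log10 relation for signal amplitude and the common '10dB is twice as loud' rule of thumb for perceived loudness; the function names are mine, and the loudness rule is an approximation, not a law.

```python
import math

def db_to_amplitude_ratio(db):
    """Convert a level change in dB to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)

def amplitude_ratio_to_db(ratio):
    """Convert a linear amplitude ratio to a level change in dB."""
    return 20.0 * math.log10(ratio)

def db_to_perceived_loudness_ratio(db):
    """Rule of thumb (an assumption): +10 dB is heard as roughly twice as loud."""
    return 2.0 ** (db / 10.0)

# Headroom between a sustained 'loud' level and the absolute peak.
loud_db = -6.0
peak_db = -0.3
headroom_db = peak_db - loud_db
print(headroom_db, db_to_amplitude_ratio(headroom_db))
```

A 6dB step works out to almost exactly a 2x amplitude ratio, which is where the '2x' headroom figure comes from.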
In theory silence may be infinite, but for an audience it is -40dB to -50dB. This is where the Room-Tone is placed. Avoid 'dead audio' (as if the speakers were unplugged). This effectively kills the audience's 'suspension of disbelief', unless that is what you want to do.
Dialogue belongs in a range of -16dB (soft) to -3dB (shouting). That is a 13dB dynamic range. Generally the dialogue needs to be heard over anything else in the soundtrack. A rule of thumb is 3dB to 6dB louder. Also, an actor's voice/volume between soft & loud needs to be monitored/corrected into a smaller dynamic range. Otherwise the audience will be alternately stunned by the yelling and then reach for the volume knob to understand what else is being said.
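One way to check the '3dB to 6dB louder' rule of thumb is to measure the dialogue and everything underneath it over the same stretch of time and compare the two. The sketch below assumes RMS measurement of float samples (1.0 = full scale); the guide does not say how the levels should be metered, so treat both the method and the function names as my assumptions.

```python
import math

def rms_dbfs(samples):
    """RMS level of float samples (full scale = 1.0), in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)

def dialogue_margin_db(dialogue, bed):
    """How many dB the dialogue sits above the supporting audio (music, sfx, Foley)."""
    return rms_dbfs(dialogue) - rms_dbfs(bed)

def within_rule_of_thumb(margin_db, low_db=3.0, high_db=6.0):
    """True when the dialogue is 3 dB to 6 dB louder than the bed."""
    return low_db <= margin_db <= high_db
```

In practice you would feed in short windows (a line of dialogue at a time) rather than the whole reel, since a whole-movie average hides exactly the moments where the music buries a word.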
Music, sound design, effects and Foley need to support and interact with the dialogue. They nominally work between -36dB and -6dB, a 30dB dynamic range. Music and sound design take on a special role because they can replace dialogue as the principal soundtrack element. They can be as loud or louder than a dialogue sequence.
In short: Volumes for primary audio content span -36dB to -3dB, with a focal point of -10dB. Peak loudness is -1dB. Maximum silence is -50dB of room-tone. It's a good place to start from.
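The -1dB peak figure is easy to verify on a rendered mix. Here is a minimal sketch, assuming float samples where 1.0 is digital full scale; the names are hypothetical and a simple sample peak is used, which can read slightly lower than the 'true peak' some meters report.

```python
import math

PEAK_CEILING_DB = -1.0  # "Peak loudness is -1dB" from the summary above

def peak_dbfs(samples):
    """Highest absolute sample value, in dB relative to full scale."""
    peak = max((abs(s) for s in samples), default=0.0)
    return 20.0 * math.log10(peak) if peak > 0.0 else float("-inf")

def exceeds_ceiling(samples, ceiling_db=PEAK_CEILING_DB):
    """True when the loudest sample is above the chosen peak ceiling."""
    return peak_dbfs(samples) > ceiling_db

def gain_to_ceiling_db(samples, ceiling_db=PEAK_CEILING_DB):
    """Gain (dB) that would move the loudest sample exactly onto the ceiling."""
    return ceiling_db - peak_dbfs(samples)
```

A negative result from gain_to_ceiling_db means the mix has to come down by that many dB; a positive result is how much room is left.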
How-To for Audio Post-Production
Post-Production is a three stage sequence:
First - Visual Edit
Next - Soundtrack Edit, Mix & Mastering
Last - Soundtrack Layback into the final movie
The Visual Edit
1) Edit the movie scenes to a 'beat':
Use a click-track, metronome, temp-music, whatever it takes to give the editing a 'musical' timeline to work from. This is particularly important when editing a scene where there is no source audio (i.e. dialogue). Document your editing.
2) Treat the dialogue with great care:
Do Not Lose SYNC! Save off all dialogue takes for a scene into separate audio files. You will need these later to replace/repair damaged dialogue elements.
3) Make separate Room-Tone audio files for each scene:
1+ minute of room-tone will help blend different dialogue takes and camera cuts to sound as if they are in the same 'room'. Room-Tone tracks should have been acquired during shooting. If not, the next best thing is to find all the Room-Tone sections for a given scene, combine them and render as one continuous audio file.
4) Some dialogue cannot be fully repaired:
Bad dialogue comes in many forms: poor mic placement, soft-spoken actor, room noises that are too loud, background sounds that periodically interfere. If damaged dialogue is beyond hope, you either replace it (ADR) or cut the scene to hide/eliminate it. Using audio noise-reduction software can get good results, but it cannot fix everything.
5) When the visual editing is completed, 'Lock' the movie footage:
No further film/video editing that would affect the timeline or lip-sync is allowed. Do not re-edit a scene once it is committed to audio post-production. Be aware that doing so will jeopardize, or delay, the music, sfx and Foley already completed for the original scene.
Important: Burn a time-code onto the Locked movie, starting at 01:00:00;00 (H:M:S;F).
This is now the authoritative reference for all audio editing, scene matching & sync.
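For reference, this is roughly how a burned-in time-code relates to the picture frame count. The sketch assumes 29.97fps drop-frame time-code (which is what the semicolon in 01:00:00;00 usually indicates) and uses the widely published drop-frame arithmetic; the function names are hypothetical, and other frame rates need different constants.

```python
FRAMES_PER_10_MIN = 17982  # frames in 10 minutes of 29.97 fps drop-frame video
FRAMES_PER_MIN = 1798      # frames in a minute where two frame numbers are skipped
ONE_HOUR_OFFSET = 6 * FRAMES_PER_10_MIN  # frame count that reads 01:00:00;00

def drop_frame_timecode(frame_number):
    """Convert a 0-based frame count to an H:M:S;F drop-frame time-code string."""
    tens, rest = divmod(frame_number, FRAMES_PER_10_MIN)
    # Two frame numbers are skipped every minute, except every tenth minute.
    skipped = 18 * tens
    if rest > 2:
        skipped += 2 * ((rest - 2) // FRAMES_PER_MIN)
    n = frame_number + skipped
    frames = n % 30
    seconds = (n // 30) % 60
    minutes = (n // 1800) % 60
    hours = n // 108000
    return f"{hours:02d}:{minutes:02d}:{seconds:02d};{frames:02d}"

def burn_in_timecode(picture_frame):
    """Time-code shown on a given picture frame when the burn-in starts at one hour."""
    return drop_frame_timecode(ONE_HOUR_OFFSET + picture_frame)
```

Only the frame numbers are skipped, never the frames themselves, so the picture and audio lengths are unaffected; the skipping just keeps the displayed time in step with the clock on the wall.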
6) Preparation to hand the project over to Audio Post-Production:
* The 'Locked' movie footage with time-code - a preview quality render is usually okay
* The dialogue as separate files, sync'ed to the Locked movie footage
* The alternate dialogue takes (files) to assist in dialogue repair, as appropriate
* The Room-Tone files for each scene, as appropriate
* Any practical Foley elements (files) that are unique to the movie action and place
If the Locked movie footage is 90 min, 10 sec and 13 frames, then export all audio files with the same, exact time length, including any silent/muted sections.
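'Same, exact time length' can be checked down to the audio sample. The sketch below assumes the 90 min, 10 sec, 13 frames figure is a straight (non-drop) frame count at a nominal 24 frames per second, with the footage actually running at 23.976fps (24000/1001) and audio at 48kHz; swap in your own numbers. The helper names are mine.

```python
from fractions import Fraction

def total_frames(minutes, seconds, frames, nominal_fps):
    """Frame count for a length given as minutes, seconds and leftover frames."""
    return (minutes * 60 + seconds) * nominal_fps + frames

def samples_for_frames(frame_count, frame_rate, sample_rate=48000):
    """Number of audio samples that lasts exactly as long as frame_count frames."""
    duration_seconds = Fraction(frame_count) / Fraction(frame_rate)
    return round(duration_seconds * sample_rate)

frame_count = total_frames(90, 10, 13, nominal_fps=24)
true_rate = Fraction(24000, 1001)  # 23.976 fps expressed exactly
print(frame_count, samples_for_frames(frame_count, true_rate))
```

Using exact fractions avoids the slow drift you get from typing 23.976 as a decimal. Every exported file should report this same sample count.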
The timecoded reference footage must match your movie's exact format: 29.97fps | 23.976fps (2-3-pulldown) | 60fps | etc.
Export audio tracks to an uncompressed, lossless file format, typically .AIF or .WAV at 48kHz, 24 bit. The bare minimum format is 44.1 kHz, 16 bit (CD quality).
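Before handing files over, it is worth confirming that each export really is in the format you think it is. This is a small sketch using Python's standard-library wave module, which reads plain PCM .WAV files (it does not read .AIF, and some extended WAV variants may be rejected); the function names and the dictionary layout are my own.

```python
import wave

def describe_wav(path):
    """Read the basic format details of a PCM .WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return {
            "sample_rate": rate,                     # e.g. 48000
            "bit_depth": wav.getsampwidth() * 8,     # e.g. 24
            "channels": wav.getnchannels(),          # 1 = mono, 2 = stereo
            "frames": frames,                        # length in samples per channel
            "seconds": frames / rate,
        }

def meets_delivery_spec(info, min_rate=44100, min_bits=16):
    """True when the file is at least the bare minimum format (44.1 kHz, 16 bit)."""
    return info["sample_rate"] >= min_rate and info["bit_depth"] >= min_bits
```

Comparing the "frames" value across all exported files is also a quick way to confirm they share the same exact length.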
The Soundtrack Edit & Mix
7) Give a clear description/overview of the soundtrack dynamics:
Your movie should have some silent sections (i.e. room-tone). They are just as important as any loud action sounds or scene-filling music. Have differing styles of music for scene and character development, as well as the overall movie genre or theme.
8) Editing the soundtrack is just as involved as editing the visuals:
The soundtrack can make or break a movie. Provide it with the same consideration and time to develop as was given to the visual editing stage.
9) Be prepared to make some tough decisions:
The soundtrack can be anything you want it to be. You as filmmaker/director are the final arbiter as to what is included in the soundtrack, where and how much. Ask for the best recommendations from the audio-post team. Finally, make your decision and go with it.
10) Preparation to hand the project over to Audio Layback:
* The 'Locked' movie footage - with the mastered stereo soundtrack as reference
* The final Dialogue, Foley, Music, SFX and Background tracks as separate files
* The final mastered Dialogue and M&E tracks as separate files
* The final mastered stereo soundtrack
* The final mastered mono soundtrack, as appropriate
* The final mastered surround soundtrack, as appropriate
The Soundtrack Layback
11) Performing Audio Layback:
The soundtrack will return to your editing workstation as multiple sync'ed files. Use the Locked movie footage as a reference. Use your choice of the mastered audio files for direct Layback into your timeline. Keep the other files for future re-editing and/or re-mastering (i.e. surround mix).
Since the audio files are the exact length of the Locked movie, they will match the final movie footage & keep in sync from start to finish.
12) Render the final movie, and listen carefully:
If you do not have a 'cinematic' or 'near field monitor' playback system hooked up to your editing workstation, use quality headphones. Also view/analyze the final movie on more than one playback device: home-theater, portable DVD player, different computers, etc.
The mastered files are volume adjusted to match from scene to scene. You can change any volume levels during preparation for the final render but do so with great care: validate your playback changes on several systems before ordering 1000 DVD copies from your replicator.
13) What to do if you really need to re-edit a Locked scene:
Re-editing a scene at this point can be a fairly straightforward decision...
A) if the dialogue does not lose lip-sync and the sfx/Foley/music remain sync'ed to the action, you can make 'visual/cosmetic' changes as needed. Be aware that a seemingly simple edit, such as switching scene order, can cause havoc/mismatch on the music continuity, for example. This can be very apparent during scene segues.
B) go back to the audio-post team with the newly re-edited scene and order-up a set of replacement soundtrack files for it;
C) use the multiple (un-mastered) audio tracks to create a new soundtrack for the scene. Remember this scene replacement soundtrack mix is not Mastered. That scene could sound 'different' than the rest of the movie. The fix is to re-master that new soundtrack to match in/out segues, volume and aesthetic continuity.
D) the toughest decision of all: does the scene really need the changes you have in mind, or are you obsessing on a minor detail? Remember: You made great decisions in order to get your movie to this final release state. You as filmmaker/director will always notice 'flaws' that no one else will see.
Summary
Your soundtrack informs the audience about emotions, pace, actions, tenderness, danger, thrills, suspense... Put as much effort into your soundtrack as you did when visually editing the movie. The soundtrack you use can make or break your movie.
Frequency Dynamics:

Frequency Dynamics, like Volume, is more a factor of human perception than mathematical/technical measurement.
Compare male & female human voices. They use two separate frequency (tonal) ranges. What is naturally observed is that high frequencies will appear louder than low frequencies with both set at the same volume level. High frequencies are more directional in a stereo field. Low frequency sounds seem to emanate from all directions and are felt more than they are heard.
Whatever is sonically happening around your dialogue should not come down in volume every time an actor speaks or takes a breath. That is an extreme example of 'volume-ducking'. Unless it is an effect you are after, it sounds mechanical and unnatural. The same principle applied as frequency-ducking gives a natural sounding balance between the dialogue and what's around it. The static version of this is "cutting a frequency hole" into the supporting audio content. Again, if frequency-ducking is pushed to the extreme, 'audio phasing' and other artifacts appear.
Think of your soundtrack as one long song. Dialogue, Foley, Music, Sound Design (and even Room-Tone) need to play in the same frequency space and still deliver their distinct qualities.
The sub-woofer effect, i.e. too much bass for too much time, can be detrimental. Its quality of boosting the 'gut' emotions can also obscure the audio content trying to play alongside it. Music/sound is energy. Bass sounds require the most energy in what is a finite audio spectrum. Unless your movie is a music video, use very low bass frequencies to punctuate, rather than dominate, the rest of your soundtrack.