Clean video audio (dialogue)
A dialogue-first workflow: reduce background noise without smearing consonants or introducing pumping.
What you’re optimizing for
| Goal | What it means | Common mistake |
|---|---|---|
| Intelligibility | Consonants and word onsets remain crisp (speech band ~1–4 kHz stays clean) | Over-denoising smears consonants |
| Stable noise floor | Background level drops without audible “breathing” between phrases | Hard gating or aggressive thresholds |
| Natural voice tone | Speaker still sounds like themselves | Over-processing creates a metallic tone |
Typical video noise sources
- wind (low-frequency bursts, broad turbulence)
- traffic and ambience (non-stationary)
- HVAC and fans (steady broadband)
- camera handling and clothing rustle (transients)
- room echo (reverb tail)
80/20 workflow
- Pick your “dialogue anchor”: find a representative section with speech + noise.
- Fix obvious hum first: removing tonal problems before denoising reduces artifacts later in the chain.
- Denoise conservatively: target the noise floor, not the voice.
- Check pumping: listen between words and at sentence ends.
- Deliver a clean file: export a dialogue track ready for your editor.
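As a rough illustration of the "fix obvious hum" step, here is a minimal sketch of a narrow notch filter in plain Python. It assumes 50 Hz mains hum, a Q of 30, and float samples in a list; the function names (`notch_coefficients`, `remove_hum`, `tone_level`) are hypothetical, and the coefficients follow the widely used RBJ Audio EQ Cookbook notch form rather than any specific tool's method.

```python
import math


def notch_coefficients(hum_hz, sample_rate, q=30.0):
    """Return normalized biquad notch coefficients (RBJ cookbook form)."""
    w0 = 2.0 * math.pi * hum_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def remove_hum(samples, hum_hz, sample_rate, q=30.0):
    """Filter float samples with a narrow notch centered on hum_hz."""
    (b0, b1, b2), (a1, a2) = notch_coefficients(hum_hz, sample_rate, q)
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        # Direct form I difference equation.
        y0 = b0 * x0 + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def tone_level(samples, freq_hz, sample_rate):
    """Estimate the amplitude of one sine component by correlation."""
    re = sum(s * math.cos(2.0 * math.pi * freq_hz * n / sample_rate)
             for n, s in enumerate(samples))
    im = sum(s * math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
             for n, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / len(samples)


# Synthetic stand-in for a clip: a 1 kHz "voice" tone plus 50 Hz hum.
fs = 8000
clip = [0.5 * math.sin(2.0 * math.pi * 1000 * n / fs)
        + 0.3 * math.sin(2.0 * math.pi * 50 * n / fs)
        for n in range(2 * fs)]
cleaned = remove_hum(clip, hum_hz=50.0, sample_rate=fs)

# Measure over the second half, after the filter has settled.
hum_after = tone_level(cleaned[fs:], 50.0, fs)
voice_after = tone_level(cleaned[fs:], 1000.0, fs)
print(f"hum: {hum_after:.4f}  voice: {voice_after:.4f}")
```

Real hum often carries harmonics (100 Hz, 150 Hz, and so on for 50 Hz mains), so in practice you may need one notch per audible harmonic. Keep the notches narrow so they do not thin out the voice.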
Dialogue clarity: what to preserve
- Consonant edges: “t/k/p/s/f” drive understanding, especially on phone speakers.
- Breaths: removing all breaths can create unnatural gating and timing issues.
- Room tone: a small stable background can sound more natural than dead silence.
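One common way to keep a small, stable background (an assumption here, not something this guide prescribes) is to mix a scaled copy of the removed noise back into the denoised track. The sketch below assumes both tracks are time-aligned lists of float samples; `blend_room_tone` and `residual_db` are hypothetical names.

```python
def blend_room_tone(original, denoised, residual_db=-12.0):
    """Add back a scaled copy of what the denoiser removed.

    residual_db is how far below its original level the removed
    noise is kept (0 dB restores the original, very negative values
    approach the fully denoised track).
    """
    if len(original) != len(denoised):
        raise ValueError("tracks must be the same length and time-aligned")
    keep = 10.0 ** (residual_db / 20.0)
    # (o - d) is the component the denoiser removed.
    return [d + keep * (o - d) for o, d in zip(original, denoised)]


# Toy example: the "denoiser" removed a constant 0.1 offset.
original = [0.6, -0.4, 0.1]
denoised = [0.5, -0.5, 0.0]
mixed = blend_room_tone(original, denoised, residual_db=-20.0)
print(mixed)
```

This only works when the denoised track has no added latency relative to the original; if it does, compensate for the delay before blending.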
Avoid the big 3 video-audio cleanup failures
1) Pumping between phrases
- Reduce aggressiveness; allow a stable residual room tone.
- Non-stationary ambience won’t vanish completely; optimize for intelligibility.
2) “Metallic” dialogue
- Back off the denoise strength, or use several light passes instead of one heavy pass.
- Ensure sibilance is not treated as hiss/noise.
3) Echo becomes more obvious
- Echo is not background noise. If reverb dominates, improve capture or use dedicated de-reverb.
- Over-denoising can make reverb tails stand out: the noise that masked them is reduced, but the echo remains.
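Listening is the real test for pumping, but a rough numeric proxy can help flag clips worth a second listen. The sketch below measures short-frame RMS levels and reports how much the level varies across the quiet frames between phrases. The frame length, the -35 dB speech threshold, and the function names are all assumptions for illustration.

```python
import math


def frame_levels_db(samples, frame_len):
    """RMS level in dBFS for each non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / frame_len)
        levels.append(20.0 * math.log10(max(rms, 1e-9)))
    return levels


def gap_floor_spread_db(levels_db, speech_threshold_db=-35.0):
    """Spread (max - min) of frame levels below the speech threshold."""
    gaps = [lv for lv in levels_db if lv < speech_threshold_db]
    if len(gaps) < 2:
        return 0.0
    return max(gaps) - min(gaps)


def square_burst(amplitude, length):
    """Synthetic signal whose RMS equals its amplitude."""
    return [amplitude if n % 2 == 0 else -amplitude for n in range(length)]


FRAME = 480  # 10 ms at 48 kHz (assumed)
steady, pumping = [], []
for gap_amp in (0.003, 0.0003, 0.003, 0.0003):
    # "Speech" burst followed by a gap; the steady clip keeps one floor level.
    steady += square_burst(0.3, FRAME) + square_burst(0.003, FRAME)
    pumping += square_burst(0.3, FRAME) + square_burst(gap_amp, FRAME)

steady_spread = gap_floor_spread_db(frame_levels_db(steady, FRAME))
pumping_spread = gap_floor_spread_db(frame_levels_db(pumping, FRAME))
print(f"steady: {steady_spread:.1f} dB  pumping: {pumping_spread:.1f} dB")
```

A large spread suggests the noise floor is moving between phrases. A small spread does not prove the clip is clean, so still listen between words and at sentence ends.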
Delivery checklist (for editors)
- Consistent loudness: speech level stable across clips (after cleanup).
- No clipped peaks: clipped dialogue stays distorted after denoising.
- Scene changes: if the background changes, treat segments separately.
- Monitor small speakers: phone/laptop speakers reveal intelligibility issues.
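Two of the checklist items can be spot-checked with a few lines of code. This is a minimal sketch under stated assumptions: samples are floats in the range -1.0 to 1.0, a run of three or more samples at the ceiling counts as clipping, and plain RMS is used as a rough stand-in for loudness (delivery specs usually call for a proper loudness measure such as LUFS). All names and thresholds are hypothetical.

```python
import math


def count_clipped_runs(samples, ceiling=0.999, min_run=3):
    """Count runs of at least min_run consecutive samples at the ceiling."""
    runs = 0
    run_len = 0
    for s in samples:
        if abs(s) >= ceiling:
            run_len += 1
            if run_len == min_run:  # count each run once
                runs += 1
        else:
            run_len = 0
    return runs


def rms_dbfs(samples):
    """RMS level in dBFS, floored to avoid log of zero."""
    if not samples:
        return -180.0
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(max(rms, 1e-9))


def level_spread_db(clips):
    """Difference between the loudest and quietest clip, in dB."""
    if not clips:
        return 0.0
    levels = [rms_dbfs(c) for c in clips]
    return max(levels) - min(levels)


clip_a = [0.1, -0.1] * 100
clip_b = [0.2, -0.2] * 100
damaged = [0.1, 1.0, 1.0, 1.0, 0.2, -1.0, -1.0, -1.0, -1.0, 0.0, 1.0, 0.3]

print("clipped runs:", count_clipped_runs(damaged))
print("level spread (dB):", round(level_spread_db([clip_a, clip_b]), 2))
```

Run the level check per segment when the background changes between scenes, so that each segment is compared against clips with similar conditions.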
FAQ
Can I fully remove crowd noise from video?
Crowd noise is non-stationary and overlaps speech. Aim to reduce masking and make the dialogue dominant; removing all background is rarely possible without artifacts.
Should I denoise before editing?
Often yes: cleaning early helps you make better editorial decisions. For complex timelines, denoise key dialogue tracks and leave music/effects untouched.