Encoding specifications for music videos

The features described in this article are available only to partners who use YouTube's Content ID matching system.

The media files that you deliver to YouTube for music videos must conform to these specifications.

Audio Profile

Attribute           Specification
Codec               FLAC or Linear PCM
Sample rate         44.1 kHz recommended. Higher sample rates (for example, 48 kHz or 96 kHz) are accepted but not required.
Bit depth           24-bit recommended; 16-bit acceptable
Channels            2 (stereo)

Although it is not recommended, YouTube accepts compressed audio. YouTube transcodes from the delivered format, so audio quality is much better when the source is lossless than when a lossy format is compressed a second time.

If you must deliver compressed audio, use these specifications:

  • Codec: AAC-LC
  • Sample rate: 44.1 kHz
  • Bit rate: 320 kbps or higher for 2 channels (higher is always better; 256 kbps acceptable)
  • Channels: 2 (stereo)
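
As a rough illustration, the audio rules above can be expressed as a small check. This is a sketch only: the AudioInfo class, the classify_audio function, and the three result labels are hypothetical names, and treating sample rates below 44.1 kHz as nonconforming is an assumption that the specification does not state.

```python
from dataclasses import dataclass
from typing import Optional

LOSSLESS_CODECS = {"flac", "pcm"}


@dataclass
class AudioInfo:
    """Hypothetical container for the audio properties of a delivery."""
    codec: str                          # e.g. "flac", "pcm", "aac-lc"
    sample_rate_hz: int
    channels: int
    bit_depth: Optional[int] = None     # lossless deliveries only
    bitrate_kbps: Optional[int] = None  # compressed deliveries only


def classify_audio(a: AudioInfo) -> str:
    """Return 'recommended', 'acceptable', or 'nonconforming'."""
    codec = a.codec.lower()
    if a.channels != 2:
        return "nonconforming"
    if codec in LOSSLESS_CODECS:
        # Assumption: anything below 44.1 kHz is treated as nonconforming.
        if a.sample_rate_hz < 44100 or a.bit_depth not in (16, 24):
            return "nonconforming"
        return "recommended" if a.bit_depth == 24 else "acceptable"
    if codec == "aac-lc":
        # Compressed audio is accepted but never recommended.
        if a.sample_rate_hz != 44100 or (a.bitrate_kbps or 0) < 256:
            return "nonconforming"
        return "acceptable"
    return "nonconforming"


print(classify_audio(AudioInfo("flac", 44100, 2, bit_depth=24)))
print(classify_audio(AudioInfo("aac-lc", 44100, 2, bitrate_kbps=320)))
```

The first call describes a 24-bit FLAC delivery and the second a 320 kbps AAC-LC delivery; the second can be "acceptable" at best because lossless audio is always preferred.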

Video option 1: H.264 Codec

Video Profile

Attribute           Specification
Containers          .mp4, .mov
Codec               H.264
Profile             High
Frame rate          23.98, 24, 25, 29.97 or 30 fps
Bitrate             SD (fewer than 720 lines) – 15 Mbps
                    720 lines – 50 Mbps
                    1080 lines – 60 Mbps
Resolutions         1.33 (4:3) – 720x480, 1440x1080, 720x576 (PAL)
                    1.78 (16:9) – 720x404, 720x576 (PAL 16:9), 854x480, 1280x720, 1920x1080
                    (Cropped content that is wider than 1.78 may deviate from these resolutions; see the notes about matting for non-standard aspect ratios.)
Pixel Aspect Ratio  SD: Anamorphic (non-square) pixels are accepted, but the Pixel Aspect Ratio (pasp) flag must be set to 16:9 or 4:3.
                    HD: Square pixels only (no anamorphic content).
Scan type           Progressive
                    Native frame rate content should be deinterlaced.
                    Telecined content should be inverse telecined to the original frame rate.
                    Note: Content with blended frames or interlacing artefacts will be rejected.
GOP Structure       IBBP (M=3, GOP length in frames not to exceed ½ of the frame rate)
Colour Space        4:2:2 (preferred) or 4:2:0
Matting             A 16:9 frame size delivered with letterboxing will be accepted. If content contains pillarboxing (black on left and right) or windowboxing (black on all sides), or is 4:3 letterboxed (LTBX), it should be cropped to the active pixel area only.
Notes               Edit lists are not allowed, as they cause loss of A/V sync.
                    The moov atom must be present and at the front of the file.
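
The moov atom requirement in the notes above can be checked before delivery by walking the top-level boxes of the file. The following is a minimal sketch, not an official validation tool: the function names are hypothetical, and "at the front of the file" is interpreted here as "the moov box appears before any mdat box", which is an assumption.

```python
import io
import struct
from typing import BinaryIO, List


def top_level_boxes(f: BinaryIO) -> List[str]:
    """Return the four-character types of the top-level boxes, in file order."""
    types: List[str] = []
    while True:
        start = f.tell()
        header = f.read(8)
        if len(header) < 8:
            break  # end of file (or a truncated header)
        size, box_type = struct.unpack(">I4s", header)
        header_len = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            ext = f.read(8)
            if len(ext) < 8:
                break
            size = struct.unpack(">Q", ext)[0]
            header_len = 16
        if size != 0 and size < header_len:
            break  # malformed box; stop scanning
        types.append(box_type.decode("latin-1"))
        if size == 0:
            break  # a size of 0 means the box runs to the end of the file
        f.seek(start + size)
    return types


def moov_at_front(f: BinaryIO) -> bool:
    """True if a moov box exists and precedes any mdat box (assumed meaning of 'front')."""
    boxes = top_level_boxes(f)
    if "moov" not in boxes:
        return False
    return "mdat" not in boxes or boxes.index("moov") < boxes.index("mdat")


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Hypothetical helper: build a minimal box for the demo below."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic file layout: ftyp, then moov, then mdat.
sample = make_box(b"ftyp", b"isom") + make_box(b"moov", b"") + make_box(b"mdat", b"\x00" * 16)
print(moov_at_front(io.BytesIO(sample)))  # True
```

For a real delivery, the same function can be called on a file opened in binary mode, for example with open(path, "rb"). Only box headers are read, so the check stays fast even for large files.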

Video option 2: MPEG-2 Transport Stream

Video Profile

Attribute           Specification
Containers          MPEG-2 Transport Stream (.mpg, .mpeg, .ts)
Codec               MPEG-2
Profile             SD: Main@Main
                    HD: 422@High
Frame rate          23.98, 24, 25, 29.97 or 30 fps
Bitrate             SD (fewer than 720 lines): 50 Mbps
                    HD (720 lines or higher): 80 Mbps
Resolutions         1.33 (4:3) – 720x480, 720x576 (PAL only), 1440x1080
                    1.78 (16:9) – 720x404, 720x576 (PAL 16:9 only with anamorphic flag set), 854x480, 1280x720, 1920x1080
                    (Cropped content that is wider than 1.78 may deviate from these resolutions; see the notes about matting for non-standard aspect ratios.)
Pixel Aspect Ratio  Square pixels only (no anamorphic content).
Scan type           Progressive
                    Native frame rate content should be deinterlaced.
                    Telecined content should be inverse telecined to the original frame rate.
                    Note: Content with blended frames or interlacing artefacts will be rejected.
GOP Structure       IBBP (M=3, GOP length in frames not to exceed ½ of the frame rate)
Colour Space        4:2:2 (preferred). If 4:2:2 colour space is not available, use 4:2:0.
Matting             A 16:9 frame size delivered with letterboxing will be accepted. If content contains pillarboxing (black on left and right) or windowboxing (black on all sides), or is 4:3 letterboxed (LTBX), it should be cropped to the active pixel area only.
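
The bitrate and GOP rules in both video options reduce to simple lookups, sketched below. The function names and the codec labels "h264" and "mpeg2" are hypothetical. Two assumptions go beyond the tables: H.264 content with between 720 and 1080 lines is given the 720-line bitrate, and the GOP limit is read literally as the frame rate divided by two, rounded down to a whole number of frames.

```python
import math


def target_bitrate_mbps(codec: str, lines: int) -> int:
    """Look up the bitrate (in Mbps) for a codec and vertical resolution."""
    codec = codec.lower()
    if codec == "h264":
        if lines < 720:
            return 15  # SD
        if lines < 1080:
            return 50  # 720 lines (assumed to cover heights up to 1079)
        return 60      # 1080 lines
    if codec == "mpeg2":
        return 50 if lines < 720 else 80  # SD vs HD
    raise ValueError(f"unsupported codec: {codec!r}")


def max_gop_length(frame_rate: float) -> int:
    """Largest whole number of frames that does not exceed half the frame rate."""
    return math.floor(frame_rate / 2)


for fps in (23.98, 24, 25, 29.97, 30):
    print(fps, max_gop_length(fps))
print(target_bitrate_mbps("h264", 1080), target_bitrate_mbps("mpeg2", 480))
```

Under this literal reading, a 25 fps encode would use a GOP of at most 12 frames, which is four repetitions of a three-frame IBBP pattern.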