Use spatial audio in 360-degree and VR videos

You can experience a video's sound in all directions, just like real life, with YouTube spatial audio. Use it to take your spherical (360° and virtual reality) videos to the next level so that viewers can immerse themselves in your content.
You can only use spatial audio for 360-degree and virtual reality (VR) videos.

Learn how to upload 360-degree videos and virtual reality videos on YouTube.

Spatial audio listening experience

YouTube supports two spatial audio formats:

  • First Order Ambisonics consists entirely of spatialized audio, meaning that all of the sound in the video responds to where the viewer is looking. YouTube decodes the Ambisonic soundtrack to binaural stereo using Head-Related Transfer Functions (HRTFs).
  • First Order Ambisonics & Head-Locked Stereo consists of a mix of spatialized audio along with head-locked stereo audio. Head-Locked audio bypasses the Ambisonic decoder / binaural renderer and does not change when a user turns their head. Head-Locked stereo is often used for narration or background music.
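To illustrate the difference between the two kinds of audio, here is a minimal sketch of how a player might treat one audio frame when the viewer turns their head. It is not YouTube's renderer: it only counter-rotates the Ambisonic components around the vertical axis and passes the head-locked pair through untouched. The function names and the sign convention (yaw is positive when the viewer turns left) are assumptions made for this example, and the binaural HRTF stage is left out.

```python
import math


def rotate_foa_yaw(frame, head_yaw_rad):
    """Counter-rotate one FOA frame (W, Y, Z, X in ACN order) for a head turn.

    head_yaw_rad is the viewer's yaw in radians, positive when turning left
    (counterclockwise seen from above). This convention is an assumption.
    """
    w, y, z, x = frame
    c, s = math.cos(head_yaw_rad), math.sin(head_yaw_rad)
    # A source at azimuth phi is heard at phi - yaw after the head turns.
    x_rot = x * c + y * s
    y_rot = y * c - x * s
    # W (omnidirectional) and Z (up/down) are unaffected by a pure yaw.
    return (w, y_rot, z, x_rot)


def render_frame(frame6, head_yaw_rad):
    """Split a 6-channel frame (W, Y, Z, X, L, R) into its two parts.

    Only the FOA part follows the head; the head-locked L/R pair is returned
    unchanged, which is what "head-locked" means.
    """
    foa, head_locked = frame6[:4], frame6[4:]
    return rotate_foa_yaw(foa, head_yaw_rad), tuple(head_locked)
```

For a source straight ahead (X = 1, Y = 0), a 90-degree turn to the left moves the energy into negative Y, that is, to the viewer's right, while the L and R values stay the same.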

YouTube's different spatial audio features can be experienced on the following platforms:

[Table: Spatial Audio Format (First Order Ambisonics; First Order Ambisonics & Head-Locked Stereo) by platform (Chrome, Opera, Edge, Firefox, Android)]

Note: On platforms that don't support head-locked stereo, YouTube will down-mix the head-locked portion of the audio track to the omni-directional component (W) of the First Order Ambisonics.
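The down-mix described in the note can be pictured with a small sketch. The function name and the gain of 0.5 (a plain average of L and R) are assumptions for illustration only; the gain YouTube actually applies is not stated here.

```python
def downmix_head_locked_to_w(frame6, gain=0.5):
    """Fold head-locked L/R into W for one frame ordered W, Y, Z, X, L, R.

    gain=0.5 is an assumed value, not a documented one.
    Returns a 4-channel FOA frame in the same W, Y, Z, X order.
    """
    w, y, z, x, left, right = frame6
    # Only the omnidirectional component changes; Y, Z, X keep their values,
    # so the folded-in audio carries no direction of its own.
    return (w + gain * (left + right), y, z, x)
```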

Video upload requirements for spatial audio

YouTube’s spatial audio specification defines all supported layouts and orderings, but make sure you follow these minimum requirements when using spatial audio:

  • Content you upload should follow YouTube specifications. YouTube supports the following spatial audio format types:
    • First Order Ambisonic (FOA)
      • ACN channel ordering
      • SN3D normalization
      • The 4 FOA components should be ordered as W, Y, Z, X as a 4-channel audio track in your uploaded file
    • First Order Ambisonics (FOA) with head-locked stereo.
      • ACN channel ordering
      • SN3D normalization
      • The 4 FOA components with head-locked stereo audio [L, R] should have the ordering W, Y, Z, X, L, R as a 6-channel audio track in your uploaded file.
    • For each case defined above, the metadata tool should be used to automatically insert Spatial Audio metadata into the file prior to uploading. The metadata is necessary to enable YouTube to identify your file as containing spatial audio. If your post-production tools already mark metadata per the YouTube spec, you don't need to use the metadata tool.
  • MP4 files with AAC encoded audio are supported.
  • MOV files with AAC encoded audio or PCM encoded audio are also supported.
  • AAC sample rates and bitrates should use YouTube encoding recommendations. AAC sample rates should be 96 kHz or 48 kHz and bitrates should be 512 kbps. For best quality, we recommend that you use an uncompressed PCM audio format.
  • Only one audio track is supported. Multiple audio tracks, such as tracks with spatial and stereo/mono in the same file, are not supported.
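As a rough pre-upload check, the requirements above can be expressed as a small validation routine. This is a sketch, not an official tool: the `AudioTrack` type and `check_spatial_audio` function are hypothetical names, and you would fill in the track details from whatever media inspection tool you already use. It does not check ACN ordering, SN3D normalization, or the presence of spatial audio metadata, since those cannot be inferred from the channel count alone.

```python
from dataclasses import dataclass


@dataclass
class AudioTrack:
    """Hypothetical description of one audio track in the file."""
    codec: str           # "aac" or "pcm"
    channels: int
    sample_rate_hz: int


# Container -> audio codecs listed as supported in the requirements above.
ALLOWED_CODECS = {"mp4": {"aac"}, "mov": {"aac", "pcm"}}

# Channel count -> expected channel order in the uploaded track.
CHANNEL_LAYOUTS = {
    4: ["W", "Y", "Z", "X"],
    6: ["W", "Y", "Z", "X", "L", "R"],
}


def check_spatial_audio(container, tracks):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if container not in ALLOWED_CODECS:
        problems.append(f"unsupported container: {container}")
    if len(tracks) != 1:
        problems.append(f"exactly one audio track is required, found {len(tracks)}")
    for i, track in enumerate(tracks):
        if track.channels not in CHANNEL_LAYOUTS:
            problems.append(f"track {i}: expected 4 or 6 channels, got {track.channels}")
        allowed = ALLOWED_CODECS.get(container)
        if allowed is not None and track.codec not in allowed:
            problems.append(f"track {i}: codec {track.codec} not supported in {container}")
        if track.codec == "aac" and track.sample_rate_hz not in (48000, 96000):
            problems.append(f"track {i}: AAC sample rate should be 48 kHz or 96 kHz")
    return problems
```

For example, `check_spatial_audio("mov", [AudioTrack("pcm", 6, 48000)])` returns an empty list, while a stereo PCM track in an MP4 file is flagged for both its channel count and its codec.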

Upload videos with spatial audio

  1. Create a 360-degree video with spatial audio following the video requirements. Learn how to upload 360-degree videos.
  2. Run the metadata tool on the video (we recommend downloading the latest version).
  3. Upload the video to YouTube.
Note: Don't use the YouTube video editor since it's not yet supported on 360-degree and VR videos.

Preview spatial audio on VR videos

You can use the Resonance Audio Monitor VST plugin to preview the spatial audio on your VR videos before uploading them. The plugin works with any Digital Audio Workstation that can render 4-6 channel audio files and host VST plugins. Learn more about using Resonance Audio Monitor.
