Android Audio's 10 Millisecond Problem: The Android Audio Path Latency Explainer

Gabor Szanto

[For practical tips on dealing with latency, Android developers please also see Android Audio Low Latency Primer.]

[UPDATED AS OF JUNE 2016: Please see Android's 10 ms Problem? SOLVED.]

[UPDATED AS OF JANUARY 2016: Rebooting Android's 10 Millisecond Problem: Audio Latency Improvements in Android 6.0 Marshmallow]

Many mobile apps that are critically dependent on low latency audio functionality such as some games, synthesizers, DAWs (Digital Audio Workstations), interactive audio apps and virtual instrument apps, and the coming wave of virtual reality apps, all of which thrive on Apple's platform (App Store + iOS devices) --- and generate big revenues for App Store and iOS developers are largely non-existent on Android.

Android Audio's 10 Millisecond Problem, a little understood yet extremely difficult technical challenge with enormous ramifications, prevents these sorts of revenue producing apps from performing in an acceptable manner and even being published (!) on Android at this point in time.

Startups and developers are unwilling to port and publish otherwise successful iOS apps (with ~10 ms audio latency needs) on Android for fear of degraded audio performance resulting in negative word-of-mouth and a hit to their professional reputation and brand.

Consumers lose because have a strong desire to buy such apps on Android, as shown by revenue data on iOS, and currently, are unable to do so. One can appreciate the scale of this problem/opportunity when one takes into account the so-called 'next billion' consumers who will be 'mobile-only'.

We want to solve this. This explainer provides an easily understood overview of the Android 10 Millisecond Problem with actual latency data from the Google Nexus 9.

Contents

How Android's 10 Millisecond Problem and Android Audio Path Latency Impacts App Developers and Android OEMs

Even though music apps make up only 3% of all downloads in the iOS App Store, the Music app category is the 3rd highest revenue generating app category after Games and Social Networking. Which suggests that music apps monetize disproportionately well on platforms that offer low latency performance such as the App Store/iOS devices.

On Android, it is a different story. In the Google Play store, the Music category is not even a top five revenue producing app category.

The overwhelming majority of Android devices suffer from too high audio latency, preventing developers from building apps that would satisfy consumer demand on Android.

As such, Google and Android app developers are leaving billions of dollars on the table for Apple and iOS developers because of Android's 10 Millisecond Problem.

For the purposes of this explainer, roundtrip audio latency is simply the difference in time between when an audio input is introduced into a mobile device, undergo some sort of needed processing, and exits the same device. As any musician will tell you, we as humans are most comfortable with latencies of ~10 milliseconds. Anything significantly higher tends to disturb us.

Most Android apps have more than 100 ms of audio output latency, and more than 200 ms of round-trip (audio input to audio output) latency. To give you a quick example from the Oscar winning film Whiplash, it’s like the drummer is dragging by a half beat behind the band!

Some specific examples on how audio related applications suffer from roundtrip audio latency greater than ~10 ms:

In order to educate and inform tech industry leaders, app developers, technologists, product managers, executives, journalists, entrepreneurs, musicians, gamers and investors about the scope and ramifications of Android's 10 Millisecond Problem, one whose existence that no one benefits from, we at Superpowered have developed the explainer you are reading right now to provide an easily digestible overview of the entire Android audio chain and potential bottlenecks.

Our goal is that we rally and to unite around this challenge of 10 ms roundtrip audio latency on Android, and moreover, transform it into an opportunity that fosters innovation, better user-experiences and benefits Google Play customers, Android developers, Android OEMs and the entire Android ecosystem.

Notes about Audio Latency

Digital audio latency measurement has two useful measurement units:

We calculate the audio signal flow’s overall latency using the very best case scenario:

Android 5.0 Lollipop Audio Path Latency Explanation in Plain English

Analog audio input
Analog audio input

There may be several different analog components, such as a pre-amplifier for the built-in microphone. These analog components can be considered as "zero latency" in this case, because their true latency is typically magnitudes below 1 ms.

Latency: 0
ADC 1 ms
Analog to digital conversion (ADC)

The audio chip measures the incoming audio stream in predefined intervals and converts every measurement to a number. This predefined interval is called the sampling rate, measured in Hz. Our Mobile Audio Census and Latency Test App shows that the 48000 Hz is the native sample rate for most audio chips on Android and iOS devices, meaning that the audio stream is sampled 48000 times in every second.

Because ADC implementations often contain an oversampling filter inside, a rule of thumb is to attribute 1 ms latency for the ADC step.

Now that the audio stream has been digitized, from this point forward the audio stream is now digital audio. Digital audio almost never travels one-by-one, but rather, in chunks, called "buffers" or "periods".

Latency: 1 ms
Bus 1-6 ms
Bus transfer from the audio chip to the audio driver

The audio chip has several tasks. It handles ADC and DAC, switches between or mixes several inputs and outputs, applies volume, etc. It also "groups" the discrete digital audio samples into buffers and handles the transfer of these buffers to the operating system.

The audio chip is connected to the CPU with a bus, such as USB, PCI, Firewire, etc. Every bus has its own transfer latency depending on its internal buffer sizes and buffer counts. The latency here ranges from 1 ms (audio chip on an internal system bus) to 6 ms (USB sound card with conservative USB bus settings) typically.

Latency: 1-6 ms
Audio driver 1+ period
Audio driver (ALSA, OSS, etc.)

The audio driver receives the incoming audio into a ring buffer in "bus buffer size" steps using the audio chip's native sample rate, 48000 Hz in most cases.

This ring buffer plays an essential part in smoothing bus transfer jitter ("roughness"), and "connects" the bus transfer buffer size to the operating system audio stack's buffer size. Consuming data from the ring buffer happens in the operating system audio stack's buffer size, so it naturally adds some latency.

Android runs "on top" of Linux, and most Android devices use the most popular Linux audio driver system, ALSA (Advanced Linux Sound Architecture). ALSA handles the ring buffer like this:

  • Audio is consumed from the ring buffer in "period size" steps.
  • The ring buffer's size is a multiple of the "period size".

For example:

  • Period size = 480 samples.
  • Period count = 2.
  • The ring buffer's size is 480x2 = 960 samples.
  • Audio input is received into one period (480 samples), while the audio stack reads/processes the other period (480 samples).
  • Latency = 1 period, 480 samples. It equals to 10 ms at 48000 Hz.
ring buffer (960 samples)
period (480 samples)period (480 samples)

A common period count is 2, but some systems may go higher..

Latency: one or more periods
HAL 0+ samples
Android audio Hardware Abstraction Layer (HAL)

The HAL acts as a middleman between the Android media server and the Linux audio driver. HAL implementations are provided by the mobile device's manufacturer upon "porting" Android onto the device.

Implementations are open, vendors are free to create any kind of HAL code. Communication with the media server happens using predefined structures. The media server loads the HAL and asks to create input or output streams with optional preferred parameters such as the sample rate, buffer size or audio effects.

Note: The HAL may or may not perform according to the parameters and the media server must "adapt" to the HAL.

The typical HAL implementation is tinyALSA, which is used to communicate with the ALSA audio driver. Some vendors put closed source code here to implement audio features they feel important.

After analyzing the code of a number of open source HAL implementations in the Android source repository, we found a few quirks adding significant amount of latency and CPU load unnecessarily to the audio path due strange configurations and poor coding.

A good HAL implementation should not add any latency.

Latency: 0 or more samples
Audio Flinger 1+ period
Audio Flinger

The Android media server consists of two services:

  • The AudioPolicy service handles audio session and permission handling, such as enabling access to the microphone or interrupts on calls. It's very similar to iOS’ audio session handling.
  • The AudioFlinger service handles the digital audio streams.

Audio Flinger creates a RecordThread, which acts as a middleman between an application and the audio driver. Its basic job is:

  • Obtaining the next input audio buffer from the driver's ring buffer using the Android HAL.
  • Resampling the buffer if the application requests different sample rate than the native sample rate.
  • Performing additional buffering if the application requests different buffer size than the native period size.

Audio Flinger has a "fast mixer" path, if Android is configured that way. If a user application is using native (Android NDK) code and sets up an audio buffer queue with the native hardware sample rate and period size, no resampling, additional buffering or mixing ("MixerThread") will happen in this step.

The RecordThread works with a "push" method, without any strict synchronization to the audio driver. It tries to make an "educated guess" when to wake up and run, but the "push" method is way more sensitive to dropouts. Low latency systems always use the "pull" method, where the audio driver "dictates" audio i/o through the entire audio chain. It's clear that when Android OS was initially conceived, designed and developed, low latency audio was not a priority.

Latency: 1 period (best case scenario)
Binder
Binder

Shared memory in Android's main inter-process communication system is used to transfer the audio buffers between Audio Flinger and the user application. It's the heart of Android, used everywhere internally in Android.

Latency: 0
AudioRecord 0+ samples
AudioRecord

We are in the user application's process now. AudioRecord implements the application side of the audio input. This is a client library feature accessible via OpenSL ES for example.

AudioRecord runs a thread to periodically acquire a new buffer from Audio Flinger, with the "push" philosophy described at Audio Flinger. It doesn't add latency to the audio path if the developer sets it to work with only one buffer.

Latency: 0+ samples
User Application >1 period
User Application

Finally, the audio input reaches its destination, the user application.

Because the input and output threads are not the same, a user application must implement a ring buffer between the threads. Its size is 2 periods minimum (1 for audio input and 1 for audio output), but poorly written applications often use brute force and use more periods to solve CPU bottlenecks.

From this point, we are we start traveling back out with some audio output.

Latency: more than 1 period, near 2 typically (best case scenario)
AudioTrack 0+ samples
AudioTrack

AudioTrack Implements the user application's side of the audio output. This is a client library feature accessible via OpenSL ES for example. It runs a thread to periodically send the next audio buffer to Audio Flinger. After Android 4.4.4, AudioTrack doesn't add latency to the audio path as it can be set up to use one buffer only.

Latency: 0+ samples
Binder
Binder

Same as for audio input.

Latency: 0
Audio Flinger 1+ period
Audio Flinger

Creates a PlaybackThread, which works as the inverse of the RecordThread described at audio input.

Latency: 1 period (best case scenario)
HAL 0+ samples
Android audio HAL

Same as for audio input.

Latency: 0 or more samples
Audio driver 1+ period
Audio driver (ALSA, OSS, etc.)

Audio output in the audio driver works identically to the audio input and uses a ring buffer too.

Latency: one or more periods
Bus 1-6 ms
Bus transfer from the audio driver to the audio chip

Similar to the audio input's bus transfer, the latency here ranges from 1 ms to 6 ms typically.

Latency: 1-6 ms
DAC 1 ms
Digital to analog conversion (DAC)

The inverse of ADC, digital audio is "converted" back to analog in this point. For the same reasons at ADC, a rule of thumb is to assume 1 ms of latency for DAC.

Latency: 1 ms
Analog audio output
Analog audio output

The DAC's output signal is analog audio, but it needs additional components to drive connected devices, such as headphones. Similar to the analog audio input, the analog components can be considered to be "zero latency".

Latency: 0

Android Audio Path Latency Animation

Want to embed this image? Copy the following code:

<a href="https://superpowered.com/androidaudiopathlatency"><img src="http://bit.ly/1I9MKxo" alt="Android 5.0 Lollipop Audio Path Latency" title="Android 5.0 Lollipop Audio Path Latency" /></a><br />Learn more about <a href="https://superpowered.com/androidaudiopathlatency">Android's 10 Millisecond Problem</a>.

Android Audio Path Latency Case study: Google Nexus 9

To date, the Google Nexus 9 performs best in Android round-trip audio latency measurement tests.

The best result is 35 ms using a USB sound card or a special audio dongle directly connecting the headphone connector's mic input and output, to disable the built-in microphone array's noise canceling/feedback destroying feature which adds about ~13 ms of additional latency.

So, using the same model as above, let’s decompose the 35 ms best-case round-trip audio latency of Google Nexus 9:

How the 35 ms round-trip latency of Google Nexus 9 comes up?
ComponentSamplesMs
ADC1
Bus1
ALSA audio driver2565.3
Audio Flinger2565.3
User Application's Ring Buffer51210.6
Audio Flinger2565.3
ALSA audio driver2565.3
Bus1
DAC1
Result: 35.8

About Superpowered

Our mission is to extend the makers’ creative and productive capabilities – allowing them to create and make things real – profoundly shaping them, the builders, to build things that weren’t possible without Superpowered audio technology.

To that end, we are building technology, traversing the audio stack, that will solve Android's 10 Millisecond Problem. In the meantime, the Superpowered Audio SDK for Android and iOS is:

We'd love to hear from you. Please email us with your suggestions, comments and questions. Hello@Superpowered.com

Thanks for reading.



Gabor (@szantog) and Patrick (@Pv), founders of Superpowered

PS Please join the great conversations about Android's 10 ms problem at Reddit and Hacker News.

  • Android audio latency
  • Android latency
  • AudioFlinger
  • Android Media Server
  • Low latency audio