=========================
ALSA Compress-Offload API
=========================

Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>

Vinod Koul <vinod.koul@linux.intel.com>


Overview
========
Since its early days, the ALSA API has been defined with PCM support or
constant-bitrate payloads such as IEC61937 in mind. Arguments and
returned values in frames are the norm, making it a challenge to
extend the existing API to compressed data streams.

In recent years, audio digital signal processors (DSPs) have been
integrated in system-on-chip designs, and DSPs are also integrated in
audio codecs. Processing compressed data on such DSPs results in a
dramatic reduction of power consumption compared to host-based
processing. Support for such hardware has not been very good in Linux,
mostly because of the lack of a generic API available in the mainline
kernel.

Rather than requiring a compatibility break with an API change of the
ALSA PCM interface, a new 'Compressed Data' API is introduced to
provide a control and data-streaming interface for audio DSPs.

The design of this API was inspired by two years of experience with the
Intel Moorestown SOC, with many corrections required to upstream the
API in the mainline kernel instead of the staging tree and make it
usable by others.


Requirements
============
The main requirements are:

- Separation between byte counts and time. Compressed formats may have
  a header per file, per frame, or no header at all. The payload size
  may vary from frame to frame. As a result, it is not possible to
  reliably estimate the duration of audio buffers when handling
  compressed data. Dedicated mechanisms are required to allow for
  reliable audio-video synchronization, which requires precise
  reporting of the number of samples rendered at any given time.

- Handling of multiple formats. PCM data only requires a specification
  of the sampling rate, number of channels and bits per sample. In
  contrast, compressed data comes in a variety of formats. Audio DSPs
  may also provide support for a limited number of audio encoders and
  decoders embedded in firmware, or may support more choices through
  dynamic download of libraries.

- Focus on main formats. This API provides support for the most
  popular formats used for audio and video capture and playback. It is
  likely that as audio compression technology advances, new formats
  will be added.

- Handling of multiple configurations. Even for a given format like
  AAC, some implementations may support AAC multichannel but HE-AAC
  stereo. Likewise WMA10 level M3 may require too much memory and CPU
  cycles. The new API needs to provide a generic way of listing these
  formats.

- Rendering/Grabbing only. This API does not provide any means of
  hardware acceleration, where PCM samples are provided back to
  user-space for additional processing. This API focuses instead on
  streaming compressed data to a DSP, with the assumption that the
  decoded samples are routed to a physical output or logical back-end.

- Complexity hiding. Existing user-space multimedia frameworks all
  have existing enums/structures for each compressed format. This new
  API assumes the existence of a platform-specific compatibility layer
  to expose, translate and make use of the capabilities of the audio
  DSP, e.g. Android HAL or PulseAudio sinks. By construction, regular
  applications are not supposed to make use of this API.
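
The first requirement (separation between byte counts and time) can be
illustrated with a small sketch. It assumes a codec in which every frame
decodes to 1152 samples, as in MPEG-1 Layer III, while the frame size in
bytes varies. The helper names are hypothetical and are not part of the API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: every compressed frame decodes to a fixed number of samples. */
#define SAMPLES_PER_FRAME 1152u

/* Total payload size, in bytes, of a run of frames (hypothetical helper). */
static uint64_t total_bytes(const uint32_t *frame_bytes, size_t n_frames)
{
    uint64_t sum = 0;

    for (size_t i = 0; i < n_frames; i++)
        sum += frame_bytes[i];
    return sum;
}

/* Duration in milliseconds: a function of the frame count, not of the bytes. */
static uint64_t duration_ms(size_t n_frames, uint32_t sample_rate_hz)
{
    assert(sample_rate_hz > 0);
    return (uint64_t)n_frames * SAMPLES_PER_FRAME * 1000u / sample_rate_hz;
}
```

Two buffers holding the same number of bytes can therefore carry different
amounts of audio, which is why time has to be reported in samples rather
than derived from byte counts.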


Design
======
The new API shares a number of concepts with the PCM API for flow
control. Start, pause, resume, drain and stop commands have the same
semantics no matter what the content is.

The concept of a memory ring buffer divided into a set of fragments is
borrowed from the ALSA PCM API. However, only sizes in bytes can be
specified.
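
As a minimal sketch of this byte-only bookkeeping, the following code
computes the ring size and the free space from cumulative byte counters.
The structure and function names are hypothetical, chosen for illustration
only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring-buffer description: everything is expressed in bytes. */
struct ring_cfg {
    uint32_t fragment_size; /* bytes per fragment */
    uint32_t fragments;     /* number of fragments in the ring */
};

/* Total ring size in bytes. */
static uint64_t ring_bytes(const struct ring_cfg *cfg)
{
    return (uint64_t)cfg->fragment_size * cfg->fragments;
}

/*
 * Free space in bytes, given the cumulative number of bytes written by the
 * application and the cumulative number of bytes consumed by the DSP.
 */
static uint64_t ring_free_bytes(const struct ring_cfg *cfg,
                                uint64_t written, uint64_t consumed)
{
    assert(consumed <= written);
    assert(written - consumed <= ring_bytes(cfg));
    return ring_bytes(cfg) - (written - consumed);
}
```

No frame or time unit appears anywhere in this accounting; converting bytes
to a duration is left to the timestamp information.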

Seeks/trick modes are assumed to be handled by the host.

The notion of rewinds/forwards is not supported. Data committed to the
ring buffer cannot be invalidated, except when dropping all buffers.

The Compressed Data API does not make any assumptions on how the data
is transmitted to the audio DSP. DMA transfers from main memory to an
embedded audio cluster or to a SPI interface for external DSPs are
possible. As in the ALSA PCM case, a core set of routines is exposed;
each driver implementer will have to write support for a set of
mandatory routines and possibly make use of optional ones.

The main additions are:

get_caps
  This routine returns the list of audio formats supported. Querying the
  codecs on a capture stream will return encoders; decoders will be
  listed for playback streams.

get_codec_caps
  For each codec, this routine returns a list of
  capabilities. The intent is to make sure all the capabilities
  correspond to valid settings, and to minimize the risks of
  configuration failures. For example, for a complex codec such as AAC,
  the number of channels supported may depend on a specific profile. If
  the capabilities were exposed with a single descriptor, it may happen
  that a specific combination of profiles/channels/formats may not be
  supported. Likewise, embedded DSPs have limited memory and CPU cycles,
  so it is likely that some implementations make the list of capabilities
  dynamic and dependent on existing workloads. In addition to codec
  settings, this routine returns the minimum buffer size handled by the
  implementation. This information can be a function of the DMA buffer
  sizes, the number of bytes required to synchronize, etc, and can be
  used by userspace to define how much needs to be written in the ring
  buffer before playback can start.

set_params
  This routine sets the configuration chosen for a specific codec. The
  most important field in the parameters is the codec type; in most
  cases decoders will ignore other fields, while encoders will strictly
  comply with the settings.

get_params
  This routine returns the actual settings used by the DSP. Changes to
  the settings should remain the exception.

get_timestamp
  The timestamp becomes a multi-field structure. It lists the number
  of bytes transferred, the number of samples processed and the number
  of samples rendered/grabbed. All these values can be used to determine
  the average bitrate, figure out if the ring buffer needs to be
  refilled or the delay due to decoding/encoding/io on the DSP.
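
The quantities mentioned for get_timestamp can be derived as in the sketch
below. The structure is a hypothetical stand-in with one field per value
named above, not the layout used by the kernel, and the helper names are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical timestamp: one field per value listed for get_timestamp. */
struct tstamp {
    uint64_t bytes_transferred; /* compressed bytes handed to the DSP */
    uint64_t samples_processed; /* samples decoded by the DSP */
    uint64_t samples_rendered;  /* samples actually played out */
    uint32_t sample_rate_hz;
};

/*
 * Average bitrate in bits per second. This is only an estimate: bytes
 * already transferred may not all have been decoded yet.
 */
static uint64_t avg_bitrate_bps(const struct tstamp *ts)
{
    if (ts->samples_processed == 0)
        return 0;
    return ts->bytes_transferred * 8u * ts->sample_rate_hz /
           ts->samples_processed;
}

/* Delay inside the DSP (decoded but not yet rendered), in milliseconds. */
static uint64_t dsp_delay_ms(const struct tstamp *ts)
{
    assert(ts->sample_rate_hz > 0);
    assert(ts->samples_rendered <= ts->samples_processed);
    return (ts->samples_processed - ts->samples_rendered) * 1000u /
           ts->sample_rate_hz;
}

/* Bytes still queued in the ring buffer, given the total written so far. */
static uint64_t ring_pending_bytes(const struct tstamp *ts, uint64_t written)
{
    assert(ts->bytes_transferred <= written);
    return written - ts->bytes_transferred;
}
```

A user-space library could compare the pending byte count against the
fragment size to decide when the ring buffer needs to be refilled.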

Note that the list of codecs/profiles/modes was derived from the
OpenMAX AL specification instead of reinventing the wheel.
Modifications include:

- Addition of FLAC and IEC formats
- Merge of encoder/decoder capabilities
- Profiles/modes listed as bitmasks to make descriptors more compact
- Addition of set_params for decoders (missing in OpenMAX AL)
- Addition of AMR/AMR-WB encoding modes (missing in OpenMAX AL)
- Addition of format information for WMA
- Addition of encoding options when required (derived from OpenMAX IL)
- Addition of rateControlSupported (missing in OpenMAX AL)


Gapless Playback
================
When playing through an album, the decoders have the ability to skip the
encoder delay and padding and directly move from one track content to another.
The end user can perceive this as gapless playback as we don't have silence
while switching from one track to another.

Also, there might be low-intensity noises due to encoding. Perfect gapless is
difficult to reach with all types of compressed data, but works fine with most
music content. The decoder needs to know the encoder delay and encoder padding,
so this information must be passed to the DSP. This metadata is extracted from
ID3/MP4 headers and is not present by default in the bitstream, hence the need
for a new interface to pass this information to the DSP. The DSP and userspace
also need to switch from one track to another and start using data for the
second track.

The main additions are:

set_metadata
  This routine sets the encoder delay and encoder padding. The decoder can use
  them to strip the silence. This needs to be set before the data in the track
  is written.

set_next_track
  This routine tells the DSP that metadata and write operations sent after this
  call correspond to the subsequent track.

partial drain
  This is called when the end of file is reached. Userspace can inform the DSP
  that EOF is reached, and the DSP can then start skipping the padding delay.
  The next data written belongs to the next track.
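
To make the role of the encoder delay and padding concrete, here is a
minimal sketch of how a decoder could work out which decoded samples are
real audio. The structures, the function name and the example values are
hypothetical; an actual DSP firmware may proceed differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical gapless metadata, expressed in samples. */
struct gapless_md {
    uint32_t encoder_delay;   /* samples to skip at the start of a track */
    uint32_t encoder_padding; /* samples to drop at the end of a track */
};

/* Range of decoded samples that carries actual audio. */
struct sample_range {
    uint64_t first; /* index of the first audible sample */
    uint64_t count; /* number of audible samples */
};

/*
 * Strip the encoder delay and padding from a track that decodes to
 * decoded_samples samples. The range is empty if the track is shorter
 * than delay + padding.
 */
static struct sample_range audible_range(const struct gapless_md *md,
                                         uint64_t decoded_samples)
{
    struct sample_range r = { 0, 0 };
    uint64_t skip;

    assert(md);
    skip = (uint64_t)md->encoder_delay + md->encoder_padding;
    if (decoded_samples > skip) {
        r.first = md->encoder_delay;
        r.count = decoded_samples - skip;
    }
    return r;
}
```

Playing only the audible range of each track back to back is what removes
the silence between tracks.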

Sequence flow for gapless would be:

- Open
- Get caps / codec caps
- Set params
- Set metadata of the first track
- Fill data of the first track
- Trigger start
- User-space finishes sending all data of the first track
- Indicate next track data by sending set_next_track
- Set metadata of the next track
- Call partial_drain to flush most of the buffer in the DSP
- Fill data of the next track
- DSP switches to the second track

(note: the order of partial_drain and the write for the next track can be
reversed as well)
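
The sequence above can be sketched with a mock stream that only records
which track each call applies to. Every name below is hypothetical; the
mock stands in for a real driver or user-space library and only
illustrates the call order, including the two permitted orderings of
partial_drain and the write for the next track.

```c
#include <assert.h>
#include <stdint.h>

enum { MAX_TRACKS = 2 };

/* Mock stream state; hypothetical, for illustrating call order only. */
struct mock_stream {
    int started;        /* set once playback has been triggered */
    int write_track;    /* track that metadata and writes apply to */
    int partial_drains; /* number of partial drain requests seen */
    uint32_t delay[MAX_TRACKS];
    uint32_t padding[MAX_TRACKS];
    uint64_t bytes[MAX_TRACKS];
};

static void mock_set_metadata(struct mock_stream *s, uint32_t delay,
                              uint32_t padding)
{
    s->delay[s->write_track] = delay;
    s->padding[s->write_track] = padding;
}

static void mock_write(struct mock_stream *s, uint64_t nbytes)
{
    s->bytes[s->write_track] += nbytes;
}

static void mock_start(struct mock_stream *s)
{
    s->started = 1;
}

/* Metadata and writes issued after this call belong to the next track. */
static void mock_set_next_track(struct mock_stream *s)
{
    assert(s->started);
    assert(s->write_track + 1 < MAX_TRACKS);
    s->write_track++;
}

static void mock_partial_drain(struct mock_stream *s)
{
    assert(s->started);
    s->partial_drains++;
}

/* Walk through the gapless sequence for two tracks (example values). */
static void play_two_tracks(struct mock_stream *s, int drain_before_write)
{
    mock_set_metadata(s, 576, 1000); /* metadata of the first track */
    mock_write(s, 1000);             /* data of the first track */
    mock_start(s);
    mock_set_next_track(s);
    mock_set_metadata(s, 576, 500);  /* metadata of the next track */
    if (drain_before_write) {
        mock_partial_drain(s);
        mock_write(s, 2000);         /* data of the next track */
    } else {
        mock_write(s, 2000);
        mock_partial_drain(s);
    }
}
```

In this mock both orderings attribute the same data and metadata to each
track, which is the property the note on ordering relies on.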


Not supported
=============
- Support for VoIP/circuit-switched calls is not the target of this
  API. Support for dynamic bit-rate changes would require a tight
  coupling between the DSP and the host stack, limiting power savings.

- Packet-loss concealment is not supported. This would require an
  additional interface to let the decoder synthesize data when frames
  are lost during transmission. This may be added in the future.

- Volume control/routing is not handled by this API. Devices exposing a
  compressed data interface will be considered as regular ALSA devices;
  volume changes and routing information will be provided with regular
  ALSA kcontrols.

- Embedded audio effects. Such effects should be enabled in the same
  manner, no matter if the input was PCM or compressed.

- Multichannel IEC encoding. Unclear if this is required.

- Encoding/decoding acceleration is not supported as mentioned
  above. It is possible to route the output of a decoder to a capture
  stream, or even implement transcoding capabilities. This routing
  would be enabled with ALSA kcontrols.

- Audio policy/resource management. This API does not provide any
  hooks to query the utilization of the audio DSP, nor any preemption
  mechanisms.

- No notion of underrun/overrun. Since the bytes written are compressed
  in nature and data written/read doesn't translate directly to
  rendered output in time, the API does not deal with underrun/overrun;
  these may be dealt with in a user-space library.


Credits
=======
- Mark Brown and Liam Girdwood for discussions on the need for this API
- Harsha Priya for her work on intel_sst compressed API
- Rakesh Ughreja for valuable feedback
- Sing Nallasellan, Sikkandar Madar and Prasanna Samaga for
  demonstrating and quantifying the benefits of audio offload on a
  real platform.