
RDP: Add support for audio input redirection ([MS-RDPEAI])

Pascal Nowack requested to merge pnowack/gnome-remote-desktop:audio-input into master

Quoting the main commit here:

Audio input redirection effectively redirects microphone input from the
client to the server.
To accomplish this, create an audio source using PipeWire. While the
respective audio stream is running, PipeWire asks for samples whenever
it needs them.

On the RDP side, audio input redirection uses a DVC with the name
AUDIO_INPUT.
When this DVC is opened, a negotiation process takes place first, in
which the protocol version and the audio formats are negotiated.
Both sides MUST support at least PCM as audio format.
After the negotiation, the client sends audio samples. This happens
until the DVC is closed again.
As a result, it is possible to receive audio samples even when they are
not needed.
MS Windows RDS only opens the AUDIO_INPUT DVC when a program requests
microphone input, and closes it again when no program requests
microphone input any more.
The same can be done with PipeWire: When the PW_KEY_NODE_SUSPEND_ON_IDLE
property is set, PipeWire will automatically suspend the audio stream
when it is idle.
So, set that property.
As state changes are communicated by PipeWire, close the AUDIO_INPUT
DVC when the stream suspends, and open it when the stream starts
(again).
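
The open/close handling described above can be sketched as a small state
handler. This is only an illustration in Python, not the C code of this
merge request: the class `AudioInputChannel`, its method, and the state
strings are hypothetical stand-ins for the PipeWire stream state callback
and the DVC open/close calls. The strings are merely modelled after
PipeWire's stream states (paused, streaming).

```python
class AudioInputChannel:
    """Hypothetical stand-in for the AUDIO_INPUT DVC handling."""

    def __init__(self):
        self.dvc_open = False
        self.log = []  # records the DVC actions taken, for illustration

    def on_stream_state_changed(self, old_state, new_state):
        # Assumption: "streaming" means a consumer wants samples, any other
        # state (e.g. "paused" after the node was suspended) means it does not.
        if new_state == "streaming" and not self.dvc_open:
            self.dvc_open = True
            self.log.append("open AUDIO_INPUT")
        elif new_state != "streaming" and self.dvc_open:
            self.dvc_open = False
            self.log.append("close AUDIO_INPUT")


channel = AudioInputChannel()
channel.on_stream_state_changed("connecting", "paused")
channel.on_stream_state_changed("paused", "streaming")
channel.on_stream_state_changed("streaming", "paused")
print(channel.log)
```

With this shape, the client only sends samples while something on the
server side actually consumes them.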

When the negotiation process is over, the runtime phase is entered.
In this phase, the client first sends an Incoming Data PDU to indicate
that audio samples are about to be transferred, followed by a Data PDU
containing the actual (encoded) audio samples.
This process is repeated for each Data PDU, meaning that each Data PDU
is preceded by an Incoming Data PDU.
This theoretically allows the server side to measure the client's
uplink, which may be useful for version 2 of the protocol, where the
server side can issue format changes during the session, after the
AUDIO_INPUT DVC has already been opened.
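
One conceivable way to use the Incoming Data PDU for such a measurement
is to timestamp it and compare against the arrival of the following Data
PDU. The sketch below is not part of this merge request. The class
`UplinkEstimator` is hypothetical, timestamps are passed in explicitly,
and the message IDs (0x05, 0x06) are assumed from [MS-RDPEAI] and should
be checked against the specification.

```python
MSG_SNDIN_DATA_INCOMING = 0x05  # assumed message ID, see [MS-RDPEAI]
MSG_SNDIN_DATA = 0x06           # assumed message ID, see [MS-RDPEAI]


class UplinkEstimator:
    """Hypothetical helper: rough uplink estimate per Data PDU."""

    def __init__(self):
        self._announced_at = None
        self.last_bytes_per_second = None

    def on_pdu(self, message_id, payload, now):
        if message_id == MSG_SNDIN_DATA_INCOMING:
            # Remember when the client announced the upcoming Data PDU
            self._announced_at = now
        elif message_id == MSG_SNDIN_DATA and self._announced_at is not None:
            elapsed = now - self._announced_at
            if elapsed > 0:
                self.last_bytes_per_second = len(payload) / elapsed
            self._announced_at = None


estimator = UplinkEstimator()
estimator.on_pdu(MSG_SNDIN_DATA_INCOMING, b"", now=10.0)
estimator.on_pdu(MSG_SNDIN_DATA, bytes(1000), now=10.5)
print(estimator.last_bytes_per_second)
```

Such an estimate is crude, since both PDUs travel over the same
connection and may be buffered together.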

In addition to that, do not treat stray Incoming Data PDUs as protocol
errors. Strictly speaking, they are protocol errors, but not fatal ones.
The reason for this is a bug in Microsoft's iOS clients, where additional
Incoming Data PDUs are sent during the session.
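
The runtime phase with tolerated stray Incoming Data PDUs can be sketched
as follows. This is an illustration in Python, not the implementation in
this merge request. `RuntimeReceiver` is a hypothetical name, the PDU is
assumed to start with a 1-byte message ID, and the IDs (0x05, 0x06) are
assumed from [MS-RDPEAI].

```python
MSG_SNDIN_DATA_INCOMING = 0x05  # assumed message ID, see [MS-RDPEAI]
MSG_SNDIN_DATA = 0x06           # assumed message ID, see [MS-RDPEAI]


class RuntimeReceiver:
    """Hypothetical receiver for the runtime phase of AUDIO_INPUT."""

    def __init__(self):
        self.pending_incoming = False
        self.stray_incoming = 0
        self.samples = []

    def handle_pdu(self, pdu):
        if not pdu:
            raise ValueError("empty PDU")
        message_id, payload = pdu[0], pdu[1:]
        if message_id == MSG_SNDIN_DATA_INCOMING:
            if self.pending_incoming:
                # Stray announcement: count it, but keep the session alive
                self.stray_incoming += 1
            self.pending_incoming = True
        elif message_id == MSG_SNDIN_DATA:
            # This sketch does not check for a missing announcement
            self.pending_incoming = False
            self.samples.append(payload)
        else:
            raise ValueError(f"unexpected message ID {message_id:#04x}")


receiver = RuntimeReceiver()
for pdu in (bytes([0x05]), bytes([0x05]), bytes([0x06]) + b"ab"):
    receiver.handle_pdu(pdu)
print(receiver.stray_incoming, receiver.samples)
```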

For now, only implement version 1 of the audio input redirection
protocol, but advertise version 2 of the protocol to the client, since
there are no disadvantages in doing so.
This is possible because the only addition in version 2 of the protocol
is that the server side can change the audio format during runtime.
Since A-law has no quality options, there is currently no need for any
quality change.
In the future, a format change can be implemented, e.g. when there are
bandwidth constraints.
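
Advertising version 2 amounts to putting that number into the Version
PDU. The following is a minimal sketch, assuming the Version PDU consists
of a 1-byte message ID (0x01) followed by a 32-bit little-endian version.
Check this layout against [MS-RDPEAI] before relying on it. The function
names are hypothetical.

```python
import struct

MSG_SNDIN_VERSION = 0x01  # assumed message ID, see [MS-RDPEAI]
ADVERTISED_VERSION = 2    # advertised, although only version 1 behaviour is used


def build_version_pdu(version=ADVERTISED_VERSION):
    # "<BI": little-endian, 1-byte message ID, 4-byte version, no padding
    return struct.pack("<BI", MSG_SNDIN_VERSION, version)


def parse_version_pdu(pdu):
    message_id, version = struct.unpack("<BI", pdu)
    if message_id != MSG_SNDIN_VERSION:
        raise ValueError("not a Version PDU")
    return version


print(build_version_pdu().hex())
```

A server that never issues a format change behaves like a version 1
server from the client's point of view.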

Depends on !193 (merged) (commits included here)
Built on top of !194 (merged) (commits included here)

Closes: #175 (closed)

Edited by Pascal Nowack
