Quoting the main commit here:
Audio input redirection redirects microphone input from the client to the server. To accomplish this, create an audio source using PipeWire. When the respective audio stream runs, PipeWire asks for samples whenever it needs them.

On the RDP side, audio input redirection uses a DVC with the name AUDIO_INPUT. When this DVC is opened, the first thing that happens is a negotiation process, in which the protocol version and the audio formats are negotiated. At the very least, PCM MUST be supported as an audio format by both sides. After the negotiation, the client sends audio samples until the DVC is closed again. As a result, it is possible to receive audio samples even when they are not needed.

MS Windows RDS only opens the AUDIO_INPUT DVC when a program requests microphone input, and closes it when no program requests microphone input any more. The same can be done with PipeWire: when the PW_KEY_NODE_SUSPEND_ON_IDLE property is set, PipeWire automatically suspends the audio stream when it is idle. So, set that property. As PipeWire communicates state changes, close the AUDIO_INPUT DVC when the stream suspends, and open it when the stream starts (again).

When the negotiation process is over, the runtime phase is entered. In this phase, the client first sends an Incoming Data PDU to indicate that audio samples are about to be transferred, followed by a Data PDU containing the actual (encoded) audio samples. This is repeated for each Data PDU, meaning that each Data PDU is preceded by an Incoming Data PDU. This theoretically allows the server side to measure the client's uplink, which may be useful for version 2 of the protocol, where the server side can issue format changes during the session, after the AUDIO_INPUT DVC has already been opened.

In addition to that, do not treat stray Incoming Data PDUs as a protocol error. Theoretically, they are protocol errors, but not fatal ones.
The reason for this is a bug in Microsoft's iOS clients, which send additional Incoming Data PDUs during the session.

For now, only implement version 1 of the audio input redirection protocol, but advertise version 2 to the client, since doing so has no disadvantages. This is possible because the only addition in version 2 is that the server side can change the audio format at runtime. Since A-law does not have any quality options, there is currently no need for a quality change. In the future, a format change can be implemented, e.g. when there are bandwidth constraints.
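
To make the lifecycle and the runtime phase described in the commit message more concrete, here is a minimal Python sketch. It is not the code from the commit: the class `AudioInputSession`, its methods, and the `StreamState` enum are hypothetical names invented for illustration. The two message IDs are what I believe MS-RDPEAI defines and should be checked against the specification. The negotiation phase is skipped, and rejecting a Data PDU that has no preceding Incoming Data PDU is an assumption of this sketch, not something the commit message states.

```python
from enum import Enum, auto

# Assumed message IDs (believed to match MS-RDPEAI; verify against the spec).
MSG_SNDIN_DATA_INCOMING = 0x05
MSG_SNDIN_DATA = 0x06


class StreamState(Enum):
    """Hypothetical, reduced view of the PipeWire stream states."""
    SUSPENDED = auto()
    STREAMING = auto()


class AudioInputSession:
    """Hypothetical model of the server-side AUDIO_INPUT handling."""

    def __init__(self):
        self.dvc_open = False            # AUDIO_INPUT DVC open (negotiation assumed done)
        self.incoming_announced = False  # an Incoming Data PDU awaits its Data PDU
        self.stray_incoming = 0          # tolerated stray Incoming Data PDUs
        self.samples = []                # received (still encoded) sample payloads

    def on_stream_state(self, state):
        # Open the DVC only while the stream runs; close it on suspend,
        # so the client stops sending samples nobody needs.
        if state is StreamState.STREAMING and not self.dvc_open:
            self.dvc_open = True
        elif state is StreamState.SUSPENDED and self.dvc_open:
            self.dvc_open = False
            self.incoming_announced = False

    def on_pdu(self, message_id, payload=b""):
        if not self.dvc_open:
            raise RuntimeError("PDU received while the AUDIO_INPUT DVC is closed")

        if message_id == MSG_SNDIN_DATA_INCOMING:
            if self.incoming_announced:
                # Two Incoming Data PDUs in a row: a stray one.
                # Count it, but do not treat it as a fatal error.
                self.stray_incoming += 1
            self.incoming_announced = True
        elif message_id == MSG_SNDIN_DATA:
            if not self.incoming_announced:
                # Assumption of this sketch: unannounced data is rejected.
                raise ValueError("Data PDU without preceding Incoming Data PDU")
            self.samples.append(payload)
            self.incoming_announced = False
        else:
            raise ValueError(f"unexpected message id {message_id:#x}")


if __name__ == "__main__":
    session = AudioInputSession()
    session.on_stream_state(StreamState.STREAMING)
    session.on_pdu(MSG_SNDIN_DATA_INCOMING)
    session.on_pdu(MSG_SNDIN_DATA_INCOMING)      # stray, tolerated
    session.on_pdu(MSG_SNDIN_DATA, b"\x01\x02")
    session.on_stream_state(StreamState.SUSPENDED)
    print(session.stray_incoming, len(session.samples), session.dvc_open)
```

In this model a stray Incoming Data PDU only bumps a counter, which mirrors the "protocol error, but not a fatal one" stance above, while suspending the stream closes the channel and resets the pairing state.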