Monitor screencasting during direct scanout is blocking KMS from flipping the buffer
Fedora 39 Silverblue, Wayland
I noticed a peculiar thing when looking at Tracy profiles of a fullscreen game + monitor screencasting:
Even though we're flipping 3 ms before the vblank, the frame isn't making it in time. Later on in the profile, there are regions where the flip occurs 6 ms before vblank, and there it makes it in time every frame.
A possible cause is that the screencast drawing operation, which samples from the game's buffer, takes a while and delays KMS, because AMDGPU does not distinguish between reads and writes for synchronization. From @daenzer on Matrix:
YaLTeR: could painting to the screencast framebuffer gpu work somehow delay it?
Michel Dänzer: That could be it, assuming that copies from the scanout buffer; we could avoid this by passing a sync_file via the KMS IN_FENCE_FD property
jadahl: what's the reason the kernel waits for copies to finish before being able to scan out?
Michel Dänzer: amdgpu doesn't distinguish between read or write access for synchronization
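
To make the suspected mechanism concrete, here is a toy model of it. It is not kernel or Mutter code. The `Fence` type, the function names, the 16.7 ms refresh period and the 4 ms screencast copy duration are all hypothetical; only the 3 ms and 6 ms flip lead times come from the profile above. The model also assumes the screencast copy starts at roughly the same time the flip is submitted.

```python
from dataclasses import dataclass


@dataclass
class Fence:
    kind: str          # "read" or "write" access to the game's buffer
    signal_ms: float   # when that GPU work finishes, relative to the start of the refresh cycle


def scanout_ready_ms(fences, distinguish_reads):
    """Time at which every fence the flip has to wait on has signaled."""
    # If reads and writes are not distinguished, read fences block the flip too.
    blocking = [f for f in fences if f.kind == "write" or not distinguish_reads]
    return max((f.signal_ms for f in blocking), default=0.0)


def makes_vblank(flip_lead_ms, distinguish_reads, vblank_ms=16.7, copy_ms=4.0):
    """Return True if the flip can complete by the vblank (hypothetical timings)."""
    flip_ms = vblank_ms - flip_lead_ms
    fences = [
        Fence("write", flip_ms - 1.0),      # game rendering, done before the flip is submitted
        Fence("read", flip_ms + copy_ms),   # screencast copy sampling the same buffer
    ]
    ready_ms = max(flip_ms, scanout_ready_ms(fences, distinguish_reads))
    return ready_ms <= vblank_ms


print(makes_vblank(3.0, distinguish_reads=False))
print(makes_vblank(6.0, distinguish_reads=False))
print(makes_vblank(3.0, distinguish_reads=True))
```

With these made-up numbers, the 3 ms flip misses the vblank when the read fence blocks it, and the 6 ms flip makes it, which matches the pattern in the profile. The last call shows what waiting only on the write would look like, which is roughly what passing an explicit fence via IN_FENCE_FD would aim for.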