Low performance due to suboptimal GStreamer element selection with the VA-API Flatpak package and on Optimus systems
I'm testing with the FM DUO Harley Quinn 7680x3840@60 H.265 demo video, which can be found here. As you can guess from the resolution, it's a pretty heavyweight video.
Furthermore, I have an Optimus system with an 11th Gen Intel iGPU and an NVIDIA RTX 3050 Ti as dGPU. I'm driving all of this from a monitor that is on the iGPU, so when the dGPU is used for video decoding, CPU or GPU-based copying must happen to get things on the screen.
Out of the box, playback is very slow, just a couple of frames per second on my system.
Details
It seems that, out of the box, a suboptimal GStreamer pipeline is selected. To be more precise, I'm experiencing the following behaviour:
- If `runtime/org.freedesktop.Platform.GStreamer.gstreamer-vaapi/x86_64/23.08` is installed:
  - `vaapih265dec` is selected by GStreamer.
  - The Intel iGPU is selected by VA-API, and `intel_gpu_top` indicates hardware video decoding is used.
  - Livi incorrectly complains that hardware video decoding is not used.
  - Despite hardware video decoding being used, Livi is correct in that the video plays very slowly.
- If it is not installed:
  - `nvh265dec` is selected by GStreamer, without any OpenGL elements despite it supporting `memory:GLMemory`; only software-based elements such as `videoconvert` and `videoscale` are used.
  - The NVIDIA dGPU is used for video decoding.
  - Livi incorrectly complains that hardware video decoding is not used.
  - Despite hardware video decoding being used, Livi is correct in that the video plays very slowly.
- If it is not installed and I forcibly disable the NVIDIA GPU, making my system effectively Intel-only:
  - `vah265dec` is selected by GStreamer.
  - The Intel iGPU is selected by VA-API, and `intel_gpu_top` indicates hardware video decoding is used.
  - Livi correctly does not complain that hardware video decoding is not used.
  - Hardware video decoding is used, and performance is great. I notice that OpenGL elements are used in the pipeline as well.
I assume the first scenario is pretty much irrelevant, since the `vaapi` class of decoders is deprecated as of GStreamer 1.24 and the `va` ones are now preferred. It's still strange that the `vaapi` elements take precedence with GStreamer 1.24 when that package is installed, but the fix in my case is simply to remove the package.
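For context, GStreamer's autoplugger chooses among capable decoders by their registered rank, which is also what `GST_PLUGIN_FEATURE_RANK` overrides. A minimal Python sketch of that selection (the rank constants match GStreamer's `GstRank` enum, but the per-element ranks below are illustrative assumptions, not the registered values — check those with `gst-inspect-1.0 <element>`):

```python
# GStreamer's GstRank constants.
GST_RANK_NONE = 0
GST_RANK_MARGINAL = 64
GST_RANK_SECONDARY = 128
GST_RANK_PRIMARY = 256

def pick_decoder(candidates):
    """Pick the highest-ranked candidate, roughly as autoplugging does."""
    return max(candidates, key=lambda name_rank: name_rank[1])[0]

# Illustrative ranks only.
candidates = [
    ("nvh265dec", GST_RANK_PRIMARY),
    ("vah265dec", GST_RANK_SECONDARY),
]
print(pick_decoder(candidates))  # nvh265dec wins by rank

# Demoting nvh265dec, as GST_PLUGIN_FEATURE_RANK=nvh265dec:0 would:
demoted = [
    ("nvh265dec", GST_RANK_NONE),
    ("vah265dec", GST_RANK_SECONDARY),
]
print(pick_decoder(demoted))  # now vah265dec wins
```

(In real GStreamer, rank-0 elements are skipped by autoplugging entirely rather than merely ranked last, but the effect on selection is the same.)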
The second scenario is strange, as `nvh265dec` announces `memory:GLMemory`. I must admit I've never had success getting that to work in my own projects either, possibly because the NVIDIA dGPU is not the primary GPU running GNOME Shell; it usually results in OpenGL errors about the context not being active or similar. Even with hardware video decoding on the GPU in place and using OpenGL, there would probably still need to be a cross-GPU copy or import, because the iGPU is driving the display.
The third scenario makes sense: it's the intended path, where hardware video decoding is used and everything can be offloaded directly.
Next Steps?
I realize that this may not really be a bug in Livi, but I wanted to file it here first for visibility anyway, as Livi's mission is to be hardware-accelerated and performant.
I feel that, in the current state of things, the iGPU should probably be preferred for decoding, because it is the only performant path. Should the next step be to file bug reports with GStreamer upstream for not selecting the optimal elements on an Optimus setup and when `vaapi` and `libva` conflict?
EDIT: `GST_PLUGIN_FEATURE_RANK=nvh265dec:0,nvh264dec:0` can be passed through Flatseal or similar to force usage of the `va` elements for now.
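If you prefer the command line over Flatseal, the same environment override can be applied with `flatpak override`. The app ID below is an assumption (`org.sigxcpu.Livi`); verify yours with `flatpak list`:

```shell
# Demote the NVIDIA decoders to rank 0 so the va elements win autoplugging.
# App ID assumed -- check with `flatpak list`.
flatpak override --user \
  --env=GST_PLUGIN_FEATURE_RANK=nvh265dec:0,nvh264dec:0 \
  org.sigxcpu.Livi
```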