Threaded swap wait was hurting performance because it requires stalling both the GL pipeline (GPU) and the event loop (CPU) by calling glFinish as part of _cogl_winsys_wait_for_gpu on every frame.
Threaded swap wait was a good idea when it was written: it avoided unthrottled (high CPU) rendering on the Nvidia driver. More recently, commit e415cc53 has fixed that permanently, so we can now remove threaded swap wait and avoid the performance hit it incurs.
Partial fix for #700