cogl: Remove GLX "threaded swap wait" used on Nvidia

The sole purpose of "threaded swap wait" was to provide a
`presentation_time` value (`u.presentation_time = get_monotonic_time_ns ();`)
for use by `clutter-stage-cogl`.

Until recently (before !363), all backends were required to provide
a nonzero value for `presentation_time` or else suffer falling back
to poor-performing throttling methods in `master_clock_next_frame_delay`.
So we needed "threaded swap wait" to support the Nvidia driver.

This is no longer true. The fallbacks don't exist any more and
`clutter_stage_cogl_schedule_update` now always succeeds even in the
absence of a `presentation_time` (since !363).

The drawbacks to keeping "threaded swap wait" are:

  * `u.presentation_time = get_monotonic_time_ns ();` is a guess and not
    an accurate hardware presentation time.
  * It required blocking the main loop on every frame in
    `_cogl_winsys_wait_for_gpu` due to `glFinish`. Any OpenGL programmer
    will tell you calling `glFinish` is a bad idea because it kills CPU-GPU
    parallelism. In my case, it was blocking the main loop for 1-3ms on
    every mutter frame. It's easy to imagine slower (or higher resolution)
    Nvidia systems would lose an even larger chunk of their frame interval
    blocked in that function. This significantly crippled frame rates on
    Nvidia systems.
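
To put the 1-3ms figure in context, here is a minimal sketch of the
arithmetic. The helper name is hypothetical (it is not part of cogl or
mutter), and the 60Hz refresh rate is an assumption for illustration only.
At 60Hz the frame interval is about 16.7ms, so 1-3ms of blocking costs
roughly 6-18% of each frame's budget.

```c
/* Hypothetical helper, not part of cogl or mutter.
 * Returns the percentage of one frame interval that is consumed when the
 * main loop is blocked for `blocked_ms` milliseconds on every frame. */
static double
frame_budget_lost_percent (double blocked_ms,
                           double refresh_hz)
{
  /* Duration of one frame at the given refresh rate, in milliseconds. */
  double frame_interval_ms = 1000.0 / refresh_hz;

  return 100.0 * blocked_ms / frame_interval_ms;
}
```

The same blocking time costs proportionally more at higher refresh rates,
because the frame interval shrinks while the time spent in `glFinish` does
not.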

The benefit to keeping "threaded swap wait" is:

  * Its guess of `presentation_time` is likely a better guess by a few
    milliseconds than the guess that `clutter_stage_cogl_schedule_update`
    will make in its place.

So "threaded swap wait" provided better sub-frame phase accuracy, but at
the expense of frame rates. And as soon as it starts causing frame drops,
that one and only benefit is lost. There is no reason to keep it.

And in case you are wondering, the documentation for "threaded swap wait"
is now wrong (since !363):

  > The advantage of enabling this is that it will allow your main loop
  > to do other work while waiting for the system to be ready to draw
  > the next frame, instead of blocking in glXSwapBuffers().

At the time (before !363) it was true that "threaded swap wait" avoided
swap interval throttling that would occur as a result of
`master_clock_next_frame_delay` blindly returning zero and over-queuing
frames. That code no longer exists. And ironically the implementation of
"threaded swap wait" necessitates the same kind of blocking (to a lesser
extent) that it was designed to avoid. However, we can eliminate all
blocking by deleting "threaded swap wait", which is safe since !363.

GNOME/mutter!602