Tracking or face detection, and blurring or silly hats
Background
There are some face blurring plugins in GStreamer:
$ gst-inspect-1.0 | grep -i faceb
opencv: faceblur: faceblur
frei0r: frei0r-filter-facebl0r: FaceBl0r
They can be used to automatically blur faces, but they share a serious limitation: when you need to blur a face, you want it blurred all the time, and a face detection algorithm cannot guarantee that. This makes the practical purpose of these two plugins questionable.
It's enough for the detection to miss a single frame for the face to be revealed, which would be a big problem when the subject of the edited material is very sensitive. We should therefore allow the user to manually check the automatic blur and adjust it.
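For reference, such a plugin can be tried out with a `gst-launch-1.0` pipeline along these lines (a sketch, not a tested command: `faceblur` relies on an OpenCV Haar cascade, and its default `profile` path may need adjusting on your system):

```shell
# Hypothetical test pipeline: blur any detected face in the webcam feed.
# Replace v4l2src with filesrc ! decodebin to try it on a video file.
gst-launch-1.0 v4l2src ! videoconvert ! faceblur ! videoconvert ! autovideosink
```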
Detecting vs tracking
We can either use a face detection algorithm and run it on each frame, or allow selecting a face and track only that face from frame to frame. In either case, we should find an existing library implementing this; implementing such algorithms is out of the scope of this project.
The face-detection method works better when all the faces have to be blurred. The tracking method is better for blurring a specific face among many.
We'll focus on the single-face blurring use-case initially.
Keeping things simple
We'll initially allow only full-face blurring. Later, anybody interested can add support for eyes-only blurring, for example, or single-eye blurring to obtain a pirate effect.
Face blur workflow
Example of how the functionality can be used:
- EditorPerspective:
  - Select a clip
  - Middle tab > Clip tab > "Object tracking" expander
    - Shows a list of previously tracked objects for the asset backing the clip, each with a "Blur" button to apply an effect which blurs that object in the clip.
    - A "Track new object" button to open the TrackPerspective for the asset backing the clip.
- TrackPerspective:
  - Allow selecting a previously tracked object.
  - Allow tracking a new object:
    - Allow selecting a face manually (later we can assist the user by automatically detecting the faces in the currently displayed frame).
    - Press a button to track the object using a tracking algorithm.
  - Allow going through the video frame by frame to check the object is correctly tracked at all times.
  - Allow correcting the position.
Tracking
The tracking could be done, for example, with OpenCV, since it allows using multiple trackers through the same API. To make the tracking functionality easily reusable, we could add a new tracker element to the GStreamer opencv plugin in gst-plugins-bad/ext/opencv. The element would add GstVideoRegionOfInterestMeta to the video buffers, without otherwise modifying the streams.
In Pitivi we'd create a pipeline using that element and connect to it to extract the metadata it provides. For a similar pipeline example, see how we extract the audio levels for the audio clip previewers in AudioPreviewer._launch_pipeline in pitivi/timeline/previewers.py.
To be determined for a GSoC project proposal
- Details on how to implement the tracker GStreamer element.
- Details on how to store the info about tracked objects in the Asset's metadata.
- UI mockups and details
- The order in which these features can be implemented, such that the changes can be merged as each one lands.
- Which tracking algorithm we should use, and why.
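On the metadata question, one simple option (a sketch, not a decided format: the key name and the JSON encoding are assumptions) is to serialize each tracked object as a per-frame box map and store the resulting string on the asset via GES's metadata API, e.g. GES.MetaContainer.set_string:

```python
import json

# Hypothetical metadata key; the real name would be decided during the project.
TRACKED_OBJECTS_KEY = "pitivi::tracked-objects"


def serialize_tracked_objects(objects):
    """Encode tracked objects as a JSON string suitable for a string
    metadata field, e.g. asset.set_string(TRACKED_OBJECTS_KEY, value).

    objects: {object_id: {frame_number: (x, y, width, height)}}
    """
    return json.dumps({
        object_id: {str(frame): list(box) for frame, box in boxes.items()}
        for object_id, boxes in objects.items()
    })


def deserialize_tracked_objects(value):
    """Inverse of serialize_tracked_objects."""
    return {
        object_id: {int(frame): tuple(box) for frame, box in boxes.items()}
        for object_id, boxes in json.loads(value).items()
    }


if __name__ == "__main__":
    data = {"face-1": {0: (30, 40, 30, 30), 1: (32, 40, 30, 30)}}
    assert deserialize_tracked_objects(serialize_tracked_objects(data)) == data
```

JSON keeps the format human-readable and easy to version later; frame numbers become strings in transit because JSON object keys must be strings.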
Suggestions or new input on how to make this simpler or better are welcome.