Idle gc and animations
Submitted by dra..@..il.com
Link to original bug (#728214)
Description
Just pasting IRC conversation to not get lost:
* andreimac (~AndreiMac@188.26.13.190) hat #gnome-shell betreten
* Kobra hat die Verbindung getrennt ()
* weld__ (~weld@p57A8269E.dip0.t-ipconnect.de) hat #gnome-shell betreten
* weld_ hat die Verbindung getrennt (Ping timeout: 600 seconds)
* RavetcoFX hat die Verbindung getrennt (Ping timeout: 600 seconds)
* jackneill hat die Verbindung getrennt (Remote closed the connection)
* garnacho hat die Verbindung getrennt (Ping timeout: 600 seconds)
* gcampax (~gcampax@128.66.200.113) hat #gnome-shell betreten
`<Jasper>` rtcm, he appeared
`<Jasper>` gcampax, cosimoc, rtcm: let's talk some more about gjs
`<gcampax>` Jasper: why here?
`<gcampax>` and what specifically?
`<gcampax>` (and do you need anything before 3.12.1?)
`<Jasper>` rtcm wants to do a 1.40.1 release
`<gcampax>` rtcm: oh you're doing it? awesome
`<drago01>` gcampax: https://git.gnome.org/browse/gjs/commit/?id=791b1a33424897549f487eb75a80f13c4f94437a
`<drago01>` gcampax: why do we run the gc during the animation?
`<gcampax>` rtcm: btw, if you want I can check what is safe to backport and what is master-only
`<drago01>` gcampax: that is very broken
`<gcampax>` drago01: because we run the gc in an (default priority?) idle
`<drago01>` gcampax: idle shouldn't be during the animation though
`<drago01>` gcampax: its animating so it isn't idle
`<gcampax>` drago01: oh indeed it should - if it isn't, we're already dropping frames
`<gcampax>` the mainloop is stopping until the next frame, so it's idle
`<gcampax>` (on platforms with INTEL_swap_events, that is)
`<drago01>` ouch
`<drago01>` so if the gc takes longer than a frame
`<drago01>` we will drop a frame
`<gcampax>` yes
`<drago01>` ugh
`<gcampax>` and that is unfortunately quite likely, a full gc takes time
`<drago01>` so we shouldn't do that
`<gcampax>` we should be using incremental gcs, except there is no public api for that
`<gcampax>` (there is "friend" api, but it's quite low level, I'm not sure it's safe to use)
`<drago01>` well gc during an animation makes no sense
`<drago01>` incremental or not
`<drago01>` its not like you are going to run out of memory in that short time
`<drago01>` gcampax: we used to use the run_at_leisure thing in the past
`<drago01>` which runs when nothing is going on
`<drago01>` and then we reverted it
`<gcampax>` drago01: without a generational gc, you need to keep running the gc all the time
* tham hat die Verbindung getrennt (tham)
`<drago01>` gcampax: well I am not saying don't run it at all
`<drago01>` gcampax: just not during animations
`<drago01>` gcampax: so we either need to tell gjs "block gc" .. "unblock gc"
`<Jasper>` yeah, that was the point of begin_busy_work / end_busy_work
`<drago01>` gcampax: or let the shell decide when to do it
`<drago01>` gcampax: which makes a lot more sense
`<Jasper>` I think telling it "block gc" / "unblock gc" makes sense.
`<Jasper>` We could even add it to gjs's tweener
`<gcampax>` drago01: most of the JS side of the shell is always running animations, you end up running the gc too seldom
`<Jasper>` So all apps that use tweener don't become blocked
`<Jasper>` gcampax, that's not true.
`<drago01>` Jasper: yeah that makes sense
`<gcampax>` Jasper: how so? menus have animations, buttons have animations, notifications, search, windows... everything is animated
`<gcampax>` but the actual windows, which mutter takes care of
* RavetcoFX (~RavetcoFX@S010678542e57d3dd.cg.shawcable.net) hat #gnome-shell betreten
`<drago01>` gcampax: date changes?
`<Jasper>` gcampax, sure, but they're short fire animations. You open a menu, it animates, and then you run a GC after the animation completes
`<Jasper>` We don't have any long-running animations where blocking the GC hurts.
`<drago01>` gcampax: wireless signal strength
`<drago01>` (and other callbacks for whatever events)
`<gcampax>` drago01: those should not create (noticeable) garbage
`<gcampax>` it's a couple of strings at most
`<drago01>` gcampax: yes but if we have accumulated garbage it should clean it ip
`<gcampax>` drago01: right ok, but the thing is, you need to clean up the garbage immediately after you created it - if you wait and have other js code in between, you get fragmentation
`<cosimoc>` Jasper, hey
`<drago01>` gcampax: maybe but avoiding fragmentation should not come at the cost of hickups during animations
* satellit_e hat die Verbindung getrennt (Read error: 110 (Connection timed out))
`<drago01>` gcampax: the former isn't really noticeable while the latter hurts badly
* satellit_e (~satellit@c-174-61-128-216.hsd1.wa.comcast.net) hat #gnome-shell betreten
`<gcampax>` drago01: heh... if you hw can't do 60fps anyway but has 2gb or less of total memory the priority is suddenly inverted
`<drago01>` (we never do more then 30fps on open source drivers now)
* andreimac hat die Verbindung getrennt (Leaving)
`<cosimoc>` should we maybe have a way to opt out of automatic GC cycles? why is rate limiting not enough?
`<drago01>` cosimoc: that what the block / unlock suggestion is about
`<drago01>` cosimoc: tweener says "blcok" ... does work ... "unblock"
`<drago01>` cosimoc: so it does not run during the animation
`<gcampax>` drago01: anyway, we can certainly block GCs during animations, if incremental GCs are frowned upon (the neat thing of incremental GC is that you can time-limit them, and say like give me 5ms of gc, then I have 10ms of work, then 5ms of more GC, etc.)
`<gcampax>` drago01: fwiw, firefox runs a MaybeGC every other frame
`<gcampax>` so and it doesn't block it during animations
`<gcampax>` so we just need a better heuristics, imho
`<cosimoc>` drago01, I understand that, but we don't have anything like Tweener that works in general for all apps
`<Jasper>` cosimoc, we don't?
`<cosimoc>` drago01, only the shell knows exactly when it's not doing work
`<cosimoc>` Jasper, that's what I mean
`<drago01>` gcampax: when do we run the gc currently?
`<drago01>` gcampax: i.e with the latest gjs changes?
`<gcampax>` drago01: in a idle that is queued after every JS callback returns to C code
`<gcampax>` and the idle results in a MaybeGC always, a full GC at most every 5 frames
* andreimac (~AndreiMac@188.26.13.190) hat #gnome-shell betreten
`<drago01>` gcampax: ok
`<drago01>` cosimoc: yeah that's why I think the shell should be the one doing it
`<drago01>` cosimoc: like we did in the past
`<drago01>` cosimoc: not sure why we removed it
`<drago01>` cosimoc: maybe after some mozjs update we though its not needed anymore
`<cosimoc>` drago01, we originally removed it because bugs in mozjs
`<drago01>` oh indeed
`<cosimoc>` drago01, but that misses the point
`<cosimoc>` drago01, GTK apps still need garbage collection...
`<drago01>` cosimoc: how so?
`<drago01>` cosimoc: yeah ok but the same applies there "the app knows when it is not doing work"
`<drago01>` cosimoc: but not sure it hurts that bad as a compositor
`<drago01>` cosimoc: that has do do full screen animations
`<rtcm>` gcampax: hey, I was thinking of doing a gjs 1.40.1 including all your patches from [bug 725024](https://bugzilla.gnome.org/show_bug.cgi?id=725024) - is there anything else that should be included?
`<Services>` Bug http://bugzilla.gnome.org/show_bug.cgi?id=725024 normal, Normal, ---, gjs-maint, RESOLVED FIXED, Some GC fixes, improvements and refactors
`<cosimoc>` drago01, the app doesn't really know in the GTK case
`<gcampax>` rtcm: i'd say you should include anything in master but paramspecs
`<cosimoc>` drago01, and that's the reason you want to do GC from a low idle, as that's done after drawing
`<gcampax>` rtcm: in particular the fundamental fixes are quite important (but not "fundamental: deduplicate code", that's from the paramspec series)
`<drago01>` rtcm: uh no we shouldnT' backport the "gc during animation" thing imo
* antoniof (~antonio@bl17-192-115.dsl.telepac.pt) hat #gnome-shell betreten
`<drago01>` rtcm: people are already complaining about 3.12 having worse performance then 3.10 that would make it even worse
`<drago01>` cosimoc: the point is you cannout control how long a gc takes
`<drago01>` cosimoc: so if it takes to long it will block your next frame
`<cosimoc>` drago01, I understand that - not a reason not to ever GC imo
`<drago01>` cosimoc: " not a reason not to ever GC" ... I didn't suggest that
`<drago01>` cosimoc: its not about the if
`<drago01>` cosimoc: its about the when
`<gcampax>` drago01: people are also complaining that shell takes gbs of memory - we finally have a fix for that, we should put that to use
`<drago01>` gcampax: not at that cost imo we could do better
`<gcampax>` drago01: besides, we're not running GCs during animations, we're running GCs during one specific animation - the initial loading the overview (which allocates a huge amount of stuff)
`<cosimoc>` drago01, using application-provided knowledge on what's the best moment to do GC is a good strategy
`<gcampax>` but that has never been fast, because you have texture submission, shader compiles and lots of mutex contention for the background threads anyway
`<rtcm>` gcampax: hmm, I'd like to keep it minimal - I actually just wanted a release for the crash on the shell zoom feature (Lionel's patch). perhaps then it's better if you do the release since you know the code
`<cosimoc>` drago01, I'm arguing that you can't just derive that knowledge from gjs itself
`<cosimoc>` drago01, and what works for the shell doesn't work for GTK apps
`<gcampax>` rtcm: ok, so no GC at all? I'd like to avoid cherry-picking here and there, branches have only been tested as a whole
`<drago01>` gcampax: do you have before and after perf numbers? i.e overviewfirstfps and memory after overview
`<drago01>` gcampax: that would tell us 1) does it hurt if yes how much 2) how much memory do we really save that way
`<gcampax>` drago01: no, just my feeling - fwiw, I don't even know if the ratelimiting helped really
`<gcampax>` drago01: is there a good way to get the number? one that works without debugging the perf tool, that is
`<drago01>` gcampax: what do you mean by debugging the perf tool? you have to just run jt (and comment out the gc calls from perf.js)
* satellit_e hat die Verbindung getrennt (Connection timed out)
`<rtcm>` gcampax: I mean, if you're confident that it's all 3.12.1 material then I'm fine with just spinning a tarball. it all seems to work for me but then I've never had the dreaded crash that adamw keeps reporting
* satellit_e (~satellit@c-174-61-128-216.hsd1.wa.comcast.net) hat #gnome-shell betreten
`<cosimoc>` rtcm, gcampax, please include the gtk override fix in the point release too
`<gcampax>` cosimoc: yes
* dave_largo hat die Verbindung getrennt (Client exiting)
`<gcampax>` rtcm: don't worry, i'll take care of it
`<drago01>` gcampax: re firefox http://mozilla.6506.n7.nabble.com/Garbage-collector-pauses-and-smooth-animation-td165808.html
`<gcampax>` we'll have the re-entrant gc on 3.12.1 (adamw's crash), but not the maybegc stuff
`<drago01>` gcampax: seems like people do complain about running gc during aninmation there as well (the pdf link is dead though)