GNOME / mutter / Merge requests / !76

reduce noise during a crash

Merged: Ray Strode requested to merge wip/halfline/silence-x-io-errors into master on Apr 12, 2018
<adamw> mcatanzaro: mclasen: halfline: what do you guys think of https://bugzilla.redhat.com/show_bug.cgi?id=1556831 ? the reasoning kinda makes sense to me. is there a considered reason why shell explicitly aborts when it loses touch with xwayland? could we change that so we don't get these fairly useless tracebacks?
<adamw> (assuming we'd get an xwayland crash report filed instead, which would likely be more useful)
<halfline> yea i personally think it just adds noise
<halfline> same story on the other side
<mclasen> adamw: if it was easy to run without xwayland we would already do it. not sure it makes much of a difference which way we die
<halfline> the problem is whenever one side crashes both sides crash
<adamw> mclasen: the argument in the bug report is that shell should die in a way which doesn't cause abrt to kick in, basically
<halfline> and it takes effort to figure out which side crashed first
<halfline> we should suppress knock on crashes, since they're just noise not signal
<adamw> right, but this is the specific path where shell knows it lost connection to xwayland...it's actually *intentionally written to abort* in that case
<adamw> it calls g_error("lost connection to xwayland") or whatever the message is, that's where we get all these abrt reports for "lost connection to xwayland" from
<adamw> there's a direct link to the line in the bug: https://gitlab.gnome.org/GNOME/mutter/blob/7e17dd00/src/wayland/meta-xwayland.c#L417
<adamw> that's what he's suggesting changing
<mcatanzaro> We had a WebKit bug recently where the web process intentionally aborted if it lost connection to the network process
<mcatanzaro> Which should only happen when the network process crashes
<mcatanzaro> But the network process was not crashing
<mcatanzaro> This bug has caused something like 2000 crashes in the past couple days
<mcatanzaro> We would never have known if we removed the web process abort
<mcatanzaro> The bug reporter was not impressed when I said the crash was intentional, and tried to convince me to change it to an exit() instead, but then we would have zero crash reports for this issue.
<adamw> mcatanzaro: the expectation here is we'd get reports for the *xwayland* crash
<mcatanzaro> adamw: Yes of course that's the expectation... that was the expectation in the WebKit case too, that we'd get reports for the network process crash
<adamw> where i'm coming from here is https://bugzilla.redhat.com/show_bug.cgi?id=1510059#c303
<adamw> that is the bug which *every single crash of this kind in f27* is currently considered a duplicate of by libreport
<halfline> mcatanzaro: i'd rather miss an occasional bug than get flooded with noise
<mcatanzaro> Clearly something needs to change, but it could just as easily be handled by ABRT
<halfline> doing what ?
<mcatanzaro> I guess making any changes to ABRT is probably too much to expect, though
<halfline> what change would you propose to make to abrt ?
<mcatanzaro> halfline: ABRT has logic somewhere to ignore expected crashes like this
<halfline> why would that be better?
<adamw> i have filed a satyr issue on this too
<halfline> if it's ignoring them
<halfline> versus them not happening ?
<mcatanzaro> I assume it could still count them, but not open a bunch of bugzilla bugs.
<adamw> but yeah, i agree with halfline, it doesn't seem obviously better to abort and then make libreport ignore the abort, versus just exiting
<mcatanzaro> Then if the count goes way up, we can say: hmmm, problem.
<halfline> mcatanzaro: what would the count tell you?
<halfline> yea but what problem?
<halfline> more likely the problem is Xwayland is crashing
<halfline> or something
<halfline> the count doesn't really help you
<halfline> since the Xwayland crash will get shown separately
<halfline> unless you're saying you look at the number of xwayland crashes and the count and see if there's a big discrepancy ?
<halfline> we had a similar issue with gtk a while back btw
<adamw> yeah, the only problem i can see is if we for some reason *don't* get the xwayland crashes reported
<mcatanzaro> If Xwayland ever dies without leaving a core dump, or ABRT refuses to report the crash for whatever reason ("this backtrace is unusable" being a common culprit), then the XWayland crash won't be reported... anyway, it's fine either way, I'm just observing that we would have had a ton of trouble with this recent WebKit issue had we disabled the client process crash
<halfline> every time the display server went down every application would spam the log with a message saying as much
* adamw goes to look at xwayland crash reports, for that matter.
<halfline> totally not useful to see 50 apps all say "session is over" at the same time
<adamw> oh, yeah, we still get that with gnome :P
<adamw> but that's "just" logspam, at least it doesn't affect bugzilla.
<mcatanzaro> Ah good point, I forgot this happened once for every single application....
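
For reference, the abort-versus-exit distinction debated above can be sketched outside mutter. This is a rough illustration in Python, not mutter's code, and it assumes a POSIX system. One child process calls os.abort(), standing in for g_error(), which ends the process via abort(). The other calls sys.exit(1), standing in for a plain exit(). The names run_child and killed_by_signal are hypothetical helpers made up for this sketch. Crash reporters such as ABRT typically react to signal deaths like the first case, not to an ordinary nonzero exit like the second.

```python
import signal
import subprocess
import sys

# Stand-in for g_error(): the child raises SIGABRT against itself.
ABORTING_CHILD = "import os; os.abort()"
# Stand-in for a plain exit(): the child ends with a nonzero status, no signal.
EXITING_CHILD = "import sys; sys.exit(1)"


def run_child(snippet: str) -> int:
    """Run a Python snippet in a child process and return its return code."""
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        stderr=subprocess.DEVNULL,  # hide any message the child prints on abort
    )
    return result.returncode


def killed_by_signal(returncode: int) -> bool:
    """On POSIX, subprocess reports death by signal N as return code -N."""
    return returncode < 0


if __name__ == "__main__":
    for name, snippet in (("abort", ABORTING_CHILD), ("exit", EXITING_CHILD)):
        rc = run_child(snippet)
        kind = "killed by signal" if killed_by_signal(rc) else "ordinary exit"
        print(f"{name}: return code {rc} ({kind})")
```

Both children fail, but only the first one looks like a crash to the parent. That is the trade-off in the discussion: exiting keeps knock-on failures out of the crash reports, at the cost of losing the report entirely if the original Xwayland crash is never captured.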
Source branch: wip/halfline/silence-x-io-errors