Thursday 16 June 2022

Re: systemd-oomd issues on desktop



On Mon, Jun 13 2022 at 14:18:53 +0200, Lukas Märdian <slyon@ubuntu.com> wrote:
On 10.06.22 at 12:17, Sebastien Bacher wrote:
On 10/06/2022 at 11:40, Julian Andres Klode wrote:
The bug reports we see show that systemd-oomd is working correctly: the browser gets killed and the system remains responsive, instead of locking up as it usually would.
It might be working 'correctly' but is not perceived as such by users. I've seen regular complaints from users since the release stating that their browser or code editor got closed in front of them without warning, on a machine they had been using for years with the same configuration and software without issue. They might be running short on resources, but in practice they never experienced a sluggish system because of it and just see the feature as buggy.
I agree with Julian that systemd-oomd in general seems to be working as expected. Its whole purpose is to jump in _before_ a system reaches its point of no return and swaps itself to an unresponsive death. Therefore, I feel we should not raise the recommended "SwapUsedLimit=90%" default much higher, i.e. option (1), as that could lead to situations where it is already too late to free some memory, defeating the purpose of having sd-oomd.

OTOH, receiving those bug reports shows that sd-oomd is not yet properly tuned either, since it kills people's "important" applications (such as the browser) first. This is especially unfortunate if the browser does its own memory monitoring to discard/unload unused tabs and free up memory, as suggested by Olivier. Option (3), recommended by Nick, could be a viable choice in the Ubuntu context (only 1G swap available) for the time being, until we have a proper upstream solution (using notifications and hooks) [1]. Thanks for bringing this up with the upstream developers, Michel!

I wonder if we could use a more selective approach, though, using "OOMScoreAdjust=" in the systemd.exec environment (i.e. the GNOME Shell launcher in Ubuntu's context, as sd-oomd is currently only enabled on Ubuntu Desktop) [2], to reduce the probability of certain "important" apps being killed first, e.g. by maintaining an allow-list of such apps. Of course we do not want to introduce different classes of apps arbitrarily; we would need to come up with a proper policy for which apps are eligible for a lower "OOMScoreAdjust" value. I feel that having an application-level mechanism to keep memory consumption under control, such as a browser's tab unloading, could be a fair eligibility criterion.
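To make the "SwapUsedLimit=90%" discussion concrete, here is a minimal Python sketch of what a swap-usage threshold check looks like. It is not systemd-oomd's implementation: the function names are hypothetical, the input is /proc/meminfo-style text, and the strict "greater than" comparison is an assumption.

```python
def swap_used_percent(meminfo_text: str) -> float:
    """Return swap usage in percent, parsed from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[key.strip()] = int(parts[0])  # first token is the number (kB for sizes)
    total = fields.get("SwapTotal", 0)
    free = fields.get("SwapFree", 0)
    if total == 0:
        return 0.0  # no swap configured, so nothing can be "used"
    return 100.0 * (total - free) / total


def over_swap_limit(meminfo_text: str, limit_percent: float = 90.0) -> bool:
    """Hypothetical check: is swap usage above the limit (assumed strict '>')?"""
    return swap_used_percent(meminfo_text) > limit_percent
```

With the 1G swap mentioned above, a 90% limit leaves only about 100M of swap headroom once the threshold is crossed, which is why raising the limit further narrows the window in which anything can still react.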

I'm not sure to what extent this could acceptably be backported to 22.04, but I understand that we actually have all the infrastructure in place so that GNOME Shell could adjust the *foreground* application's OOM score.

systemd-oomd has generally been a great improvement on my 16GiB RAM + 2GiB swap laptop, turning ~1/day hard lockups due to OOM conditions into something being relatively cleanly terminated, but it would be nicer still if it could preferentially kill Evolution chugging along in the background rather than the Firefox window I'm currently using (or the terminal window I'm currently compiling in).

That, and a notification that an OOM condition was averted and <APPLICATION> was killed to avoid it would be a significant improvement in behaviour.
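
As a rough illustration of the foreground-preference idea, the following Python sketch writes a per-process value to the kernel's /proc/<pid>/oom_score_adj file (valid range -1000 to 1000). Everything else is an assumption made for illustration: the function names, the 200/0 values, and the idea that a shell would call this on focus changes. The kernel OOM killer consults this value; nothing in this thread establishes that systemd-oomd does.

```python
from pathlib import Path

OOM_SCORE_ADJ_MIN = -1000  # kernel-defined lower bound
OOM_SCORE_ADJ_MAX = 1000   # kernel-defined upper bound


def set_oom_score_adj(pid: int, value: int, proc_root: str = "/proc") -> int:
    """Write a clamped oom_score_adj for pid and return the value written.

    proc_root is a hypothetical parameter so the sketch can be tried
    against a scratch directory instead of the real /proc.
    """
    clamped = max(OOM_SCORE_ADJ_MIN, min(OOM_SCORE_ADJ_MAX, value))
    Path(proc_root, str(pid), "oom_score_adj").write_text(f"{clamped}\n")
    return clamped


def prefer_foreground(pids, foreground_pid, background_adj=200,
                      foreground_adj=0, proc_root="/proc"):
    """Make background processes likelier victims than the foreground one.

    The 200/0 defaults are illustrative, not a recommended policy.
    """
    applied = {}
    for pid in pids:
        adj = foreground_adj if pid == foreground_pid else background_adj
        applied[pid] = set_oom_score_adj(pid, adj, proc_root)
    return applied
```

Raising a process's value is unprivileged, while lowering it generally requires CAP_SYS_RESOURCE, so a real implementation would need to consider who performs the write.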