On Sat, Jan 4, 2020 at 2:30 PM drago01 <drago01@xxxxxxxxx> wrote:
>
> On Sat, Jan 4, 2020 at 7:32 PM Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
> >
> > It might be. And it might need to be tweaked. Perhaps 6% for SIGTERM
> > and 3% for SIGKILL. Or even 5% and 2.5%. For sure using a percentage
> > of RAM and swap is too simplistic. But it's easy for users to
> > understand. Something more sophisticated, based on kernel pressure
> > stall information would likely be better, and folks are working on
> > that.
>
> Yes that would be a way better metric than a percent value which is
> either too close to full RAM or too early if you have lots of RAM.
> 6% of 4GB is about 245MB, while for 32GB it's almost 2GB - killing
> processes while you have 2GB left is just wasteful.

If there's a swap device, that won't happen. The case where SIGTERM
really happens at 10% RAM free is when there's no swap device. And even
though the no-swap configuration is not a default, and is explicitly
not recommended right now by the installer (as in, if you try to do
such an installation, it warns you), it is a configuration we allow.
I happen to know it's somewhat common among developers with systems
with lots of RAM, expressly because swap thrashing, even to SSD,
results in such poor UX.

Consider the following 'vmstat 10' while doing a compile:

procs -----------memory---------- ----swap--- -----io---- --system--- ------cpu-----
 r  b    swpd    free buff  cache    si    so    bi    bo    in    cs us sy id wa st
 6 11 4168060 1821580   40 736604 30234 10841 46533 13805 19230 29799 74 12  1 13  0

At this time, the GUI was completely unresponsive, not even the mouse
arrow moved, for about 1 minute. Seemingly plenty of RAM and swap, and
idle CPU. But rather heavy swap in and out.
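To make the vmstat sample above easier to read, here is a minimal Python
sketch that pairs each column name with its value and adds up the swap
traffic. The helper names (parse_vmstat_row, swap_traffic_kib) are
hypothetical and only for illustration, and the sketch assumes vmstat's
default KiB units for the memory and swap columns.

```python
# Header and row copied from the 'vmstat 10' sample in this message.
HEADER = "r b swpd free buff cache si so bi bo in cs us sy id wa st"
ROW = ("6 11 4168060 1821580 40 736604 30234 10841 "
       "46533 13805 19230 29799 74 12 1 13 0")


def parse_vmstat_row(header, row):
    """Pair each vmstat column name with its integer value."""
    names = header.split()
    values = [int(v) for v in row.split()]
    if len(names) != len(values):
        raise ValueError("header and row have different column counts")
    return dict(zip(names, values))


def swap_traffic_kib(sample):
    """Combined swap-in plus swap-out rate (si + so), assumed KiB/s."""
    return sample["si"] + sample["so"]


sample = parse_vmstat_row(HEADER, ROW)
print(f"free RAM:     {sample['free'] / 1024:.0f} MiB")
print(f"swap traffic: {swap_traffic_kib(sample) / 1024:.1f} MiB/s")
```

For this sample the sketch reports roughly 40 MiB/s of combined swap
traffic while about 1.7 GiB of RAM is still listed as free, which is
the mismatch a plain free-percentage threshold cannot see.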
10  9 4459648  200912   40 569260 11218 18856 28846 19997 15164 35256 28  9  9 53  0
 6  8 4207328  807092   40 636156 26205 16744 35472 18287 20179 34087 62 12  3 23  0

At these two lines, the mouse arrow is stuttering and the GUI is very
sluggish, even unresponsive much of the time.

Jan 04 15:37:18 fmac.local earlyoom[4896]: mem avail: 1212 of 7865 MiB (15 %), swap free: 4807 of 8195 MiB (58 %)

That is from near the same time. The system is nowhere near either RAM
or swap exhaustion, but swap si/so are high. This is an SSD, BTW. Can I
get to the compile and force quit it? Eventually, but it would take a
couple of minutes. Good progress is being made with the compile during
this whole time, though.

With default settings, earlyoom doesn't SIGTERM this compile until
after 20 minutes of this behavior. So it really isn't solving the
sluggish, stuttering problem. What it does do is SIGTERM the compile
before the system gets to a state where essentially all of the work is
swap in and swap out, and no other work is being done. Here is the
output (2 week expiration):
https://pastebin.com/0iZHNjg7

Retest with no swap at all, and yes, the compile gets a SIGTERM when
free memory gets to 10% (because swap is already considered to be 0%
free, since it doesn't exist). But also? The system isn't under any
swap I/O duress. The system is completely responsive throughout. This
is why we see developers giving up on swap partitions entirely.
swap-on-ZRAM might be a compromise. That's related issue #120.

> > That's not a fix either, it's a workaround that papers over the
> > problem. Same as earlyoom, except RAM costs money, and may not be an
> > option due to hardware limitations. A modern operating system needs to
> > know better than to allow unprivileged processes to take down the
> > whole system.
>
> I think you misunderstood me.
> Yes, the OS should behave better than this, but if you are running a
> server you don't want your DB or web server to be unreachable because
> the system runs out of memory - the only way to "fix" that
> is to provide enough resources. No amount of OOM killing would help
> you here. The system may be up but not the server process the machine
> is running for ...

Perhaps, but two points:

a. This feature is for Workstation. If the Server working group wants
to give it a go, that's up to them. But they may prefer experimenting
with more server-oriented user space OOM daemons, like recent versions
of oomd. And for that use case, Facebook (and others) have investigated
this and found that avoiding OOM, even by killing processes, is far
less bad than the system hanging itself. As in, better for recovery and
better for limited sysadmin resources. There's a video about it from
the recent All Systems Go conference.

b. earlyoom does SIGTERM first. I have yet to see a single process
(hundreds of tests, but that's really nothing, and also not a
scientific sample) that doesn't respond to SIGTERM, such that SIGKILL
is needed.

> > > And btw we should really update the minimum memory requirements in our documentation, the current ones have nothing to do with reality (if you want a pleasant user experience).
> >
> > Can you be more specific?
> >
> > On getfedora.org it reads:
> > Fedora requires a minimum of 20GB disk, 2GB RAM, to install and run
> > successfully. Double those amounts is recommended.
>
> I simply do not think 2GB is sufficient; the "recommended double", i.e.
> 4GB, should be the "required" and drop the double part altogether.
> A modern desktop with apps on top will not run well enough on 2GB,
> let's stop pretending it does. But anyways that's off topic as it is
> not part of the proposal.

The Workstation working group recently bumped this from 1G minimum, 2G
recommended. We're considering VMs with these numbers.
And as a comparative point of reference, Windows 10 64-bit is also 2G
minimum.

--
Chris Murphy
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx