So there's something wrong with the memory allocation changes since 4.11, which seem to be mostly credited to Chris Wilson.

In particular, I got this earlier today:

    Xorg: page allocation failure: order:0, mode:0x14210d2(GFP_HIGHUSER|__GFP_NORETRY|__GFP_RECLAIMABLE), nodemask=(null)

and then soon afterwards in the log I see

    chrome[13102]: segfault at 968 ip 00007f472a7fda83 sp 00007fffab9a6ef0 error 4 in libX11.so.6.3.0[7f472a7d1000+138000]
    gnome-session-f[13115]: segfault at 0 ip 00007f7e765ab4b9 sp 00007ffca5990470 error 4 in libgtk-3.so.0.2200.15[7f7e762cc000+6f9000]

which I assume is related to broken error handling.

So there's at least two bugs here:

 (a) order-0 memory allocation failure is generally a sign of something bad. We clearly give up *much* too easily.

 (b) using __GFP_NORETRY and wanting the memory failure, but then not using __GFP_NOWARN is just stupid.

Now, (b) initially made me go "I'll just add that __GFP_NOWARN myself". Because it's true - if you intentionally tell the VM subsystem that you'd rather get a failed allocation than try a bit harder, then you obviously shouldn't get the warning either. I think the VM people have talked about just considering NORETRY to imply NOWARN.

However, the fact that this actually caused problems in downstream user space, and the fact that this happened with an order-0 allocation made me re-consider. That allocation clearly *is* important, and returning NULL may "work" from a kernel standpoint, but it sure as hell didn't work in the bigger picture, now did it?

So the warning was actually good in this case. This may in fact be an example of why GFP_NORETRY should *not* imply NOWARN.

So instead of shutting up the warning, I pass it over to the i915 people. Making that allocation fail easily wasn't such a great idea after all, was it? Maybe that NORETRY should be reconsidered, at least for important (perhaps small?) allocations?
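As an aside, the mode value in the failure message can be decoded by hand to confirm point (b): __GFP_NORETRY is set and __GFP_NOWARN is not. The following is a minimal userspace sketch, not kernel code. The bit values are an assumption, recalled from the 4.12-era include/linux/gfp.h, and should be checked against the tree you are actually running; the GFP_BIT_* names and both helper functions are hypothetical and exist only for this illustration.

```c
#include <stdio.h>

/*
 * ASSUMPTION: bit values as recalled from the 4.12-era
 * include/linux/gfp.h. They have changed between kernel
 * versions, so verify them against your own tree.
 */
#define GFP_BIT_HIGHMEM        0x02u
#define GFP_BIT_RECLAIMABLE    0x10u
#define GFP_BIT_IO             0x40u
#define GFP_BIT_FS             0x80u
#define GFP_BIT_NOWARN         0x200u
#define GFP_BIT_NORETRY        0x1000u
#define GFP_BIT_HARDWALL       0x20000u
#define GFP_BIT_DIRECT_RECLAIM 0x400000u
#define GFP_BIT_KSWAPD_RECLAIM 0x1000000u

struct gfp_name {
	unsigned int bit;
	const char *name;
};

/* Only the flags relevant to this report are listed. */
static const struct gfp_name gfp_names[] = {
	{ GFP_BIT_HIGHMEM,        "__GFP_HIGHMEM" },
	{ GFP_BIT_RECLAIMABLE,    "__GFP_RECLAIMABLE" },
	{ GFP_BIT_IO,             "__GFP_IO" },
	{ GFP_BIT_FS,             "__GFP_FS" },
	{ GFP_BIT_NOWARN,         "__GFP_NOWARN" },
	{ GFP_BIT_NORETRY,        "__GFP_NORETRY" },
	{ GFP_BIT_HARDWALL,       "__GFP_HARDWALL" },
	{ GFP_BIT_DIRECT_RECLAIM, "__GFP_DIRECT_RECLAIM" },
	{ GFP_BIT_KSWAPD_RECLAIM, "__GFP_KSWAPD_RECLAIM" },
};

/*
 * Print the flag names found in mask, separated by '|'.
 * Returns whatever bits the table above did not account for,
 * so 0 means the mask was fully decoded.
 */
static unsigned int gfp_print(unsigned int mask, FILE *out)
{
	const char *sep = "";
	size_t i;

	fprintf(out, "0x%x: ", mask);
	for (i = 0; i < sizeof(gfp_names) / sizeof(gfp_names[0]); i++) {
		if (mask & gfp_names[i].bit) {
			fprintf(out, "%s%s", sep, gfp_names[i].name);
			sep = "|";
			mask &= ~gfp_names[i].bit;
		}
	}
	fputc('\n', out);
	return mask;
}

/* True when the caller asked for easy failure but left the warning on. */
static int gfp_fails_loudly(unsigned int mask)
{
	return (mask & GFP_BIT_NORETRY) && !(mask & GFP_BIT_NOWARN);
}
```

Calling gfp_print(0x14210d2u, stdout) from a small main() lists the individual bits behind the GFP_HIGHUSER|__GFP_NORETRY|__GFP_RECLAIMABLE summary in the log line, and gfp_fails_loudly(0x14210d2u) reports whether the mask combines NORETRY with a still-enabled warning.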
Also adding some VM people, because I think it's ridiculous that the 0-order allocation failed in the first place. Full report attached, there's tons of memory that should have been trivial to free.

So I suspect GFP_NORETRY ends up being *much* too aggressive, and basically doesn't even try any trivial freeing. Maybe we want some middle ground between "retry forever" and "don't try at all". In people trying to fight the "retry forever", we seem to have gone too far in the "don't even bother, just return NULL" direction.

Added a random mixture of i915 and MM people. Feel free to send this message further if you feel I missed somebody,

Linus
Attachment:
gfp-noretry
Description: Binary data