I tried __GFP_NORETRY in the belief that __GFP_RECLAIM was effective. It
struggles with handling reclaim via kswapd (through inconsistency within
throttle_direct_reclaim(), and even then the race between multiple
allocators makes the two-step sequence of reclaim then allocate fragile).
As our buffers are always dirty (with very few exceptions), we require
kswapd to perform pageout on them. The only effective means of waiting on
kswapd is to retry the allocations (i.e. not set __GFP_NORETRY). That
leaves us invoking the oomkiller instead of propagating the allocation
failure back to userspace, where it can be handled more gracefully (one
hopes). In the future we may have __GFP_MAYFAIL to allow repeats up until
we genuinely run out of memory and the oomkiller would have been invoked.
Until then, let the oomkiller wreak havoc.

v2: Stop playing with side-effects of gfp flags and await __GFP_MAYFAIL

Fixes: 24f8e00a8a2e ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations")
Testcase: igt/gem_tiled_swapping
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
---
 drivers/gpu/drm/i915/i915_gem.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7286f5dd3e64..845df6067e90 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2406,7 +2406,20 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 			if (!*s) {
 				/* reclaim and warn, but no oom */
 				gfp = mapping_gfp_mask(mapping);
-				gfp |= __GFP_NORETRY;
+
+				/* Our bo are always dirty and so we require
+				 * kswapd to reclaim our pages (direct reclaim
+				 * performs no swapping on its own). However,
+				 * direct reclaim is meant to wait for kswapd
+				 * when under pressure, this is broken. As a
+				 * result __GFP_RECLAIM is unreliable and fails
+				 * to actually reclaim dirty pages -- unless
+				 * you try over and over again with
+				 * !__GFP_NORETRY. However, we still want to
+				 * fail this allocation rather than trigger
+				 * the out-of-memory killer and for this we
+				 * want the future __GFP_MAYFAIL.
+				 */
 			}
 		} while (1);

-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
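
The trade-off described in the commit message can be illustrated with a
toy userspace model. This is only a sketch and not kernel code: every
name in it (toy_mm, toy_kswapd_tick(), toy_alloc_*()) is hypothetical,
and it assumes a simplified world in which dirty pages can only be freed
by a background "kswapd" step, one page per retry. It shows the three
behaviours under discussion: fail fast (like __GFP_NORETRY), retry until
the OOM killer would fire (the behaviour this patch accepts), and retry
but report failure instead of OOM (the hoped-for __GFP_MAYFAIL).

```c
#include <stdbool.h>

/* Toy model only: all names below are hypothetical, none are kernel APIs. */
struct toy_mm {
	unsigned int free_pages;	/* pages available right now */
	unsigned int dirty_pages;	/* pages needing writeback before reuse */
};

enum toy_result { TOY_OK, TOY_ENOMEM, TOY_OOM_KILL };

/* Stand-in for kswapd: each tick cleans and frees one dirty page. */
static void toy_kswapd_tick(struct toy_mm *mm)
{
	if (mm->dirty_pages) {
		mm->dirty_pages--;
		mm->free_pages++;
	}
}

/* A single attempt; in this model it cannot clean dirty pages itself. */
static bool toy_try_alloc(struct toy_mm *mm, unsigned int want)
{
	if (mm->free_pages < want)
		return false;
	mm->free_pages -= want;
	return true;
}

/* Fail-fast behaviour: one attempt, no waiting for writeback. */
static enum toy_result toy_alloc_noretry(struct toy_mm *mm, unsigned int want)
{
	return toy_try_alloc(mm, want) ? TOY_OK : TOY_ENOMEM;
}

/* Retry while writeback can still make progress; otherwise "OOM kill". */
static enum toy_result toy_alloc_retry(struct toy_mm *mm, unsigned int want)
{
	for (;;) {
		if (toy_try_alloc(mm, want))
			return TOY_OK;
		if (!mm->dirty_pages)
			return TOY_OOM_KILL;
		toy_kswapd_tick(mm);
	}
}

/* Same retry loop, but report failure to the caller instead of OOM. */
static enum toy_result toy_alloc_mayfail(struct toy_mm *mm, unsigned int want)
{
	enum toy_result res = toy_alloc_retry(mm, want);

	return res == TOY_OOM_KILL ? TOY_ENOMEM : res;
}
```

In this model, a request for 3 pages with 1 free and 4 dirty fails under
toy_alloc_noretry() but succeeds under toy_alloc_retry() after two
writeback ticks. With only 1 free and 1 dirty page, toy_alloc_retry()
ends in TOY_OOM_KILL while toy_alloc_mayfail() returns TOY_ENOMEM.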