On Fri 09-06-17 12:03:47, Chris Wilson wrote:
> I tried __GFP_NORETRY in the belief that __GFP_RECLAIM was effective. It
> struggles with handling reclaim of our dirty buffers and relies on
> reclaim via kswapd. As a result, a single pass of direct reclaim is
> unreliable when i915 occupies the majority of available memory, and the
> only means of effectively waiting on kswapd to make progress is by not
> setting the __GFP_NORETRY flag and looping. That leaves us with the
> dilemma of invoking the oomkiller instead of propagating the allocation
> failure back to userspace where it can be handled more gracefully (one
> hopes). In the future we may have __GFP_MAYFAIL to allow repeats up until
> we genuinely run out of memory and the oomkiller would have been invoked.
> Until then, let the oomkiller wreak havoc.
>
> v2: Stop playing with side-effects of gfp flags and await __GFP_MAYFAIL
> v3: Update comments that direct reclaim only appears to be ignoring our
>     dirty buffers!
>
> Fixes: 24f8e00a8a2e ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations")
> Testcase: igt/gem_tiled_swapping
> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
> Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
> Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxx>

OK, this looks good to me. I will follow up with my __GFP_RETRY_MAYFAIL
series and will convert this caller to use it sometime next week.

Thanks!

> ---
>  drivers/gpu/drm/i915/i915_gem.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index ca61a0be1458..b22145a876c5 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2404,7 +2404,20 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
>  			if (!*s) {
>  				/* reclaim and warn, but no oom */
>  				gfp = mapping_gfp_mask(mapping);
> -				gfp |= __GFP_NORETRY;
> +
> +				/* Our bo are always dirty and so we require
> +				 * kswapd to reclaim our pages (direct reclaim
> +				 * does not effectively begin pageout of our
> +				 * buffers on its own). However, direct reclaim
> +				 * only waits for kswapd when under allocation
> +				 * congestion. So as a result __GFP_RECLAIM is
> +				 * unreliable and fails to actually reclaim our
> +				 * dirty pages -- unless you try over and over
> +				 * again with !__GFP_NORETRY. However, we still
> +				 * want to fail this allocation rather than
> +				 * trigger the out-of-memory killer and for
> +				 * this we want the future __GFP_MAYFAIL.
> +				 */
>  			}
>  		} while (1);
>
> --
> 2.11.0
>

--
Michal Hocko
SUSE Labs
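
For readers following the thread, the trade-off in the commit message can be
illustrated with a toy userspace model. This is only a sketch under loud
assumptions: every name here (toy_mm, toy_alloc_page, TOY_NORETRY, ...) is
hypothetical and not a kernel API, the flag value is made up, and the model
simply assumes that a direct-reclaim pass never frees dirty pages while a
background thread (standing in for kswapd) cleans a few between retries.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for __GFP_NORETRY; not the real kernel value. */
#define TOY_NORETRY 0x1u

enum toy_result { TOY_OK, TOY_ENOMEM, TOY_OOM_KILL };

/* Toy memory state; all fields are illustrative assumptions. */
struct toy_mm {
	int free_pages;         /* pages that can be handed out right now */
	int dirty_pages;        /* dirty pages only background reclaim can free */
	int writeback_per_pass; /* pages the background thread cleans per retry */
};

/* One allocation attempt: in this model it cannot touch dirty pages. */
static bool toy_try_alloc(struct toy_mm *mm)
{
	if (mm->free_pages > 0) {
		mm->free_pages--;
		return true;
	}
	return false;
}

/* Background reclaim step; returns how many pages it freed. */
static int toy_background_reclaim(struct toy_mm *mm)
{
	int n;

	assert(mm->writeback_per_pass >= 0);
	n = mm->dirty_pages < mm->writeback_per_pass ?
		mm->dirty_pages : mm->writeback_per_pass;
	mm->dirty_pages -= n;
	mm->free_pages += n;
	return n;
}

/*
 * With TOY_NORETRY a single failed pass is reported as TOY_ENOMEM.
 * Without it the caller loops, giving background reclaim a chance to
 * make progress; once nothing more can be freed, the model reports
 * TOY_OOM_KILL instead of a failure the caller could handle.
 */
static enum toy_result toy_alloc_page(struct toy_mm *mm, unsigned int flags)
{
	for (;;) {
		if (toy_try_alloc(mm))
			return TOY_OK;
		if (flags & TOY_NORETRY)
			return TOY_ENOMEM;
		if (toy_background_reclaim(mm) == 0)
			return TOY_OOM_KILL;
	}
}
```

In this model, a state with no free pages but some dirty ones fails
immediately under TOY_NORETRY and succeeds after one retry without it, while
a state with nothing left to clean ends in TOY_OOM_KILL. A "retry but may
fail" flag, as discussed above, would presumably turn that last case into an
error return instead; the model does not attempt to show how the kernel
would implement that.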