Quoting Pavel Machek (2020-08-19 20:33:26)
> Hi!
>
> > > > If we hit an error during construction of the reloc chain, we need to
> > > > replace the chain into the next batch with the terminator so that upon
> > > > flushing the relocations so far, we do not execute a hanging batch.
> > >
> > > Thanks for the patches. I assume this should fix problem from
> > > "5.9-rc1: graphics regression moved from -next to mainline" thread.
> > >
> > > I have applied them over current -next, and my machine seems to be
> > > working so far (but uptime is less than 30 minutes).
> > >
> > > If the machine still works tomorrow, I'll assume problem is solved.
> >
> > Aye, best wait until we have to start competing with Chromium for
> > memory... The suspicion is that it was the resource allocation failure
> > path.
>
> Yep, my machines are low on memory.
>
> But ... test did not work that well. I have dead X and blinking
> screen. Machine still works reasonably well over ssh, so I guess
> that's an improvement.

Well my last remaining 32bit gen3 device is currently pushing up the
daisies, so could you try removing the attempt to use WC? Something like

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 44df98d85b38..b26f7de913c3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -955,10 +955,7 @@ static u32 *__reloc_gpu_map(struct reloc_cache *cache,
 {
 	u32 *map;
 
-	map = i915_gem_object_pin_map(pool->obj,
-				      cache->has_llc ?
-				      I915_MAP_FORCE_WB :
-				      I915_MAP_FORCE_WC);
+	map = i915_gem_object_pin_map(pool->obj, I915_MAP_FORCE_WB);

on top of the previous patch. Faultinjection didn't turn up anything in
eb_relocate_vma, so we need to dig deeper.
-Chris