On Thursday 01 November 2012 16:20:03 Chris Wilson wrote:
> On Thu, 1 Nov 2012 09:04:02 -0700, Jesse Barnes <jbarnes at virtuousgeek.org> wrote:
> > On Thu, 01 Nov 2012 15:52:23 +0000
> >
> > Chris Wilson <chris at chris-wilson.co.uk> wrote:
> > > Actually I've justified the blocking here to myself, and prefer it to
> > > simply running the crtc->unpin_work. If userspace is swamping the system
> > > so badly that we can't run the kthreads quick enough, it deserves a stall.
> > > Note that the unpin leak is still about the 3rd most common bug in
> > > fedora, so this stall will be forced on many machines.
> >
> > Hm funky, why does Fedora hit it so much? Does some of the GNOME shell
> > stuff run unthrottled or something?
>
> I don't think so. I trust that in Tvrtko's use case, he is not so much as
> hogging the GPU as keeping the system as a whole relatively busy. So I
> suspect it is more to do with CPU starvation of the kthreads than
> anything else.
>
> Tvrtko, do you have any feeling for why your machine was easily
> susceptible to this leak? Are the stalls noticeable and do they affect
> your performance targets?

We didn't bother looking for any stalls, but for a long time we were
occasionally hitting this pin_count BUG in i915_gem_object_pin. So it did
not so much affect our performance targets as completely wreck our system.

If this patch causes an occasional stall instead, given that this bug
triggers every 3-4 hours of uptime, we are fine with that. If a frame or so
is missed every couple of hours on low end hardware, we don't care that much.

More on the actual workload...

Only recently we got lucky and found a platform and workload where it
happens reliably. And this patch reliably fixes that.

In this workload the CPU is loaded 50-60% decoding a movie and rendering it
to a full screen window. Our proprietary compositor page flips at 60Hz only,
not faster.
There is also another small semi-transparent window being rendered on top
of the full screen movie. The movie played is a 25fps one, which means the
full screen window is damaged in 25 out of 60 frames (give or take); that is
when we render to our back buffer and page flip at the vsync rate (60Hz).

According to the intel_gpu_top tool, GPU load is roughly 40%, apart from the
"Framebuffer Compression" metric, which is maxed out, if that one is at all
valid.

This particular scenario triggers the bug only on two of our Atom based
platforms, both with an NM10/Pineview G/i915 chipset.

Tvrtko
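
For illustration only, the "25 out of 60 frames" figure above can be reproduced with a small model. This is a sketch under stated assumptions, not the compositor's code: the function name `damaged_vsyncs` is hypothetical, movie frames are assumed to arrive evenly spaced, the window is assumed to be damaged in any vsync interval in which a new movie frame becomes due, and the small overlay window is ignored.

```python
def damaged_vsyncs(movie_fps: int, refresh_hz: int) -> list[int]:
    """Return the vsync intervals (within one second) in which a new
    movie frame becomes due, i.e. in which the window is damaged.

    Hypothetical helper: assumes evenly spaced movie frames and uses
    integer arithmetic only, so there is no floating point rounding.
    """
    damaged = []
    for v in range(refresh_hz):
        # Movie frames due by the start and by the end of interval v.
        frames_before = (v * movie_fps) // refresh_hz
        frames_after = ((v + 1) * movie_fps) // refresh_hz
        if frames_after > frames_before:
            damaged.append(v)
    return damaged


if __name__ == "__main__":
    # 25fps movie on a 60Hz display, as in the workload described above.
    print(len(damaged_vsyncs(25, 60)))  # prints 25
```

Under these assumptions the damage is spread unevenly across the second (intervals 2, 4, 7, ... in this model), so the gaps between flips vary between two and three vsync periods.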