On Fri, Apr 25, 2014 at 10:18:57AM -0700, Dave Hansen wrote: > On 04/25/2014 12:23 AM, Chris Wilson wrote: > > On Thu, Apr 24, 2014 at 03:35:47PM -0700, Dave Hansen wrote: > >> On 04/24/2014 08:39 AM, Chris Wilson wrote: > >>> On Thu, Apr 24, 2014 at 08:21:58AM -0700, Dave Hansen wrote: > >>>> Is it possible that there's still a get_page() reference that's holding > >>>> those pages in place from the graphics code? > >>> > >>> Not from i915.ko. The last resort of our shrinker is to drop all page > >>> refs held by the GPU, which is invoked if we are asked to free memory > >>> and we have no inactive objects left. > >> > >> How sure are we that this was performed before the OOM? > > > > Only by virtue of how shrink_slabs() works. > > Could we try to raise the level of assurance there, please? :) > > So this "last resort" is i915_gem_shrink_all()? It seems like we might > have some problems getting down to that part of the code if we have > problems getting the mutex. In general, but not in this example where the load is tightly controlled. > We have tracepoints for the shrinkers in here (it says slab, but it's > all the shrinkers, I checked): > > /sys/kernel/debug/tracing/events/vmscan/mm_shrink_slab_*/enable > and another for OOMs: > /sys/kernel/debug/tracing/events/oom/enable > > Could you collect a trace during one of these OOM events and see what > the i915 shrinker is doing? Just enable those two and then collect a > copy of: > > /sys/kernel/debug/tracing/trace > > That'll give us some insight about how well the shrinker is working. If > the VM gave up on calling in to it, it might reveal why we didn't get > all the way down in to i915_gem_shrink_all(). I'll add it to the list for QA to try. > > Thanks for the pointer to > > register_oom_notifier(), I can use that to make sure that we do purge > > everything from the GPU, and do a sanity check at the same time, before > > we start killing processes. > > Actually, that one doesn't get called until we're *SURE* we are going to > OOM. Any action taken in there won't be taken in to account. blocking_notifier_call_chain(&oom_notify_list, 0, &freed); if (freed > 0) /* Got some memory back in the last second. */ return; That looks like it should abort the oom and so repeat the allocation attempt? Or is that too hopeful? > >> Also, forgive me for being an idiot wrt the way graphics work, but are > >> there any good candidates that you can think of that could be holding a > >> reference? I've honestly never seen an OOM like this. > > > > Here the only place that we take a page reference is in > > i915_gem_object_get_pages(). We do this when we first bind the pages > > into the GPU's translation table, but we only release the pages once the > > object is destroyed or the system experiences memory pressure. (Once the > > GPU touches the pages, we no longer consider them to be cache coherent > > with the CPU and so migrating them between the GPU and CPU requires > > clflushing, which is expensive.) > > > > Aside from CPU mmaps of the shmemfs filp, all operations on our > > graphical objects should lead to i915_gem_object_get_pages(). However > > not all objects are recoverable as some may be pinned due to hardware > > access. > > In that oom callback, could you dump out the aggregate number of > obj->pages_pin_count across all the objects? That would be a very > interesting piece of information to have. It would also be very > insightful for folks who see OOMs in practice with i915 in their systems. Indeed. -Chris -- Chris Wilson, Intel Open Source Technology Centre -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>