Re: [PATCH] drm/i915: Switch obj->mm.lock lockdep annotations on its head

On Tue, Nov 05, 2019 at 10:49:41AM +0000, Matthew Auld wrote:
> On Tue, 5 Nov 2019 at 09:01, Daniel Vetter <daniel.vetter@xxxxxxxx> wrote:
> >
> > The trouble with having a plain nesting flag for locks which do not
> > naturally nest (unlike block devices and their partitions, which is
> > the original motivation for nesting levels) is that lockdep will
> > never spot a true deadlock if you screw up.
> >
> > This patch is an attempt at trying better, by highlighting a bit more
> > the actual nature of the nesting that's going on. Essentially we have
> > two kinds of objects:
> >
> > - objects without pages allocated, which cannot be on any lru and are
> >   hence inaccessible to the shrinker.
> >
> > - objects which have pages allocated, which are on an lru, and which
> >   the shrinker can decide to throw out.
> >
> > For the former type of object, memory allocations while holding
> > obj->mm.lock are permissible. For the latter they are not. And
> > get/put_pages transition an object between the two types.
> >
> > This is still not entirely fool-proof since the rules might change.
> > But as long as such code is ever run at runtime, lockdep should be
> > able to observe the inconsistency and complain (like with any other
> > lockdep class that we've split up in multiple classes). But there are
> > a few clear benefits:
> >
> > - We can drop the nesting flag parameter from
> >   __i915_gem_object_put_pages, because that function by definition is
> >   never going to allocate memory, and calling it on an object which
> >   doesn't have its pages allocated would be a bug.
> >
> > - We strictly catch more bugs, since there's now only one place in the
> >   entire tree which is annotated with the special class. All the
> >   other places that had explicit lockdep nesting annotations we're now
> >   going to leave up to lockdep again.
> >
> > - Specifically this catches stuff like calling get_pages from
> >   put_pages (which isn't really a good idea, if we can call get_pages
> >   so could the shrinker). I've seen patches do exactly that.
> >
> > Of course I fully expect CI will show me for the fool I am with this
> > one here :-)
> >
> > v2: There can only be one (lockdep only has a cache for the first
> > subclass, not for deeper ones, and we don't want to make these locks
> > even slower). Still separate enums for better documentation.
> >
> > Real fix: don't forget about phys objs and pin_map(), and fix the
> > shrinker to have the right annotations ... silly me.
> >
> > v3: Forgot userptr too ...
> >
> > v4: Improve comment for pages_pin_count, drop the IMPORTANT comment
> > and instead prime lockdep (Chris).
> >
> > v5: Appease checkpatch, no double empty lines (Chris)
> >
> > v6: More rebasing over selftest changes. Also somehow I forgot to
> > push this patch :-/
> >
> > Also format comments consistently while at it.
> >
> > v7: Fix typo in commit message (Joonas)
> >
> > Also drop the priming, with the lmem merge we now have allocations
> > while holding the lmem lock, which wrecks the generic priming I've
> > done in earlier patches. Should probably be resurrected when lmem is
> > fixed. See
> >
> > commit 232a6ebae419193f5b8da4fa869ae5089ab105c2
> > Author: Matthew Auld <matthew.auld@xxxxxxxxx>
> > Date:   Tue Oct 8 17:01:14 2019 +0100
> >
> >     drm/i915: introduce intel_memory_region
> >
> > I'm keeping the priming patch locally so it won't get lost.
> 
> Any idea how we can fix this? AFAIK for something like LMEM, its
> objects are always marked as !shrinkable, and so shouldn't be
> accessible from the shrinker.

On one hand I don't think you need to fix this, since it works.

Otoh I think it's generally good practice to not allocate memory (or at
least be very conscious about it) when holding memory manager locks.
Because sooner or later you somehow create a dependency from one memory
manager to the next, or something else, and then you end up with the
shrinker in your dependencies. In the locking rules for the new lmem
locking we've discussed this a bit, and agreed to just encode it as best
practice. Including lockdep priming (i.e. tell lockdep that we might get
at mm_lock from fs_reclaim, to make sure no one can allocate anything
while holding mm_lock).
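
To make the priming idea concrete, here is a rough userspace toy, not lockdep
itself. It only records which lock class was taken while another was held and
flags a direct inversion between two classes. All names (toy_acquire,
toy_prime, FS_RECLAIM, MM_LOCK, ...) are made up for illustration. The point
is just that recording "mm_lock taken inside reclaim" up front means a later
allocation under mm_lock is flagged even if real reclaim never ran there.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes for this toy model. */
enum lock_class { FS_RECLAIM, MM_LOCK, NUM_CLASSES };

/* order[a][b] is true once "b acquired while a was held" has been seen. */
static bool order[NUM_CLASSES][NUM_CLASSES];
static bool held[NUM_CLASSES];

static void toy_reset(void)
{
	memset(order, 0, sizeof(order));
	memset(held, 0, sizeof(held));
}

/* Returns false if taking c now inverts an order recorded earlier. */
static bool toy_acquire(enum lock_class c)
{
	bool ok = true;

	for (int h = 0; h < NUM_CLASSES; h++) {
		if (!held[h] || h == (int)c)
			continue;
		if (order[c][h])
			ok = false;	/* c -> h is known, now we see h -> c */
		order[h][c] = true;
	}
	held[c] = true;
	return ok;
}

static void toy_release(enum lock_class c)
{
	held[c] = false;
}

/* Priming: record once, up front, that mm_lock can be taken from reclaim. */
static void toy_prime(void)
{
	toy_acquire(FS_RECLAIM);
	toy_acquire(MM_LOCK);
	toy_release(MM_LOCK);
	toy_release(FS_RECLAIM);
}

/* Models an allocation under mm_lock: the allocation may enter reclaim. */
static bool toy_alloc_under_mm_lock(void)
{
	bool ok;

	toy_acquire(MM_LOCK);
	ok = toy_acquire(FS_RECLAIM);
	toy_release(FS_RECLAIM);
	toy_release(MM_LOCK);
	return ok;
}
```

Without toy_prime() the allocation path goes unnoticed until the reverse order
happens to be observed. With it, the very first allocation under the lock is
reported. The toy only checks direct two-class inversions, which is far less
than what lockdep's dependency graph does.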

Wrt fixing: preallocate, then take lock, is the standard pattern.
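
A minimal userspace sketch of that pattern, assuming a pthread mutex stands in
for the memory manager lock and a plain linked list for whatever the lock
protects (struct obj_list, struct node and obj_list_add are hypothetical names,
not i915 code):

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for state guarded by a memory manager lock. */
struct node {
	int value;
	struct node *next;
};

struct obj_list {
	pthread_mutex_t lock;	/* plays the role of the mm lock */
	struct node *head;
	int count;
};

static void obj_list_init(struct obj_list *l)
{
	pthread_mutex_init(&l->lock, NULL);
	l->head = NULL;
	l->count = 0;
}

/* Returns 0 on success, -1 if the allocation failed. */
static int obj_list_add(struct obj_list *l, int value)
{
	/* Step 1: allocate before taking the lock. */
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	n->value = value;

	/* Step 2: the critical section only links the preallocated node. */
	pthread_mutex_lock(&l->lock);
	n->next = l->head;
	l->head = n;
	l->count++;
	pthread_mutex_unlock(&l->lock);

	return 0;
}
```

Since nothing is allocated between lock and unlock, the allocator (and hence
anything it might call into) never runs with the lock held.
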
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch