On 2/20/23 16:10, Matthew Wilcox wrote:
On Mon, Feb 20, 2023 at 03:00:59PM +0100, Danilo Krummrich wrote:
On 2/17/23 20:38, Matthew Wilcox wrote:
On Fri, Feb 17, 2023 at 02:44:10PM +0100, Danilo Krummrich wrote:
Generic components making use of the maple tree (such as the
DRM GPUVA Manager) delegate the responsibility of ensuring mutual
exclusion to their users.
While such components could inherit the concept of an external lock,
some users might just serialize the access to the component and hence to
the internal maple tree.
In order to allow such use cases, add a new flag MT_FLAGS_LOCK_NONE to
indicate not to do any internal lockdep checks.
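To make the proposed flag concrete, a minimal sketch of a user that relies purely on serialized access might look like the following; mt_init_flags() and mtree_insert_range() are existing maple tree API, MT_FLAGS_LOCK_NONE is the flag proposed by this patch, and struct dummy_va is made up for illustration:

        #include <linux/maple_tree.h>

        static struct maple_tree mt;
        static struct dummy_va { unsigned long addr; } va;    /* hypothetical entry */

        static int dummy_init(void)
        {
                /* All accesses are serialized by the caller, so skip the
                 * internal lockdep checks (proposed MT_FLAGS_LOCK_NONE). */
                mt_init_flags(&mt, MT_FLAGS_LOCK_NONE);

                /* Insert &va for the range [0x1000, 0x1fff]; fails with
                 * -EEXIST if the range is already in use. */
                return mtree_insert_range(&mt, 0x1000, 0x1fff, &va, GFP_KERNEL);
        }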
I'm really against this change.
First, we really should check that users have their locking right.
It's bitten us so many times when they get it wrong.
In the case of the DRM GPUVA manager, some users might serialize the access
to the GPUVA manager and hence to its maple tree instances, e.g. through the
drm_gpu_scheduler. In such a case requiring them to hold a lock would be a
bit pointless and I wouldn't really know how to "sell" this to potential
users of the GPUVA manager.
This is why we like people to use the spinlock embedded in the tree.
There's nothing for the user to care about. If the access really is
serialised, acquiring/releasing the uncontended spinlock is a minimal
cost compared to all the other things that will happen while modifying
the tree.
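For comparison, a minimal sketch of what is being described here, using the existing default-locking API; mtree_store_range() and mtree_load() take care of the embedded spinlock internally, and the function names are made up:

        #include <linux/maple_tree.h>

        static DEFINE_MTREE(mt);    /* default flags: embedded spinlock is used */

        static int store_mapping(unsigned long first, unsigned long last, void *entry)
        {
                /*
                 * mtree_store_range() takes and drops the embedded lock
                 * internally; if the caller already serializes access, the
                 * lock is simply uncontended.
                 */
                return mtree_store_range(&mt, first, last, entry, GFP_KERNEL);
        }

        static void *lookup_mapping(unsigned long index)
        {
                /* Lookups run under RCU and don't take the spinlock at all. */
                return mtree_load(&mt, index);
        }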
I think for the users of the GPUVA manager we'd have two cases:
1) Accesses to the manager (and hence the tree) are serialized, no lock
needed.
2) Multiple operations on the tree must be locked in order to make them
appear atomic.
In either case the embedded spinlock wouldn't be useful, we'd either
need an external lock or no lock at all.
If there are any internal reasons why specific tree operations must be
serialized against each other (such as those you explain below), wouldn't it
make more sense to always have the internal lock and, optionally, allow
users to additionally specify an external lock?
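For reference, the external-lock variant exists today; a minimal sketch of it follows, where mgr_lock and the function names are made up. Note that MT_FLAGS_LOCK_EXTERN and mt_set_external_lock() only tell lockdep which lock to check for; the actual locking stays with the caller:

        #include <linux/maple_tree.h>
        #include <linux/spinlock.h>

        static DEFINE_SPINLOCK(mgr_lock);    /* hypothetical external lock */
        static struct maple_tree mt;

        static void mgr_init(void)
        {
                mt_init_flags(&mt, MT_FLAGS_LOCK_EXTERN);
                mt_set_external_lock(&mt, &mgr_lock);    /* lockdep checks against mgr_lock */
        }

        static int mgr_store(unsigned long first, unsigned long last, void *entry)
        {
                MA_STATE(mas, &mt, first, last);
                int ret;

                spin_lock(&mgr_lock);
                /* GFP_ATOMIC: the allocation must not sleep under the spinlock. */
                ret = mas_store_gfp(&mas, entry, GFP_ATOMIC);
                spin_unlock(&mgr_lock);

                return ret;
        }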
Second, having a lock allows us to defragment the slab cache. The
patches to do that haven't gone anywhere recently, but if we drop the
requirement now, we'll never be able to compact ranges of memory that
have slabs allocated to them.
Not sure if I get that; do you mind explaining a bit how this would affect
other users of the maple tree, such as my use case, the GPUVA manager?
When we want to free a slab in order to defragment memory, we need
to relocate all the objects allocated within that slab. To do that
for the maple tree node cache, for each node in this particular slab,
we'll need to walk up to the top of the tree and lock it. We can then
allocate a new node from a different slab, change the parent to point
to the new node and drop the lock. After an RCU delay, we can free the
slab and create a larger contiguous block of memory.
As I said, this is somewhat hypothetical in that there's no current
code in the tree to reclaim slabs when we're trying to defragment
memory. And that's because it's hard to do. The XArray and maple
tree were designed to make that possible for their slabs.
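Purely to make the sequence above concrete, here is a hypothetical sketch of the relocation step; as the message notes, no such code exists today, and every helper below apart from mtree_lock()/mtree_unlock() is invented for illustration:

        #include <linux/maple_tree.h>

        /* Hypothetical helpers, not in the kernel: */
        static struct maple_enode *hypothetical_dup_node(struct maple_enode *old);
        static void hypothetical_reparent(struct maple_tree *mt,
                                          struct maple_enode *old,
                                          struct maple_enode *new);

        /* Hypothetical: relocate one node out of the slab being reclaimed. */
        static int relocate_node(struct maple_tree *mt, struct maple_enode *old)
        {
                struct maple_enode *new;

                /*
                 * This is why the tree needs a lock the reclaim code can find
                 * and take; with the proposed MT_FLAGS_LOCK_NONE there would
                 * be nothing to lock here.
                 */
                mtree_lock(mt);

                new = hypothetical_dup_node(old);    /* allocate from a different slab */
                if (!new) {
                        mtree_unlock(mt);
                        return -ENOMEM;
                }

                hypothetical_reparent(mt, old, new);    /* parent now points at new */
                mtree_unlock(mt);

                /*
                 * The old node is freed after an RCU grace period; once every
                 * node in the slab has been relocated, the slab itself can be
                 * freed and the memory compacted.
                 */
                return 0;
        }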