On Fri, Dec 06, 2024 at 04:25:13PM +0100, Christian Brauner wrote: > So far we've been using an idr to track pidfs inodes. For some time now > each struct pid has a unique 64bit value that is used as the inode > number on 64 bit. That unique inode couldn't be used for looking up a > specific struct pid though. > > Now that we support file handles we need this ability while avoiding to > leak actual pid identifiers into userspace which can be problematic in > containers. > > So far I had used an idr-based mechanism where the idr is used to > generate a 32 bit number and each time it wraps we increment an upper > bit value and generate a unique 64 bit value. The lower 32 bits are used > to lookup the pid. > > I've been looking at the maple tree because it now has > mas_alloc_cyclic(). Since it uses unsigned long it would simplify the > 64bit implementation and its dense node mode supposedly also helps to > mitigate fragmentation. > > Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx> > --- > Hey Liam, > > Could you look this over and tell me whether my port makes any sense and > is safe? This is the first time I've been playing with the maple tree. > > I've also considerd preallocating the node during pid allocation outside > of the spinlock using mas_preallocate() similar to how idr_preload() > works but I'm unclear how the mas_preallocate() api is supposed to > work in this case. > > For the pidfs inode maple tree we use an external lock for add and > remove. While looking at the maple_tree code I saw that mas_nomem() > is called in mas_alloc_cyclic(). And mas_nomem() has a > __must_hold(mas->tree->ma_lock) annotation. But then the code checks > mt_external_lock() which is placed in a union with ma_lock iirc. So that > annotation seems wrong? > > bool mas_nomem(struct ma_state *mas, gfp_t gfp) > __must_hold(mas->tree->ma_lock) > { > if (likely(mas->node != MA_ERROR(-ENOMEM))) > return false; > > if (gfpflags_allow_blocking(gfp) && !mt_external_lock(mas->tree)) { > mtree_unlock(mas->tree); > mas_alloc_nodes(mas, gfp); > mtree_lock(mas->tree); > } else { > mas_alloc_nodes(mas, gfp); > } > > if (!mas_allocated(mas)) > return false; > > mas->status = ma_start; > return true; > } > > If you want to look at this in context I would please ask you to pull: > > https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs-6.14.pidfs Sorry, I meant work.pidfs.maple_tree.