[PATCH RFC] pidfs: use maple tree
Christian Brauner
brauner at kernel.org
Fri Dec 6 07:31:21 PST 2024
On Fri, Dec 06, 2024 at 04:25:13PM +0100, Christian Brauner wrote:
> So far we've been using an idr to track pidfs inodes. For some time now
> each struct pid has a unique 64bit value that is used as the inode
> number on 64 bit. That unique inode couldn't be used for looking up a
> specific struct pid though.
>
> Now that we support file handles we need this ability while avoiding to
> leak actual pid identifiers into userspace which can be problematic in
> containers.
>
> So far I had used an idr-based mechanism where the idr is used to
> generate a 32 bit number and each time it wraps we increment an upper
> bit value and generate a unique 64 bit value. The lower 32 bits are used
> to lookup the pid.
>
> I've been looking at the maple tree because it now has
> mas_alloc_cyclic(). Since it uses unsigned long it would simplify the
> 64bit implementation and its dense node mode supposedly also helps to
> mitigate fragmentation.
>
> Signed-off-by: Christian Brauner <brauner at kernel.org>
> ---
> Hey Liam,
>
> Could you look this over and tell me whether my port makes any sense and
> is safe? This is the first time I've been playing with the maple tree.
>
> I've also considerd preallocating the node during pid allocation outside
> of the spinlock using mas_preallocate() similar to how idr_preload()
> works but I'm unclear how the mas_preallocate() api is supposed to
> work in this case.
>
> For the pidfs inode maple tree we use an external lock for add and
> remove. While looking at the maple_tree code I saw that mas_nomem()
> is called in mas_alloc_cyclic(). And mas_nomem() has a
> __must_hold(mas->tree->ma_lock) annotation. But then the code checks
> mt_external_lock() which is placed in a union with ma_lock iirc. So that
> annotation seems wrong?
>
> bool mas_nomem(struct ma_state *mas, gfp_t gfp)
> __must_hold(mas->tree->ma_lock)
> {
> if (likely(mas->node != MA_ERROR(-ENOMEM)))
> return false;
>
> if (gfpflags_allow_blocking(gfp) && !mt_external_lock(mas->tree)) {
> mtree_unlock(mas->tree);
> mas_alloc_nodes(mas, gfp);
> mtree_lock(mas->tree);
> } else {
> mas_alloc_nodes(mas, gfp);
> }
>
> if (!mas_allocated(mas))
> return false;
>
> mas->status = ma_start;
> return true;
> }
>
> If you want to look at this in context I would please ask you to pull:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs-6.14.pidfs
Sorry, I meant work.pidfs.maple_tree.
More information about the maple-tree
mailing list