Re: [RFC] simplifying fast_dput(), dentry_kill() et.al.

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Mon, 30 Oct 2023 17:02:32 -1000

On Mon, 30 Oct 2023 at 16:25, Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:
>
> >
> > [ Looks around. Oh. Except we have lockref_put_return() in fs/erofs/
> > too, and that looks completely bogus, since it doesn't check the
> > return value! ]
>
>   74 struct erofs_workgroup *erofs_insert_workgroup(struct super_block *sb,
>   75                                                struct erofs_workgroup *grp)
>   76 {
>   77         struct erofs_sb_info *const sbi = EROFS_SB(sb);
>   78         struct erofs_workgroup *pre;
>   79
>   80         /*
>   81          * Bump up before making this visible to others for the XArray in order
>   82          * to avoid potential UAF without serialized by xa_lock.
>   83          */
>   84         lockref_get(&grp->lockref);
>   85
>   86 repeat:
>   87         xa_lock(&sbi->managed_pslots);
>   88         pre = __xa_cmpxchg(&sbi->managed_pslots, grp->index,
>   89                            NULL, grp, GFP_NOFS);
>   90         if (pre) {
>   91                 if (xa_is_err(pre)) {
>   92                         pre = ERR_PTR(xa_err(pre));
>   93                 } else if (!erofs_workgroup_get(pre)) {
>   94                         /* try to legitimize the current in-tree one */
>   95                         xa_unlock(&sbi->managed_pslots);
>   96                         cond_resched();
>   97                         goto repeat;
>   98                 }
>   99                 lockref_put_return(&grp->lockref);
>
> This line it just decreases the reference count just bumpped up at the
> line 84 (and it will always succeed).

You have two possible scenarios:

 - it doesn't always succeed, because somebody else has the lock on
the grp->lockref right now, or because lockref doesn't do any
optimized cases at all

 - nobody else can access grp->lockref at the same time, so the lock
is pointless, so you shouldn't be using lockref in the first place,
and certainly not lockref_put_return

IOW, I don't see how lockref_put_return() could possibly *ever* be the
right thing to do.

The thing is, lockref_put_return() is fundamentally designed to be
something that can fail.

In  fact, in some situations it will *always* fail. Check this out:

#define USE_CMPXCHG_LOCKREF \
        (IS_ENABLED(CONFIG_ARCH_USE_CMPXCHG_LOCKREF) && \
         IS_ENABLED(CONFIG_SMP) && SPINLOCK_SIZE <= 4)
...
#if USE_CMPXCHG_LOCKREF
...
#else

#define CMPXCHG_LOOP(CODE, SUCCESS) do { } while (0)

#endif
...
int lockref_put_return(struct lockref *lockref)
{
        CMPXCHG_LOOP(
                new.count--;
                if (old.count <= 0)
                        return -1;
        ,
                return new.count;
        );
        return -1;
}

look, if USE_CMPXCHG_LOCKREF is false (on UP, or if spinlock are big
because of spinlock debugging, or whatever), lockref_put_return() will
*always* fail, expecting the caller to deal with that failure.

So doing a lockref_put_return() without dealing with the failure case
is FUNDAMENTALLY BROKEN.

Yes, it's an odd function. It's a function that is literally designed
for that dcache use-case, where we have a fast-path and a slow path,
and the "lockref_put_return() fails" is the slow-path that needs to
take the spinlock and do it carefully.

You *cannot* use that function without failure handling. Really.

                     Linus