On Mon, 30 Oct 2023 at 16:25, Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote: > > > > > [ Looks around. Oh. Except we have lockref_put_return() in fs/erofs/ > > too, and that looks completely bogus, since it doesn't check the > > return value! ] > > 74 struct erofs_workgroup *erofs_insert_workgroup(struct super_block *sb, > 75 struct erofs_workgroup *grp) > 76 { > 77 struct erofs_sb_info *const sbi = EROFS_SB(sb); > 78 struct erofs_workgroup *pre; > 79 > 80 /* > 81 * Bump up before making this visible to others for the XArray in order > 82 * to avoid potential UAF without serialized by xa_lock. > 83 */ > 84 lockref_get(&grp->lockref); > 85 > 86 repeat: > 87 xa_lock(&sbi->managed_pslots); > 88 pre = __xa_cmpxchg(&sbi->managed_pslots, grp->index, > 89 NULL, grp, GFP_NOFS); > 90 if (pre) { > 91 if (xa_is_err(pre)) { > 92 pre = ERR_PTR(xa_err(pre)); > 93 } else if (!erofs_workgroup_get(pre)) { > 94 /* try to legitimize the current in-tree one */ > 95 xa_unlock(&sbi->managed_pslots); > 96 cond_resched(); > 97 goto repeat; > 98 } > 99 lockref_put_return(&grp->lockref); > > This line it just decreases the reference count just bumpped up at the > line 84 (and it will always succeed). You have two possible scenarios: - it doesn't always succeed, because somebody else has the lock on the grp->lockref right now, or because lockref doesn't do any optimized cases at all - nobody else can access grp->lockref at the same time, so the lock is pointless, so you shouldn't be using lockref in the first place, and certainly not lockref_put_return IOW, I don't see how lockref_put_return() could possibly *ever* be the right thing to do. The thing is, lockref_put_return() is fundamentally designed to be something that can fail. In fact, in some situations it will *always* fail. Check this out: #define USE_CMPXCHG_LOCKREF \ (IS_ENABLED(CONFIG_ARCH_USE_CMPXCHG_LOCKREF) && \ IS_ENABLED(CONFIG_SMP) && SPINLOCK_SIZE <= 4) ... #if USE_CMPXCHG_LOCKREF ... #else #define CMPXCHG_LOOP(CODE, SUCCESS) do { } while (0) #endif ... int lockref_put_return(struct lockref *lockref) { CMPXCHG_LOOP( new.count--; if (old.count <= 0) return -1; , return new.count; ); return -1; } look, if USE_CMPXCHG_LOCKREF is false (on UP, or if spinlock are big because of spinlock debugging, or whatever), lockref_put_return() will *always* fail, expecting the caller to deal with that failure. So doing a lockref_put_return() without dealing with the failure case is FUNDAMENTALLY BROKEN. Yes, it's an odd function. It's a function that is literally designed for that dcache use-case, where we have a fast-path and a slow path, and the "lockref_put_return() fails" is the slow-path that needs to take the spinlock and do it carefully. You *cannot* use that function without failure handling. Really. Linus