On 2023/10/31 11:02, Linus Torvalds wrote:
On Mon, 30 Oct 2023 at 16:25, Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:
[ Looks around. Oh. Except we have lockref_put_return() in fs/erofs/
too, and that looks completely bogus, since it doesn't check the
return value! ]
74 struct erofs_workgroup *erofs_insert_workgroup(struct super_block *sb,
75 struct erofs_workgroup *grp)
76 {
77 struct erofs_sb_info *const sbi = EROFS_SB(sb);
78 struct erofs_workgroup *pre;
79
80 /*
81 * Bump up before making this visible to others for the XArray in order
82 * to avoid potential UAF without serialized by xa_lock.
83 */
84 lockref_get(&grp->lockref);
85
86 repeat:
87 xa_lock(&sbi->managed_pslots);
88 pre = __xa_cmpxchg(&sbi->managed_pslots, grp->index,
89 NULL, grp, GFP_NOFS);
90 if (pre) {
91 if (xa_is_err(pre)) {
92 pre = ERR_PTR(xa_err(pre));
93 } else if (!erofs_workgroup_get(pre)) {
94 /* try to legitimize the current in-tree one */
95 xa_unlock(&sbi->managed_pslots);
96 cond_resched();
97 goto repeat;
98 }
99 lockref_put_return(&grp->lockref);
This line it just decreases the reference count just bumpped up at the
line 84 (and it will always succeed).
You have two possible scenarios:
- it doesn't always succeed, because somebody else has the lock on
the grp->lockref right now, or because lockref doesn't do any
optimized cases at all
- nobody else can access grp->lockref at the same time, so the lock
is pointless, so you shouldn't be using lockref in the first place,
and certainly not lockref_put_return
Yeah, the second case is the real use case here.
IOW, I don't see how lockref_put_return() could possibly *ever* be the
right thing to do.
The thing is, lockref_put_return() is fundamentally designed to be
something that can fail.
In fact, in some situations it will *always* fail. Check this out:
#define USE_CMPXCHG_LOCKREF \
(IS_ENABLED(CONFIG_ARCH_USE_CMPXCHG_LOCKREF) && \
IS_ENABLED(CONFIG_SMP) && SPINLOCK_SIZE <= 4)
...
#if USE_CMPXCHG_LOCKREF
...
#else
#define CMPXCHG_LOOP(CODE, SUCCESS) do { } while (0)
#endif
...
int lockref_put_return(struct lockref *lockref)
{
CMPXCHG_LOOP(
new.count--;
if (old.count <= 0)
return -1;
,
return new.count;
);
return -1;
}
look, if USE_CMPXCHG_LOCKREF is false (on UP, or if spinlock are big
because of spinlock debugging, or whatever), lockref_put_return() will
*always* fail, expecting the caller to deal with that failure.
So doing a lockref_put_return() without dealing with the failure case
is FUNDAMENTALLY BROKEN.
Yeah, thanks for point out, I get it. I think this really needs to be
cleaned up since we don't actually care about locking here (since as I
said it doesn't actually populate to XArray).
Yes, it's an odd function. It's a function that is literally designed
for that dcache use-case, where we have a fast-path and a slow path,
and the "lockref_put_return() fails" is the slow-path that needs to
take the spinlock and do it carefully.
You *cannot* use that function without failure handling. Really.
I will fix+cleanup this path later and send upstream. Thanks for the
heads up.
Thanks,
Gao Xiang
Linus