After mnt_hold_writers() has been called we will always have set MNT_WRITE_HOLD and consequently we always need to pair mnt_hold_writers() with mnt_unhold_writers(). After the recent cleanup in [1] where Al switched from a do-while to a for loop the cleanup currently fails to unset MNT_WRITE_HOLD for the first mount that was changed. Fix this and make sure that the first mount will be cleaned up and add some comments to make it more obvious. Reported-by: syzbot+10a16d1c43580983f6a2@xxxxxxxxxxxxxxxxxxxxxxxxx Reported-by: syzbot+306090cfa3294f0bbfb3@xxxxxxxxxxxxxxxxxxxxxxxxx Fixes: e257039f0fc7 ("mount_setattr(): clean the control flow and calling conventions") [1] Link: https://lore.kernel.org/lkml/0000000000007cc21d05dd0432b8@xxxxxxxxxx Link: https://lore.kernel.org/lkml/00000000000080e10e05dd043247@xxxxxxxxxx Cc: Hillf Danton <hdanton@xxxxxxxx> Cc: Christoph Hellwig <hch@xxxxxx> Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx> Signed-off-by: Christian Brauner (Microsoft) <brauner@xxxxxxxxxx> --- This should fix the syzbot issue. This is only relevant for making a mount or mount tree read-only: 1. successul recursive read-only mount tree change: Cleanup loop isn't executed. 2. failed recursive read-only mount tree change: m will point to the mount we failed to handle. The cleanup loop will run until p == m and then terminate. 3. successful single read-only mount change: Cleanup loop won't be executed. 4. failed single read-only mount change: m will point to mnt and the cleanup loop will terminate if p == m. I don't think there's any other weird corner cases since we now that MNT_WRITE_HOLD can only have been set by us as it requires lock_mount_hash() which we hold. So unconditionally unsetting it is fine. But please make sure to take a close look at the changed loop. --- fs/namespace.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/fs/namespace.c b/fs/namespace.c index a0a36bfa3aa0..afe2b64b14f1 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -4058,10 +4058,22 @@ static int mount_setattr_prepare(struct mount_kattr *kattr, struct mount *mnt) if (err) { struct mount *p; - for (p = mnt; p != m; p = next_mnt(p, mnt)) { + /* + * If we had to call mnt_hold_writers() MNT_WRITE_HOLD will + * be set in @mnt_flags. The loop unsets MNT_WRITE_HOLD for all + * mounts and needs to take care to include the first mount. + */ + for (p = mnt; p; p = next_mnt(p, mnt)) { /* If we had to hold writers unblock them. */ if (p->mnt.mnt_flags & MNT_WRITE_HOLD) mnt_unhold_writers(p); + + /* + * We're done once the first mount we changed got + * MNT_WRITE_HOLD unset. + */ + if (p == m) + break; } } return err; base-commit: b2d229d4ddb17db541098b83524d901257e93845 -- 2.32.0