Re: [ksmbd] racy uses of ->d_parent and ->d_name

Namjae Jeon <linkinjeon@xxxxxxxxxx> · Thu, 3 Feb 2022 08:16:21 +0900



2022-01-31 10:57 GMT+09:00, Al Viro <viro@xxxxxxxxxxxxxxxxxx>:
> 	Folks, ->d_name and ->d_parent are *NOT* stable unless the
> appropriate locks are held.  In particular, locking a directory that
> might not be our parent is obviously not going to prevent anything.
> Even if it had been our parent at some earlier point.
Hi Al,

First, Thanks for pointing that out!
>
> 	->d_lock would suffice, but it can't be held over blocking
> operation and it can't be held over dcache lookups anyway (instant
> deadlocks).  IOW, the following is racy:
>
> int ksmbd_vfs_lock_parent(struct user_namespace *user_ns, struct dentry
> *parent,
>                           struct dentry *child)
> {
>         struct dentry *dentry;
>         int ret = 0;
>
>         inode_lock_nested(d_inode(parent), I_MUTEX_PARENT);
>         dentry = lookup_one(user_ns, child->d_name.name, parent,
>                             child->d_name.len);
>         if (IS_ERR(dentry)) {
>                 ret = PTR_ERR(dentry);
>                 goto out_err;
>         }
>
>         if (dentry != child) {
>                 ret = -ESTALE;
>                 dput(dentry);
>                 goto out_err;
>         }
>
>         dput(dentry);
>         return 0;
> out_err:
>         inode_unlock(d_inode(parent));
>         return ret;
> }
>
>
> 	Some of that might be fixable - verifying that ->d_parent points
> to parent immediately after inode_lock would stabilize ->d_name in case
> of match.  However, a quick look through the callers shows e.g. this:
>                 write_lock(&ci->m_lock);
>                 if (ci->m_flags & (S_DEL_ON_CLS | S_DEL_PENDING)) {
>                         dentry = filp->f_path.dentry;
>                         dir = dentry->d_parent;
>                         ci->m_flags &= ~(S_DEL_ON_CLS | S_DEL_PENDING);
>                         write_unlock(&ci->m_lock);
>                         ksmbd_vfs_unlink(file_mnt_user_ns(filp), dir,
> dentry);
>                         write_lock(&ci->m_lock);
>                 }
>                 write_unlock(&ci->m_lock);
>
> 	What's to keep dir from getting freed right under us, just as
> ksmbd_vfs_lock_parent() (from ksmbd_vfs_unlink()) tries to grab ->i_rwsem
> on its inode?
Right. We need to get parent using dget_parent().
>
> 	Have the file moved to other directory and apply memory pressure.
> What's to prevent dir from being evicted, its memory recycled, etc.?
Let me check it.
>
> 	For another fun example, consider e.g. smb2_rename():
>                 if (file_present &&
>                     strncmp(old_name, path.dentry->d_name.name,
> strlen(old_name))) {
>                         rc = -EEXIST;
>                         ksmbd_debug(SMB,
>                                     "cannot rename already existing
> file\n");
>                         goto out;
>                 }
>
> Suppose path.dentry has a name longer than 32 bytes (i.e. too large to
> fit into ->d_iname and thus allocated separately).  At this point you
> are not holding any locks (otherwise ksmbd_vfs_fp_rename() immediately
> downstream would deadlock).  So what's to prevent rename(2) on host
> ending up with path.dentry getting renamed and old name getting freed?
>
> 	More of the same: ksmbd_vfs_fp_rename().  In this one
> dget_parent() will at least avoid parent getting freed.  It won't do
> a damn thing to stabilize src_dent->d_name after lock_rename(),
> though, since we are not guaranteed that the thing we locked is
> still the parent...
Okay, I will check it.
>
> 	Why is so much tied to "open, then figure out where it is" model?
> Is it a legacy of userland implementation, or a network fs protocol that
> manages to outsuck NFS, or...?
It need to use absolute based path given from request.

Thanks!
>