On 11/5/15 10:18 PM, Al Viro wrote: > On Thu, Nov 05, 2015 at 09:57:35PM -0500, Jeff Mahoney wrote: > >> So now file_operations callbacks can't assume that file->f_path.dentry >> belongs to the same file system that implements the callback. More than >> that, any code that could ultimately get a dentry that comes from an >> open file can't trust that it's from the same file system. > > Use file_inode() for inode. > >> This crash is due to this issue. Unlike xfs and ext2/3/4, we use >> file->f_path.dentry->d_inode to resolve the inode. Using file_inode() >> is an easy enough fix here, but we run into trouble later. We have >> logic in the btrfs fsync() call path (check_parent_dirs_for_sync) that >> walks back up the dentry chain examining the inode's last transaction >> and last unlink transaction to determine whether a full transaction >> commit is required. This obviously doesn't work if we're walking the >> overlayfs path instead. Regardless of any argument over whether that's >> doing the right thing, it's a pretty common pattern to assume that >> file->f_path.dentry comes from the same file system when using a >> file_operation. Is it intended that that assumption is no longer valid? > > It's actually rare, and your example is a perfect demonstration of the > reasons why it is so rare. What's to protect btrfs_log_dentry_safe() > from racing with rename(2)? Sure, you do dget_parent(). Which protects > you from having one-time parent dentry freed under you. What it doesn't > do is making any promises about its relationship with your file. I suppose the irony here is that, AFAIK, that code is to ensure a file doesn't get lost between transactions due to rename. Isn't the file->f_path.dentry relationship stable otherwise, though? The name might change and the parent might change but the dentry that the file points to won't. I did find a few other places where that assumption happens without any questionable traversals. Sure, all three are in file systems unlikely to be used with overlayfs. ocfs2_prepare_inode_for_write uses file->f_path.dentry for should_remove_suid (due to needing to do it early since cluster locking is unknown in setattr, according to the commit). Having should_remove_suid operate on an inode would solve that easily. fat_ioctl_set_attributes uses it to call fat_setattr, but that only uses the inode and could have the inode_operation use a wrapper. cifs_new_fileinfo keeps a reference to the dentry but it seems to be used mostly to access the inode except for the nasty-looking call to build_path_from_dentry in cifs_reopen_file, which I won't be touching. That does look like a questionable traversal, especially with the "we can't take the rename lock here" comment. -Jeff -- Jeff Mahoney
Attachment:
signature.asc
Description: OpenPGP digital signature