Re: kernel BUG when fsync'ing file in a overlayfs merged dir, located on btrfs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 11/5/15 10:18 PM, Al Viro wrote:
> On Thu, Nov 05, 2015 at 09:57:35PM -0500, Jeff Mahoney wrote:
> 
>> So now file_operations callbacks can't assume that file->f_path.dentry
>> belongs to the same file system that implements the callback.  More than
>> that, any code that could ultimately get a dentry that comes from an
>> open file can't trust that it's from the same file system.
> 
> Use file_inode() for inode.
> 
>> This crash is due to this issue.  Unlike xfs and ext2/3/4, we use
>> file->f_path.dentry->d_inode to resolve the inode.  Using file_inode()
>> is an easy enough fix here, but we run into trouble later.  We have
>> logic in the btrfs fsync() call path (check_parent_dirs_for_sync) that
>> walks back up the dentry chain examining the inode's last transaction
>> and last unlink transaction to determine whether a full transaction
>> commit is required.  This obviously doesn't work if we're walking the
>> overlayfs path instead.  Regardless of any argument over whether that's
>> doing the right thing, it's a pretty common pattern to assume that
>> file->f_path.dentry comes from the same file system when using a
>> file_operation.  Is it intended that that assumption is no longer valid?
> 
> It's actually rare, and your example is a perfect demonstration of the
> reasons why it is so rare.  What's to protect btrfs_log_dentry_safe()
> from racing with rename(2)?  Sure, you do dget_parent().  Which protects
> you from having one-time parent dentry freed under you.  What it doesn't
> do is making any promises about its relationship with your file.

I suppose the irony here is that, AFAIK, that code is to ensure a file
doesn't get lost between transactions due to rename.

Isn't the file->f_path.dentry relationship stable otherwise, though? The
name might change and the parent might change but the dentry that the
file points to won't.

I did find a few other places where that assumption happens without any
questionable traversals.  Sure, all three are in file systems unlikely
to be used with overlayfs.

ocfs2_prepare_inode_for_write uses file->f_path.dentry for
should_remove_suid (due to needing to do it early since cluster locking
is unknown in setattr, according to the commit).  Having
should_remove_suid operate on an inode would solve that easily.

fat_ioctl_set_attributes uses it to call fat_setattr, but that only uses
the inode and could have the inode_operation use a wrapper.

cifs_new_fileinfo keeps a reference to the dentry but it seems to be
used mostly to access the inode except for the nasty-looking call to
build_path_from_dentry in cifs_reopen_file, which I won't be touching.
That does look like a questionable traversal, especially with the "we
can't take the rename lock here" comment.

-Jeff

-- 
Jeff Mahoney

Attachment: signature.asc
Description: OpenPGP digital signature


[Index of Archives]     [Linux Filesystems Devel]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux