On 3/24/16 11:20 AM, Al Viro wrote: > On Thu, Nov 05, 2015 at 11:03:58PM -0500, Jeff Mahoney wrote: > >> I suppose the irony here is that, AFAIK, that code is to ensure a file >> doesn't get lost between transactions due to rename. >> >> Isn't the file->f_path.dentry relationship stable otherwise, though? The >> name might change and the parent might change but the dentry that the >> file points to won't. > > Sure, it is stable. But that doesn't guarantee anything about the > ancestors. > > btrfs_log_inode_parent() is a mess. Look at that loop you've got there: > > while (1) { > if (!parent || d_really_is_negative(parent) || sb != d_inode(parent)->i_sb) > break; > > Really? NULL from dget_parent()? A negative from dget_parent()? Sodding > superblock changing from transition to parent? > > inode = d_inode(parent); > if (root != BTRFS_I(inode)->root) > break; > > Umm... Something like crossing a snapshot boundary? > > if (BTRFS_I(inode)->generation > last_committed) { > ret = btrfs_log_inode(trans, root, inode, > LOG_INODE_EXISTS, > 0, LLONG_MAX, ctx); > if (ret) > goto end_trans; > } > if (IS_ROOT(parent)) > break; > > parent = dget_parent(parent); > dput(old_parent); > old_parent = parent; > } > > Note that there's not a damn thing to guarantee that those directories have > anything to do with your file - race with rename(2) easily could've sent you > chasing the wrong chain of ancestors. Incidentally, what will happen if > you do that fsync() to an unlinked file? It still has ->d_parent pointing > to the what used to be its parent (quite possibly itself already rmdir'ed and > pinned down by that reference; inode is still busy, as if we'd done open and > rmdir). Is that what those checks are attempting to catch? And what happens > if rmdir happens between the check and btrfs_log_inode()? > > The thing is, playing with ancestors of opened file is very easy to get > wrong, regardless of overlayfs involvement. And this code is anything > but clear. I don't know btrfs guts enough to be able to tell how much > can actually break (as opposed to adding wrong inodes to transaction), > but I would really suggest taking a good look at what's going on there... Yeah, absolutely. The file_dentry() patch that you and Miklos worked out the other day fixes the fsync crashes for us when we use it in btrfs_file_sync but you're 100% correct in your opinion of this code. >> I did find a few other places where that assumption happens without any >> questionable traversals. Sure, all three are in file systems unlikely >> to be used with overlayfs. >> >> ocfs2_prepare_inode_for_write uses file->f_path.dentry for >> should_remove_suid (due to needing to do it early since cluster locking >> is unknown in setattr, according to the commit). Having >> should_remove_suid operate on an inode would solve that easily. > > Can't do - there are filesystems that _need_ dentry for ->setattr(). > > > Grr... Sorry, misread what you'd written. should_remove_suid() > ought to be switched to inode, indeed. Ok, I'll write that up. >> cifs_new_fileinfo keeps a reference to the dentry but it seems to be >> used mostly to access the inode except for the nasty-looking call to >> build_path_from_dentry in cifs_reopen_file, which I won't be touching. >> That does look like a questionable traversal, especially with the "we >> can't take the rename lock here" comment. > > cifs uses fs-root-relative pathnames in protocol; nothing to do there. > Where do you see that comment, BTW? It uses read_seqbegin/read_seqretry > on rename_lock there... I must've been looking at old code. I don't see it now either. -Jeff -- Jeff Mahoney
Attachment:
signature.asc
Description: OpenPGP digital signature