On Wed, 31 Jul 2013 20:32:24 +0000 "Myklebust, Trond" <Trond.Myklebust@xxxxxxxxxx> wrote: > On Wed, 2013-07-31 at 15:26 -0500, Quentin Barnes wrote: > > On Wed, Jul 31, 2013 at 1:46 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote: > > > On Wed, 31 Jul 2013 13:36:21 -0500 > > > Quentin Barnes <qbarnes@xxxxxxxxx> wrote: > > > > > >> I was reviewing patches I was integrating into my NFS source base > > >> and this one popped up (commit f6488c9b). In reviewing it, it would > > >> fix the problem, however, I think it's not the best fix for it. > > >> Catching it in nfs_find_actor() is after the horse has left the > > >> barn so to speak. > > >> > > >> I think the problem is in nfs_fhget() when constructing the new > > >> inode (inside the if (inode->i_state & I_NEW) {...}) and should be > > >> addressed there. > > >> > > >> The nfs module already has checks to ensure that the expression > > >> "((S_IFMT & inode->i_mode) == (S_IFMT & fattr->mode))" is true > > >> in nfs_update_inode() and nfs_check_inode_attributes(). I think > > >> problem Benny hit was that the equivalent check is missing in > > >> nfs_fhget() when constructing a new inode. The same check that's > > >> in those other two functions needs to be added to nfs_fhget()'s "if > > >> (inode->i_state & I_NEW) {...}" conditional block to prevent the > > >> problem of an inconsistent fattr/inode state from occurring in the > > >> first place. > > >> > > >> I can come up with a patch if you'd like, but I wanted to make > > >> sure my assertions were valid first. Is removing a check from > > >> nfs_find_actor() and adding the check to nfs_fhget() to prevent the > > >> inconsistency in the first place a better approach? > > >> > > >> Quentin > > > > > > I don't think I agree. The whole problem was that we were allowing > > > iget5_locked to match an existing inode even though (S_IFMT & mode) was > > > different. > > > > I assert that when an inconsistent state happens you don't want > > nfs_find_actor() to be the one responsible for noticing it > > (actually, ignoring the problem). You want the existing consistency > > and error recovery paths to handle it instead. > > > > So one or two things are going on. The existing error recovery > > paths are inadequate or that there's a hole in the fattr/inode > > consistency checking (or possibly both). Your patch covers over > > this underlying bug rather than addressing it and which also lets > > the old inode not get flagged as invalid. I would rather see the > > real problem fixed and the old inode get flagged rather than just > > ignoring it and creating another new inode. > > > > Before your patch, if because of a server bug reusing a file > > handle, nfs_find_actor(), would match and iget5_locked() would > > return an inode that matched on its file handle but that didn't > > necessarily match the fattr info. > > > > If that inode returned by iget5_locked() was not I_NEW (the else > > case), nfs_fhget() would invoke nfs_refresh_inode(), which would > > call nfs_refresh_inode_locked(). It calls either nfs_update_inode() > > or nfs_check_inode_attributes(). Both of those functions already > > check for fattr/inode S_IFMT consistency catching bad states. > > > > > Why would it be preferable to use the same struct inode even though > > > it's clearly a different one on the server? inodes generally don't > > > change their type after they've been created. > > > > It's not preferable to have the same inode in a different state. > > What you don't want is nfs_find_actor() as a side-effect to let the > > problem go unnoticed. You want let the normal, existing error > > handling paths deal with it, or if they're not fully dealing with > > it, fix them. > > > > Since the I_NEW conditional false path (the else case) in nfs_fhget() > > should catch the problem, I had thought that the problem must lay > > with the I_NEW conditional true path. Now I don't think so. > > > > I see now that nfs_check_inode_attributes() is just returning an > > error when the fattr and inode states are found to be inconsistent, > > but is taking no action on the bad inode's state (i.e. flagging it > > as invalid). And upon return, nfs_fhget() is not doing anything with > > the bad return state passed up to nfs_refresh_inode(). Now I'm > > thinking that this is problematic hole because it is ignoring the > > error state and that let Benny's panic happen. > > > > It doesn't seem to me that it should be nfs_check_inode_attributes() > > job to flag the inode as invalid. That seems outside its role. I > > would think it should be nfs_refresh_inode_locked()'s job, but I'm > > still reviewing the code. > > > > Thoughts? > > Quite frankly, all I care about in a situation like this is that the > client doesn't Oops. > > If a server is really this utterly broken, then there is no way we can > fix it on the client, and we're not even going to try. > Agreed. For the record, this problem came about due to a bug in the server against which Benny was testing. It's not a situation that should ever occur under normal conditions. -- Jeff Layton <jlayton@xxxxxxxxxx> -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html