On Fri, Feb 14, 2014 at 11:38:25AM -0500, Josef Bacik wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > > > On 02/14/2014 11:14 AM, J. Bruce Fields wrote: > > On Fri, Feb 14, 2014 at 07:49:35AM -0800, Eric W. Biederman wrote: > >> "J. Bruce Fields" <bfields@xxxxxxxxxxxx> writes: > >> > >>> On Thu, Feb 13, 2014 at 08:25:43PM -0800, Eric W. Biederman > >>> wrote: > >>>> "J. Bruce Fields" <bfields@xxxxxxxxxxxx> writes: > >>>> > >>>>> On Thu, Feb 13, 2014 at 03:45:16PM -0800, Eric W. Biederman > >>>>> wrote: > >>>>>> "J. Bruce Fields" <bfields@xxxxxxxxxxxx> writes: > >>>>>> > >>>>>>> Yesterday you passed on a report of this printk from > >>>>>>> nfsdfh.c firing: > >>>>>>> > >>>>>>> printk("nfsd: find_fh_dentry returned a DISCONNECTED > >>>>>>> directory: %pd2\n", dentry); > >>>>>>> > >>>>>>> I think the dentry probably comes from the FILEID_ROOT > >>>>>>> case of: > >>>>>>> > >>>>>>> if (fileid_type == FILEID_ROOT) dentry = > >>>>>>> dget(exp->ex_path.dentry); else { dentry = > >>>>>>> exportfs_decode_fh(exp->ex_path.mnt, fid, data_left, > >>>>>>> fileid_type, nfsd_acceptable, exp); } > >>>>>>> > >>>>>>> In that case the dentry was found using ordinary > >>>>>>> filesystem lookups, so doesn't go through the same > >>>>>>> DISCONNECTED-clearing logic as in the case of lookups > >>>>>>> by filehandle. > >>>>>>> > >>>>>>> Probably they have an export root that's not a > >>>>>>> filesystem root, and the lookups happened in the right > >>>>>>> order? > >>>>>>> > >>>>>>> I suspect that's fine, and that the printk is just > >>>>>>> stupid, but maybe we should clear DISCONNECTED when > >>>>>>> possible on normal lookups. The following is my > >>>>>>> attempt, though I'm not sure if d_alloc is the right > >>>>>>> place to do this. In any case it might help confirm > >>>>>>> this is what's happening. > >>>>>>> > >>>>>>> So if you pass along this patch to the person who was > >>>>>>> seeing that printk I'd be interested in the results. > >>>>>> > >>>>>> I have been reading through the dentry code for other > >>>>>> reasons and your patch definitely won't change anything. > >>>>>> __d_alloc sets d_flags = 0. Therefore d_alloc always > >>>>>> returns with d_flags == 0. > >>>>> > >>>>> You're right, of course. I wasn't thinking straight. > >>>>> > >>>>> So the only dentries with DISCONNECTED set are those > >>>>> created with d_obtain_alias, which is normally only used > >>>>> when you're looking up by filehandle. > >>>>> > >>>>> Except btrfs has a weird use in get_default_root(). So > >>>>> maybe they were running into the dentry that created? > >>>>> > >>>>> So btrfs should probably be using something else, I'm not > >>>>> sure what. > >>>> > >>>> The nfs client also has the case where it uses DISCONNECTED > >>>> dentries for directories that are not root on the server. > >>>> Which seems very similiar to the btrfs case. > >>> > >>> I don't think there's any reason for those to be flagged > >>> DISCONNECTED either. > >> > >> The only practical difference between the two cases is how > >> quickly it is desirable to connect the entries. > >> > >> The disconnected dentries processed by exportfs are dentries that > >> we want to connect immediately, and it is an error/problem to > >> have the disconnected after processing. > >> > >> The dentries that are the roots of file systems we want to > >> connect them if we get the chance with d_materialise_unique but > >> we don't care if they go long periods without being connected. > >> > >> I believe we want both groups of dentries on the s_anon list so > >> that if they remain disconnected when the filesystem is unmounted > >> we can find them and deal with them. > > > > Note it's IS_ROOT(), not DCACHE_DISCONNECTED, that determines > > whether a hashed dentry is on s_anon or not. (See > > 7632e465feb182cadc3c9aa1282a057201818a8c for more detailed > > discussion.) > > > >> I can see distinguishing between dentries that are supposed to > >> be disconnected for a short time, and dentries that are supposed > >> to be disconnected indefinitely but we currently (as of 3.14-rc1) > >> don't have that distinction. > > > > I believe we do: DCACHE_DISCONNECTED is for the former, and the > > latter should be IS_ROOT(), !DCACHE_DISCONNECTED. > > > > DCACHE_DISCONNECTED was intended to be used only for dentries > > created while performing lookup-by-filehandle which have not yet > > been confirmed to be linked back to filesystem root by a chain of > > ->d_parent pointers. > > > >> But a blanket statement that the long term disconnected dentries > >> are doing it wrong seems off base. > >> > >> If those dentires can tolerate not being on the s_anon list > >> d_alloc_pseudo or d_make_root looks like it will serve just as > >> well > > > > The difference is that d_alloc_pseudo and d_make_root > > unconditionally create new dentries, whereas d_materialise_unique > > lets us reuse an existing dentry. > > > > (In the NFS client case, I believe this happens for example when > > you mount export A from a server, then export B from the same > > server, and then one day you look up A/foo and find out that it's > > the same directory as B's root. You don't want to duplicate the > > inode or give it multiple dentries, so you instead reuse the > > existing IS_ROOT dentry for B to represent A/foo. > > > > In the btrfs case I guess it's a similar situation but with two > > subvolumes instead of two exports?) > > > >> from the perspective of d_materialise_unique, but that leaves me > >> with the queasy feeling that we will leak dentries and inodes > >> when we unmount the filesystems in question, if those dentries > >> have never been attached. > > > > In the NFS server (lookup-by-filehandle) case, I believe dentries > > in the process of being reconnected are all children of some > > IS_ROOT dentry which is on the s_anon list, so everyone's accounted > > for. > > > > > > Thanks for looking at this as I've found myself easily confused by > > it all.... (And judging by some of the code in the tree I'm not > > alone.) > > > > Ok using d_make_root makes us leak inodes on umount (I don't know why, > I'm not drunk enough yet.) Oh, well, a little early in the morning, but that's a solveable problem. > I haven't tried it yet but I'm almost 100% > positive if I just clear out DISCONNECTED we will BUG_ON() in > d_splice_alias() when we go to walk into the subvol we've mounted with > the default subvol option. I'd expect it not to BUG_ON(), but instead to fall into the !new case and create a second alias for that directory. Which probably actually causes the same problems d_make_root does. > So what I think I need to do is use use > d_materialise_unique in our lookup function instead of > d_splice_alias() and keep using d_obtain_alias() in get_default_root() > and just clear DISOCONNECTED on success. Does this sound crazy to > anybody else? Thanks, Probably not crazy. --b. > > Josef > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1 > Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ > > iQIcBAEBAgAGBQJS/kaBAAoJEANb+wAKly3BXp8P/RiWEPYsJXwmD9N2JamQZawa > p5Fh507ZX3PRaOAZ6zgWB2uN4xRQtqFpsIjoZl6c2NoxnR4L89d4x/ExBF39WE2t > PmK3Ot/KJ6GXBVHSyexWJMix8R8u40eKHJcyywqj35HhHML47Ll1fy6k2vmRno5q > FVP+E0kLiBOH0W2ae8o7kPImLp2Ue7moDvExlqKTI3jeRPQI5u+/lH4uj8oVQXx4 > /HNoZ7+htzi0Eb+mA+ve0k2/efiUlPloerE0oFUaCrrOF0HMLAVoXxhiDR7eDM96 > nZ1LHZ422DOBRCBQ6kpDo7iAebQyRKx6rdkdr/n5/5Bs0iTevg3bWy8WEbMP+NvW > OR+mQ4r3X8J7t91Dp15OJuiEPhyH6lb2McPcxq/ozRDcGR0enZ1iYGl/GNw02eaw > 1/JS4+nwokLCcEvTiW16n8sLGv+iU4fXawisjdKs76KQGO+rzdOd2msYjnOF8ZgT > YTO4k689qlWJu6TtY8iC0fRStIksTAMesMmCoCBl+2zkGsHMS9tMQt3kzIF91UP4 > WTuZOBxdqByKmQiWy0INKhYXOSVxAUs27JuaHPIBUz65fDDu80IW+2Ih1sro73y/ > +t9+7vSueviUub1azUocEdYxuc8mDQ7totpzrJBYHCF7C1T2DZV5nEpGUHD/d19R > 3QwsV2oPiu230jh+nJR2 > =VIUq > -----END PGP SIGNATURE----- -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html