----- Original Message ----- > From: "Soumya Koduri" <skoduri@xxxxxxxxxx> > To: "Mohammed Rafi K C" <rkavunga@xxxxxxxxxx>, "Raghavendra Gowdappa" <rgowdapp@xxxxxxxxxx>, "Niels de Vos" > <ndevos@xxxxxxxxxx>, "Raghavendra Talur" <rtalur@xxxxxxxxxx>, "Poornima Gurusiddaiah" <pgurusid@xxxxxxxxxx> > Cc: "+rhs-zteam" <rhs-zteam@xxxxxxxxxx>, "Rajesh Joseph" <rjoseph@xxxxxxxxxx>, "jtho >> Jiffin Thottan" > <jthottan@xxxxxxxxxx> > Sent: Wednesday, May 11, 2016 3:55:05 PM > Subject: Re: gfapi, readdirplus and forced lookup after inode_link > > > > On 05/11/2016 12:41 PM, Mohammed Rafi K C wrote: > > > > > > On 05/11/2016 12:28 PM, Soumya Koduri wrote: > >> Hi Raghavendra, > >> > >> > >> > >> On 05/11/2016 12:01 PM, Raghavendra Gowdappa wrote: > >>> Hi all, > >>> > >>> There are certain code-paths where the layers managing inodes (gfapi, > >>> fuse, nfsv3 etc) need to do a lookup even though the inode is found > >>> in inode-table. readdirplus is one such codepath (but not only one). > >>> The reason for doing this is that > >>> 1. not all xlators have enough information in readdirp_cbk to make > >>> inode usable (for eg., dht cannot build layout for directory inodes). > >>> 2. There are operations (like dht directory self-healing) which are > >>> needed for maintaining internal consistency and these operations > >>> cannot be done in readdirp. > >>> > >>> This forcing of lookup on a linked inode is normally achieved in two > >>> ways: > >>> 1. lower layers (like dht) setting entry->inode to NULL (without > >>> entry->inode, interface layers cannot link the inode). > >> > >> Rafi (CC'ed) had made changes to fix readdirp specific issue (required > >> for tiered volumes) as part of http://review.gluster.org/#/c/14109/ to > >> do explicit lookup if either entry->inode is set to NULL or inode_ctx > >> is NULL in gfapi. And I think he had made similar changes for > >> gluster-NFS as well to provide support for tiered volumes. I am not > >> sure if it is handled in common resolver code-path. Have to look at > >> the code. Rafi shall be able to confirm it. > > > > The changes I made in the three access layers are for inodes which was > > linked from lower layers. Which means the inodes linked from lower layer > > won't have inode ctx set in upper xlators, ie, during resolving we will > > send explicit lookup. > > > > With this changes during resolve if inode_ctx is not set then it will > > send a lookup + if set_need_lookup flag is set in inode_ctx, then also > > we will send a lookup > > > > As Du mentioned, readdirp set need_lookup everytime for entries in > > readdirp, I saw that code in fuse, and gfapi. But I don't remember such > > code in gNFS. > > There are checks for "entry->inode == NULL" in gNFS case as well. Looks > like it was Jiffin who made those changes (again wrt to tiered volumes) > - http://review.gluster.org/#/c/12960/ > > But all these checks seem to be in only readdirp_cbk codepath where > directory entries are filled. What are other fops which need such > special handling? There are some codepaths, where linking is done by xlators who don't do resolution. A rough search shows following components: 1. quota enforcer 2. bitrot 3. dht/tier (needed, but currently not doing). 4. trash (for .trash I suppose) However, none of these are explicitly setting need_lookup. So, there are windows of time where lookup is partially complete in an xlator graph, but other fops start using them. I am currently working on a fix to solve the issue for dht/tier on fuse. We have to do similar work on other xlators/interface layers too. > > Thanks, > Soumya > > > > > > Regards > > Rafi KC > > > >> > >> > >> Thanks, > >> Soumya > >> > >>> 2. interface layers (at least fuse) setting a flag in inode to let > >>> resolver know that a lookup is to be done before resuming the fop. > >>> > >>> I am sure that fuse-bridge does this correctly. Need inputs from you > >>> about the behavior of other interface layers like gfapi, nfsv3 etc. > >>> > >>> regards, > >>> Raghavendra > >>> > > > _______________________________________________ Gluster-devel mailing list Gluster-devel@xxxxxxxxxxx http://www.gluster.org/mailman/listinfo/gluster-devel