On 11/08/2016 06:30 AM, Shyam wrote:
> On 11/08/2016 08:10 AM, Ravishankar N wrote:
>> So there is a class of bugs* exposed in replicate volumes where, if
>> the only good copy of the file is down, we still end up serving stale
>> data to the application because of caching in various layers outside
>> gluster. In fuse, this can be mitigated by setting attribute-timeout
>> and entry-timeout to zero so that the actual FOP (stat, read, write,
>> etc.) reaches AFR, which will then fail it with EIO. But this does
>> not work for NFS-based clients.
>>
>> 1) Is there a way by which we can make the 'lookup' FOP in gluster do
>> just that, i.e. tell whether the entry exists or not, and *not serve*
>> any other (stat) information except the gfid?
>
> To my reading of NFSv3 this seems possible, and here is what I have
> gathered:
>
> The LOOKUP RPC returns post_op_attr for the object being looked up,
> and this can contain no attribute information. See [3] and [4] below.
> LOOKUP also returns 'nfs_fh3 object', which is where the GFID-related
> information sits.
>
> The Gluster NFS code seems to adhere to this, as seen in [5].
>
> The NFSv3 RFC does not encourage this (reading [4], the text "This
> appears to make returning attributes optional. However, ..." states as
> much), but it is not prevented.
>
> So overall, with NFSv3 this seems to be doable. We possibly need data
> from the Ganesha implementation and NFSv4-related reading on this.

In NFSv4, operations are performed using COMPOUND. There is a GETATTR
compound op which can fail separately from the LOOKUP op. READDIR can
also return attributes for each entry and is able to fail to return
attributes for some entries (as long as ATTR4_RDATTR_ERR is one of the
requested attributes). I know all of this works at least for directory
entries, and the client handles it.

You can see what happens if attributes are not available by having an
export that requires a security flavor (such as krb5) that is not
available to the mount. If you mount the pseudofs and then ls -l the
pseudofs directory the exports are in, you will see some listings with
? for the attributes, for example:

[root@localhost src]# mount /mnt4
[root@localhost src]# ls -l /mnt4
ls: cannot access /mnt4/test4: Operation not permitted
ls: cannot access /mnt4/aaa: Operation not permitted
total 28
??????????? ?  ?    ?       ?            ? aaa
drwxr-xr-x.  3 root root 4096 Jun  3  2015 default
drwxr-xr-x.  4 root root 4096 Apr 13  2016 exp1
drwxr-xr-x.  3 root root 4096 Apr 13  2016 exp4
dr-xr-xr-x.  2 root root 4096 Sep 16  2015 none
drwxr-xr-x.  2 root root 4096 Sep 16  2015 sys
drwxrwxrwx. 11 root root 4096 Nov  7 10:15 test1
drwxr-xr-x.  3 root root 4096 Sep 14 13:43 test2
??????????? ?  ?    ?       ?            ? test4
drwxr-xr-x.  3 root root   29 Apr 17  2014 xfs1

aaa and test4 are exports that require krb5 (my client is not set up
with krb5).

The only issue is that the FSAL readdir callback API MAY not actually
have a way to indicate failure of an attribute fetch... Ganesha does
presume, on initial caching of an object, that the attribute fetch was
successful, since it needs to know a few attributes to even instantiate
an inode.

Frank

>>
>> 2) If that is not possible, is it okay for AFR to fail lookups with
>> EIO when client-quorum is met and there is no source available? The
>> downside is that if we fail lookups with EIO, even unlink cannot be
>> served. (Think of a user who doesn't want to resolve a file in
>> split-brain, but rather delete it.)
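For reference on the NFSv3 point above, a minimal C sketch follows. It
paraphrases the XDR from RFC 1813 (sections 2.6 and 3.3.3) rather than
quoting the Gluster sources, and the fields shown are abbreviated; the
point is that the LOOKUP reply always carries the file handle, while
obj_attributes is a discriminated union the server may legally leave
empty by setting attributes_follow to FALSE:

/* Sketch only (not Gluster source): the NFSv3 LOOKUP success reply
 * per RFC 1813, written as C structs mirroring the XDR. */

#include <stdint.h>

#define NFS3_FHSIZE 64

typedef struct {
    uint32_t      len;
    unsigned char data[NFS3_FHSIZE];  /* handle; gluster puts the GFID here */
} nfs_fh3;

typedef struct {
    uint32_t type;
    uint32_t mode;
    uint64_t size;
    /* ... uid, gid, times, etc. abbreviated ... */
} fattr3;

/* post_op_attr is a switched union in XDR: when attributes_follow is
 * FALSE no fattr3 is sent at all, which is what makes a "lookup that
 * returns no stat data" legal at the protocol level. */
typedef struct {
    uint32_t attributes_follow;   /* XDR bool */
    fattr3   attributes;          /* valid only when attributes_follow is TRUE */
} post_op_attr;

typedef struct {
    nfs_fh3      object;          /* always returned on success */
    post_op_attr obj_attributes;  /* optional */
    post_op_attr dir_attributes;  /* optional */
} LOOKUP3resok;

Gluster NFS builds this reply in nfs3-helpers.c (see [5] below), which
is why omitting the attributes looks feasible on that path.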
>>
>> Thanks,
>> Ravi
>>
>> *bugs:
>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1356974
>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1224709
>
> [3] NFSv3 RFC definition of LOOKUP:
> https://tools.ietf.org/html/rfc1813#section-3.3.3
> [4] NFSv3 definition of post_op_attr:
> https://tools.ietf.org/html/rfc1813#section-2.6 (search for
> post_op_attr after reaching this section)
> [5] Gluster NFS code pointer:
> https://github.com/gluster/glusterfs/blob/master/xlators/nfs/server/src/nfs3-helpers.c#L365
> and more specifically,
> https://github.com/gluster/glusterfs/blob/master/xlators/nfs/server/src/nfs3-helpers.c#L376

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel