Pavan T C <tcp@xxxxxxxxxxx> wrote:

> When lookup is sent, the details of the ondisk file (particularly gfid)
> is fetched. In the lookup callback, a comparison with this gfid is made
> with one in the local inode cache. If this inode is found in the local
> cache and the local gfid differs from the one obtained from the disk,
> ESTALE is returned to FUSE. FUSE will consider this to be a revalidate
> case, and should send a revalidate lookup to update it's cache, and
> should *not* pass this error back to VFS. If the error is passed back to
> VFS, the application can report "Stale NFS file handle".

The problem here is not with LOOKUP, but with READDIR. I thought that
there was some race condition and that the offending files were changing,
but this is not the case: it always happens on the same files, and I see
no bug when they are found by LOOKUP. Only READDIR can trigger the problem.

Discovering Makefile.am by READDIR gets ESTALE:

client# umount /gfs && mount /gfs
client# cd /gfs/usr/src/gnu/dist/binutils/bfd/ && ls -l
ls: ChangeLog-9193: Stale NFS file handle
ls: ChangeLog-9697: Stale NFS file handle
ls: Makefile.am: Stale NFS file handle
(...)
-rw-r--r--  1 root  wheel   18009 Nov 26  2003 COPYING
drwxr-xr-x  2 root  wheel    1024 Nov  6  2010 CVS
-rw-r--r--  1 root  wheel  256777 Feb  2  2006 ChangeLog
-rw-r--r--  1 root  wheel  350400 Nov 26  2003 ChangeLog-0001
-rw-r--r--  1 root  wheel  442601 Dec  8  2004 ChangeLog-0203
-rw-r--r--  1 root  wheel  411353 Nov 26  2003 ChangeLog-9495
-rw-r--r--  1 root  wheel  206494 Nov 26  2003 ChangeLog-9899
(...)

Discovering Makefile.am by LOOKUP is fine:

client# umount /gfs && mount /gfs
client# cd /gfs/usr/src/gnu/dist/binutils/bfd/ && ls -l Makefile.am
-rw-r--r--  1 root  wheel  66102 Feb  2  2006 Makefile.am

On the backend, I can tell the files that trigger the bug from the
others: the bad ones have a gfid that is out of sync with their linkto
counterpart on the other replica. The difference never heals.

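The backend check amounts to comparing the trusted.gfid value of each copy of a file. A minimal sketch in Python, assuming the `getextattr -x` output layout shown in this message (a path line, then an offset column, 16 hex bytes and an ASCII column); the names `parse_gfids` and `gfids_differ` are made up for illustration and are not part of glusterfs:

```python
import re

HEX_BYTE = re.compile(r"^[0-9a-f]{2}$")


def parse_gfids(dump):
    """Map each path in a `getextattr -x trusted.gfid` dump to its gfid (hex string)."""
    gfids = {}
    path = None
    for raw in dump.splitlines():
        line = raw.strip()
        if line.startswith("/"):
            # A path line introduces the hex dump that follows it.
            path = line
            continue
        if path is None or not line:
            continue
        # Skip the offset column, keep the 16 gfid bytes, ignore the ASCII column.
        hex_bytes = line.split()[1:17]
        if len(hex_bytes) == 16 and all(HEX_BYTE.match(b) for b in hex_bytes):
            gfids[path] = "".join(hex_bytes)
    return gfids


def gfids_differ(dump):
    """Return True when the copies listed in the dump do not share one gfid."""
    return len(set(parse_gfids(dump).values())) > 1


# Sample taken from the Makefile.am output in this message.
MAKEFILE_DUMP = """\
/export/wd1a/usr/src/gnu/dist/binutils/bfd/Makefile.am
000 3a 4c 41 78 86 59 4f ab 97 65 d5 a6 f8 c5 5f 4d :LAx.YO..e...._M
/export/wd3a/usr/src/gnu/dist/binutils/bfd/Makefile.am
000 61 37 c5 59 90 83 42 88 a7 b8 ee 86 58 66 1f 13 a7.Y..B.....Xf..
"""

if __name__ == "__main__":
    print(gfids_differ(MAKEFILE_DUMP))  # True: the two copies disagree
```

Feeding it the output of the same command for every file in a directory would list the candidates for ESTALE on READDIR, if the diagnosis above is right.
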
Here is a file that triggers an ESTALE when I discover it through READDIR:

server# ls -l /export/*/usr/src/gnu/dist/binutils/bfd/Makefile.am
---------T  1 root  wheel      0 Jul 22 03:33 /export/wd1a/usr/src/gnu/dist/binutils/bfd/Makefile.am
-rw-r--r--  1 root  wheel  66102 Feb  2  2006 /export/wd3a/usr/src/gnu/dist/binutils/bfd/Makefile.am
server# getextattr -x trusted.gfid \
    /export/*/usr/src/gnu/dist/binutils/bfd/Makefile.am
/export/wd1a/usr/src/gnu/dist/binutils/bfd/Makefile.am
000   3a 4c 41 78 86 59 4f ab 97 65 d5 a6 f8 c5 5f 4d   :LAx.YO..e...._M
/export/wd3a/usr/src/gnu/dist/binutils/bfd/Makefile.am
000   61 37 c5 59 90 83 42 88 a7 b8 ee 86 58 66 1f 13   a7.Y..B.....Xf..

And here is a file that has no problem:

server# ls -l /export/*/usr/src/gnu/dist/binutils/bfd/COPYING
---------T  1 root  wheel      0 Jul 19 09:45 /export/wd1a/usr/src/gnu/dist/binutils/bfd/COPYING
-rw-r--r--  1 root  wheel  18009 Nov 26  2003 /export/wd3a/usr/src/gnu/dist/binutils/bfd/COPYING
server# getextattr -x trusted.gfid \
    /export/*/usr/src/gnu/dist/binutils/bfd/COPYING
/export/wd1a/usr/src/gnu/dist/binutils/bfd/COPYING
000   0b fb b5 6a 7d c0 4d 22 98 f8 9d 6c 64 5b ab b7   ...j}.M"...ld[..
/export/wd3a/usr/src/gnu/dist/binutils/bfd/COPYING
000   0b fb b5 6a 7d c0 4d 22 98 f8 9d 6c 64 5b ab b7   ...j}.M"...ld[..

As I understand it, on READDIR the glusterfs server reports the gfid of
the shadow linkto file to the client, while subsequent use of the file
reports the correct gfid, leading to the mismatch.

Are you suggesting that the FUSE implementation should filter out ESTALE
by issuing another LOOKUP? I can implement this, but I have trouble
understanding why you would have a bug report on this problem if Linux
FUSE already does that.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@xxxxxxxxxx