On Mon, Feb 7, 2011 at 9:02 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Mon, 7 Feb 2011, Brian Chrisman wrote:
>> My goal is merely to push this problem back on the NFSv4 client. If I
>> can tell it "your filehandle is expired", it should request a full
>> path lookup to re-establish the filehandle, as far as I can tell from
>> the specs.
>
> Oh I see...
>
>> I converted the ESTALE returns to NFS4ERR_FHEXPIRED in export.c to
>> take a stab at accomplishing this, but there are still ESTALEs coming
>> across the wire to the client. That's why I was looking to see where
>> other things could be going wrong.
>
> Hmm. Generally speaking, the inode is in the ceph client's cache _only_
> if it has an MDS capability, and as long as it holds the capability it has
> a stateful handle on it and will never get ESTALE. So my guess is it's
> coming from somewhere else in export.c. Can you crank up client debugging
> (see below) and reproduce?
>
> sage
>
>
>
> Crank up debugging on just about everything. Or you can just turn up
> export.c...
>
> $ cat /home/sage/ceph/src/script/kcon_most.sh
> #!/bin/sh -x

Cranking up only export.c, I see MDS communication trouble right around
the time I get the NFS stale filehandle.  I also ran the same test with
your full-debug script (copying over my build environment, building, then
deleting), but the copy operation hosed the cluster, with OSDs going down
and all kinds of other problems (probably an artifact of the kernel client
running on the same node as the OSDs, etc.).

The messages below are from the earlier run, with just export.c debugging
enabled:

ceph: export.c:171 : __cfh_to_dentry 10000005a56 ffff880004176850 dentry ffff88003b788a80
ceph: export.c:133 : __cfh_to_dentry 10000005a56 (1000000145a/1c60f9f)
ceph: export.c:171 : __cfh_to_dentry 10000005a56 ffff880004176850 dentry ffff88003b788a80
ceph: mds0 reconnect start
ceph: mds0 reconnect success
ceph: export.c:133 : __cfh_to_dentry 10000005a56 (1000000145a/1c60f9f)
ceph: export.c:171 : __cfh_to_dentry 10000005a56 ffff880004176850 dentry ffff88003b788a80
ceph: mds0 recovery completed
libceph: mds0 10.200.98.105:6800 socket closed
libceph: mds0 10.200.98.105:6800 connection failed
ceph: mds0 reconnect start
ceph: mds0 reconnect success
ceph: mds0 recovery completed
libceph: mds0 10.200.98.111:6813 socket closed
libceph: mds0 10.200.98.111:6813 connection failed
libceph: mds0 10.200.98.111:6813 connection failed
libceph: mds0 10.200.98.111:6813 connection failed
libceph: mds0 10.200.98.111:6813 connection failed
libceph: mds0 10.200.98.111:6813 connection failed
libceph: mds0 10.200.98.111:6813 connection failed
ceph: export.c:133 : __cfh_to_dentry 10000005a56 (1000000145a/1c60f9f)
ceph: export.c:171 : __cfh_to_dentry 10000005a56 ffff880004176850 dentry ffff88003b788a80
libceph: mds0 10.200.98.111:6813 connection failed
ceph: mds0 caps stale
ceph: mds0 caps stale
libceph: mds0 10.200.98.111:6813 connection failed
ceph: export.c:133 : __cfh_to_dentry 10000005a56 (1000000145a/1c60f9f)
ceph: export.c:171 : __cfh_to_dentry 10000005a56 ffff880004176850 dentry ffff88003b788a80
libceph: mds0 10.200.98.111:6813 connection failed
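
For reference, this is roughly how I enabled debugging for export.c only
(a minimal sketch, assuming CONFIG_DYNAMIC_DEBUG=y and debugfs mounted at
/sys/kernel/debug; it's not your kcon_most.sh, just the per-file dynamic
debug switch):

#!/bin/sh -x
# Sketch: toggle the ceph kernel client's pr_debug output via dynamic debug.
# Assumes CONFIG_DYNAMIC_DEBUG=y and debugfs mounted at /sys/kernel/debug.
ctl=/sys/kernel/debug/dynamic_debug/control

# enable debug output for fs/ceph/export.c only
echo 'file export.c +p' > $ctl

# or crank up everything in both modules instead:
# echo 'module ceph +p'    > $ctl
# echo 'module libceph +p' > $ctl

# and turn it back off when done:
# echo 'file export.c -p' > $ctl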