On Jul 5, 2012, at 3:55 PM, Myklebust, Trond wrote:

> On Thu, 2012-07-05 at 20:23 +0100, Charles 'Boyo wrote:
>>>> What isn't expected behaviour is for the client to DELEGRETURN
>>>> immediately upon receiving the delegation. Nor is it expected that the
>>>> READ would fail to recover in an ordinary DELEGRETURN situation.
>>>>
>>>> This is why I'm interested in seeing what happened before the OPEN.
>>>>
>>>
>>> Attached is a minimally redacted trace which contains all the NFS
>>> calls and responses before and through this particular DELEGRETURN
>>> (frame 6084).
>>> I am unable to provide the full packet capture due to the sensitivity
>>> of the data contained therein.
>>
>> Trond, did the attached trace confirm the issue as suspected?
>>
>> Charles
>
> No. What is happening is that the client does an OPEN of file
> "account.info". It then closes the file and for some reason that I don't
> yet understand, it decides to return the delegation.
>
> Then the application does another OPEN, which races with the delegation
> return. Because of the race, the server returns the same delegation as
> it did in the first open call. The client (which has now forgotten about
> the original open) sees what it thinks is a new delegation, and so it
> adopts it.
>
> So this explains why the READs end up trying to use a returned
> delegation. What remains to be explained is:
>
> a) Why did the client return the delegation in the first place? Is it in
> some extreme memory pressure situation, or does it perhaps have some
> funky setting for /proc/sys/vm/vfs_cache_pressure?
>
> b) Why doesn't the recovery scenario kick in and do the right thing?

My guess is that EL6.2 kernels don't have the mainline patch that adds support for recovering a single bad state ID.
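
For anyone following along, the ordering Trond describes can be sketched as a toy model. This is only an illustration of the sequence of events, not the Linux NFS client or server code: the class names, the integer state IDs, and the "NFS4ERR_BAD_STATEID" string returned by the toy READ are all assumptions made for the example.

```python
# Toy model of the OPEN / DELEGRETURN race. All names here are hypothetical.

class ToyServer:
    def __init__(self):
        self.next_id = 1
        self.delegations = {}  # filename -> delegation state ID currently granted

    def open(self, filename):
        # Reuse the outstanding delegation if there is one, else grant a new one.
        if filename not in self.delegations:
            self.delegations[filename] = self.next_id
            self.next_id += 1
        return self.delegations[filename]

    def delegreturn(self, filename, stateid):
        if self.delegations.get(filename) == stateid:
            del self.delegations[filename]

    def read(self, filename, stateid):
        # A READ that presents a returned delegation is rejected (assumed error name).
        if self.delegations.get(filename) == stateid:
            return "OK"
        return "NFS4ERR_BAD_STATEID"


class ToyClient:
    def __init__(self, server):
        self.server = server
        self.held = {}  # filename -> delegation state ID the client believes it holds

    def open(self, filename):
        self.held[filename] = self.server.open(filename)

    def start_delegreturn(self, filename):
        # The client forgets the delegation locally; the returned callable
        # stands for the DELEGRETURN arriving at the server later.
        stateid = self.held.pop(filename)
        return lambda: self.server.delegreturn(filename, stateid)

    def read(self, filename):
        return self.server.read(filename, self.held[filename])


def run(race):
    server = ToyServer()
    client = ToyClient(server)
    client.open("account.info")                      # first OPEN, delegation granted
    deliver = client.start_delegreturn("account.info")
    if race:
        client.open("account.info")                  # second OPEN is processed first
        deliver()                                    # DELEGRETURN is processed afterwards
    else:
        deliver()                                    # DELEGRETURN is processed first
        client.open("account.info")                  # second OPEN gets a fresh delegation
    return client.read("account.info")
```

With race=True the client ends up reading with a state ID the server has already discarded; with race=False the second OPEN gets a fresh delegation and the READ succeeds.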
-- 
Chuck Lever
chuck[dot]lever[at]oracle[dot]com