Hi Bruce,

On Sun, Mar 1, 2015 at 2:14 PM, Trond Myklebust
<trond.myklebust@xxxxxxxxxxxxxxx> wrote:
> Hi,
>
> When doing testing of NFSv3 loopback mounts (client and server are on
> the same IP address), I'm seeing a very reproducible hang in which the
> client stops receiving data from the server. The TCP connection is still
> marked as established, and the server appears to continue to receive and
> send data, however the client does not.
>
> So far, I've reproduced on both v4.0-rc1, and the Fedora v3.18.7 kernel.
>
> The reproducer is simply to loopback mount using NFSv3, and then run the
> 'fsx' filesystem exerciser. I'm usually able to trigger the hang with
> "fsx -N 100000 foobar".
>
> I've attached a couple of wireshark trace of a few frames just before
> and during the hang in case it jogs any memories.

This bug appears to go away when I disable the splice()-based reads by
clearing the RQ_SPLICE_OK flag. I noticed that it always involved a
combination of a READ and a truncating SETATTR call.

Are you sure that it is safe to share pagecache pages directly with
sendpage() in this way? As far as I can tell, there is no locking to
prevent them from being modified while in the TCP send queue.

Cheers
  Trond

--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@xxxxxxxxxxxxxxx