Re: 9pfs hangs since 4.7

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 4 Jan 2017 01:47:53 +0000
Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:

> On Wed, Jan 04, 2017 at 01:34:50AM +0200, Tuomas Tynkkynen wrote:
> 
> > I got these logs from the server & client with 9p tracepoints enabled:
> > 
> > https://gist.githubusercontent.com/dezgeg/02447100b3182167403099fe7de4d941/raw/3772e408ddf586fb662ac9148fc10943529a6b99/dmesg%2520with%25209p%2520trace
> > https://gist.githubusercontent.com/dezgeg/e1e0c7f354042e1d9bdf7e9135934a65/raw/3a0e3b4f7a5229fd0be032c6839b578d47a21ce4/qemu.log
> >   
> 
> Lovely...
> 
> 27349:out 8 0001 		TSTATFS, tag 1
> 27350:out 12 0001		TLOPEN, tag 1
> 27351:complete 9 0001		RSTATFS, tag 1
> 27352:complete 13 0001		RLOPEN, tag 1
> 
> 27677:out 8 0001		TSTATFS, tag 1
> 27678:out 12 0001		TLOPEN, tag 1
> 27679:complete 9 0001		RSTATFS, tag 1
> 27680:complete 13 0001		RLOPEN, tag 1
> 
> 29667:out 8 0001		TSTATFS, tag 1
> 29668:out 110 0001		TWALK, tag 1
> 29669:complete 9 0001		RSTATFS, tag 1
> 29670:complete 7 0001		RLERROR, tag 1
> 
> 42317:out 110 0001		TWALK, tag 1
> 42318:out 8 0001		TSTATFS, tag 1
> 42319:complete 9 0001		RSTATFS, tag 1
> 42320:complete 7 0001		RERROR, tag 1
> 
> Those are outright protocol violations: tag can be reused only after either
> a reply bearing the same tag has arrived *or* TFLUSH for that tag had been
> issued and successful (type == RFLUSH) reply bearing the tag of that TFLUSH
> has arrived.  Your log doesn't contain any TFLUSH (108) at all, so it should've
> been plain and simple "no reuse until server sends a reply with matching tag".
> 

Argh, I had completely forgotten there is another 9p mount involved, so the log
is mixed between the two mounts. Apologies, there is now a new trace attached
with the V9fsState pointer dumped in the mix as well.

> Otherwise the the dump looks sane.  In the end all issued requests had been
> served, so it's not as if the client had been waiting for a free tag or for
> a response to be produced by the server.
> 
> Interesting...  dmesg doesn't seem to contain the beginning of the 9P trace -
> only 89858 out of 108818 in the dump, even though it claims to have lost
> only 4445 events.  [...]

Here's logs that should be complete this time:

https://gist.githubusercontent.com/dezgeg/08629d4c8ca79da794bc087e5951e518/raw/a1a82b9bc24e5282c82beb43a9dc91974ffcf75a/9p.qemu.log
https://gist.githubusercontent.com/dezgeg/1d5f1cc0647e336c59989fc62780eb2e/raw/d7623755a895b0441c608ddba366d20bbf47ab0b/9p.dmesg.log

> 
> Interesting...  Which version of qemu/what arguments are you using there?

This is QEMU 2.7.0, with these on the server side:
    -virtfs local,path=/nix/store,security_model=none,mount_tag=store
    -virtfs local,path=$TMPDIR/xchg,security_model=none,mount_tag=xchg

and on the client side:

store on /nix/store type 9p (rw,relatime,dirsync,trans=virtio,version=9p2000.L,cache=loose)
xchg on /tmp/xchg type 9p (rw,relatime,dirsync,trans=virtio,version=9p2000.L,cache=loose)

'store' is the one receiving the hammering, 'xchg' I think is mostly sitting idle.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux