On Wed, 2018-02-07 at 09:03 +0100, Mkrtchyan, Tigran wrote: > If I understand you correctly, then completion time is the sum of all > IO requests sent from a client to the server. Yes. Please see the draft flexfiles spec, which documents this (and please let Tom know if you find his text to be unclear). > Then with average per-op round-trip of 2.7s I should see at lease one > request which takes longer than average. > However, the longest rpc request round-trip time I see with wireshark > is 170ms. Do I miss something? In that case, I suspect most of the 2ms is congestion in your RPC layer. The intention of the layoutstats is to convey the _application_ view of performance, so it does include queue latency while waiting to write to the socket. > > Thanks, > Tigran. > > ----- Original Message ----- > > From: "Trond Myklebust" <trondmy@xxxxxxxxxxxxxxx> > > To: "Tigran Mkrtchyan" <tigran.mkrtchyan@xxxxxxx>, "linux-nfs" <lin > > ux-nfs@xxxxxxxxxxxxxxx> > > Sent: Tuesday, February 6, 2018 11:29:48 PM > > Subject: Re: wrong latency info for flexfiles on layout return > > On Tue, 2018-02-06 at 23:10 +0100, Mkrtchyan, Tigran wrote: > > > > > > Hi Trond et al. > > > > > > by exploring data provided on layoutreturn I can see that some > > > information is wrong, in particular the IO latency completition > > > time for write: > > > > > > { > > > "iostats":[ > > > { > > > "deviceAddr":"tcp://192.168.178.40.128.2", > > > "duration":{ > > > "seconds":5, > > > "nseconds":285924462 > > > }, > > > "readInfo":{ > > > "bytes":0, > > > "latency":{ > > > "completitionTime":{ > > > "seconds":0, > > > "nseconds":0 > > > }, > > > "bytesRequested":0, > > > "opsCompleted":0, > > > "busyTime":{ > > > "seconds":0, > > > "nseconds":0 > > > }, > > > "opsRequested":0, > > > "bytesCompleted":0, > > > "bytesNotDelivered":0 > > > }, > > > "count":0 > > > }, > > > "fh":"01caffee00000000b1f9635e000d0000080000000000000005 > > > 0130 > > > ", > > > "writeInfo":{ > > > "bytes":624193536, > > > "latency":{ > > > "completitionTime":{ > > > "seconds":25737, > > > "nseconds":816217921 > > > }, > > > "bytesRequested":624193536, > > > "opsCompleted":9526, > > > "busyTime":{ > > > "seconds":5, > > > "nseconds":224017937 > > > }, > > > "opsRequested":9526, > > > "bytesCompleted":624193536, > > > "bytesNotDelivered":0 > > > }, > > > "count":9526 > > > }, > > > "deviceId":"00000001000000000000000000000000", > > > "open-stateid":"5a7a259b0000000100000004" > > > } > > > ] > > > } > > > > > > > > > you can see this with wireshark as well. > > > > > > > Can you explain what you think is wrong? With the above aggregate > > completion time, I get an average per-op round-trip time of > > 25737.816/9526 = 2.7s, which looks a little high given that the > > duration is only 5 seconds, but it's not horribly unlikely if your > > DSes > > are processing most of these operations in parallel. > > > > -- > > Trond Myklebust > > Linux NFS client maintainer, PrimaryData > > trond.myklebust@xxxxxxxxxxxxxxx > > N�����r��y���b�X��ǧv�^�){.n�+����{���"��^n�r���z���h����&���G���h� > > (�階�ݢj"���m�����z�ޖ���f���h���~�m� > > -- Trond Myklebust Linux NFS client maintainer, PrimaryData trond.myklebust@xxxxxxxxxxxxxxx ��.n��������+%������w��{.n�����{��w���jg��������ݢj����G�������j:+v���w�m������w�������h�����٥