Hi

On 05/12/2018 09:14, Cristian Marussi wrote:
> Hi
>
> On 04/12/2018 19:31, Trond Myklebust wrote:
>> On Tue, 2018-12-04 at 14:24 -0500, Trond Myklebust wrote:
>>> The RPC code is occasionally hanging when the receive code fails to
>>> empty the socket buffer due to a partial read of the data. When we
>>> convert that to an EAGAIN, it appears we occasionally leave data in
>>> the socket. The fix is to just keep reading until the socket returns
>>> EAGAIN/EWOULDBLOCK.
>>>
>>> Reported-by: Catalin Marinas <catalin.marinas@xxxxxxx>
>>> Reported-by: Cristian Marussi <cristian.marussi@xxxxxxx>
>>> Reported-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
>>> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
>>> ---
>>

[snip]

Applying the patch on top of your linux-nfs next:

8739cbe10efb (HEAD -> linux-next) SUNRPC: Fix RPC receive hangs
0a9a4304f361 (origin/linux-next) SUNRPC: Fix a potential race in xprt_connect()
71700bb96047 SUNRPC: Fix a memory leak in call_encode()
8dae5398ab1a SUNRPC: Fix leak of krb5p encode pages

and testing on arm64 with 64k pages, without the rsize workaround,
SOLVES the issue for me.

No hang or slowdown launching LKP/LTP.

dbench results seem fine again (as with the rsize workaround):

 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX     106349    11.479   400.908
 Close          78111    16.508   414.883
 Rename          4500    19.722   246.805
 Unlink         21475     3.625   196.797
 Qpathinfo      96460     1.786   278.724
 Qfileinfo      16829    10.044   233.708
 Qfsinfo        17615     2.119   319.131
 Sfileinfo       8700    16.819   145.051
 Find           37251     3.389   264.889
 WriteX         52509     0.048     6.663
 ReadX         166675     0.655   189.954
 LockX            348    11.840   227.481
 UnlockX          348     2.563    72.949
 Flush           7470    20.296   274.855

Throughput 5.50353 MB/sec  6 clients  6 procs  max_latency=414.902 ms

Thanks

Cristian
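
For anyone following the thread, the idea in the quoted patch description (keep reading until the socket returns EAGAIN/EWOULDBLOCK) can be illustrated in user space. The following is only a minimal Python sketch of that pattern and not the kernel SUNRPC code; drain() and demo() are hypothetical names, and the demo assumes a Unix-like system where data written to one end of a socketpair() is immediately readable on the other.

```python
import socket


def drain(sock, chunk_size=512):
    """Read from a non-blocking socket until it has nothing left.

    Hypothetical helper: keeps calling recv() until the kernel reports
    EAGAIN/EWOULDBLOCK (raised as BlockingIOError in Python) or the
    peer closes the connection.
    """
    chunks = []
    while True:
        try:
            data = sock.recv(chunk_size)
        except BlockingIOError:
            # EAGAIN/EWOULDBLOCK: the socket buffer is empty for now.
            break
        if not data:
            # Zero-length read: the peer closed the connection.
            break
        chunks.append(data)
    return b"".join(chunks)


def demo():
    """Send 4000 bytes and drain them with reads of at most 512 bytes."""
    writer, reader = socket.socketpair()
    reader.setblocking(False)
    try:
        writer.sendall(b"x" * 4000)
        # A single recv(512) would leave most of the data in the socket;
        # looping until BlockingIOError empties it.
        return len(drain(reader, chunk_size=512))
    finally:
        writer.close()
        reader.close()
```

Stopping after the first short read is the failure mode the patch description mentions: data stays in the socket and nothing prompts the reader to come back for it. Looping until the would-block error avoids that.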