Re: error=Invalid slot

Hi Chuck,


On Mon, 2019-04-15 at 11:04 -0400, Chuck Lever wrote:
> Just happened again. Any thoughts about where I should start looking?
> 
> Mon Apr 15 11:01:40 EDT 2019
> 4k100test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B,
> (T) 4096B-4096B, ioengine=libaio, iodepth=1024
> ...
> fio-3.1
> Starting 12 processes
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> 4k100test: Laying out IO file (1 file / 1024MiB)
> fio: native_fallocate call failed: Operation not supported
> fio: io_u error on file 4k100test.7.0: Invalid slot: read
> offset=938229760, buflen=4096

Does the following patch fix the race?

8<--------------------------------------
From 4c8759eafad9bb7ea2626a53296e30618aeefcc7 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
Date: Mon, 15 Apr 2019 11:54:13 -0400
Subject: [PATCH] SUNRPC: Ignore queue transmission errors on successful
 transmission

If a request transmission fails due to write space or slot unavailability
errors, but the queued task then gets transmitted before it has time to
process the error in call_transmit_status() or call_bc_transmit_status(),
we need to suppress the transmission error code to prevent it from leaking
out of the RPC layer.

Reported-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
---
 net/sunrpc/clnt.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index fa900bb44cd5..369a2648dafc 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -2101,8 +2101,8 @@ call_transmit_status(struct rpc_task *task)
 	 * test first.
 	 */
 	if (rpc_task_transmitted(task)) {
-		if (task->tk_status == 0)
-			xprt_request_wait_receive(task);
+		task->tk_status = 0;
+		xprt_request_wait_receive(task);
 		return;
 	}
 
@@ -2187,6 +2187,9 @@ call_bc_transmit_status(struct rpc_task *task)
 {
 	struct rpc_rqst *req = task->tk_rqstp;
 
+	if (rpc_task_transmitted(task))
+		task->tk_status = 0;
+
 	dprint_status(task);
 
 	switch (task->tk_status) {
-- 
2.20.1
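
For anyone following the thread, here is a rough userspace sketch of the
race the patch description refers to. It is not kernel code: struct
toy_task, the toy_* helpers and TOY_EBADSLT are hypothetical stand-ins,
and the only thing modelled is the "already transmitted" branch of
call_transmit_status() before and after the change. It assumes the
leaked error is the slot-unavailable one, since that is what fio prints
as "Invalid slot".

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: stand-in for the kernel's EBADSLT ("Invalid slot"). */
#define TOY_EBADSLT 57

/* Hypothetical, heavily reduced stand-in for struct rpc_task. */
struct toy_task {
	int  tk_status;      /* 0, or a negative errno from the send attempt */
	bool transmitted;    /* models rpc_task_transmitted(task) */
	bool waiting_reply;  /* models xprt_request_wait_receive(task) having run */
};

/*
 * Build a task in the racy state: its own send attempt failed with err,
 * but the queued request was then transmitted by another context before
 * the task got to look at its status.
 */
struct toy_task toy_raced_task(int err)
{
	assert(err < 0);
	struct toy_task t = {
		.tk_status = err,
		.transmitted = true,
		.waiting_reply = false,
	};
	return t;
}

/* Pre-patch branch: a stale error skips the wait and stays in tk_status. */
void toy_transmit_status_old(struct toy_task *t)
{
	if (t->transmitted) {
		if (t->tk_status == 0)
			t->waiting_reply = true;
		return;
	}
}

/* Post-patch branch: the request went out, so the stale error is dropped. */
void toy_transmit_status_new(struct toy_task *t)
{
	if (t->transmitted) {
		t->tk_status = 0;
		t->waiting_reply = true;
		return;
	}
}
```

Running both helpers on toy_raced_task(-TOY_EBADSLT) shows the
difference: the old branch leaves tk_status at -TOY_EBADSLT and never
waits for the reply, while the new branch clears the status and waits.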

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx





