Replica 3 volume with forced quorum 1 fault tolerance and recovery

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



It seems that consistency of replica 3 volume with quorum forced to 1 becomes
broken after a few forced volume restarts initiated after 2 brick failures.
At least it breaks GFAPI clients, and even volume restart doesn't help.

Volume setup is:

Volume Name: test0
Type: Replicate
Volume ID: 919352fb-15d8-49cb-b94c-c106ac68f072
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.1.112:/glusterfs/test0-000
Brick2: 192.168.1.112:/glusterfs/test0-001
Brick3: 192.168.1.112:/glusterfs/test0-002
Options Reconfigured:
cluster.quorum-count: 1
cluster.quorum-type: fixed
cluster.granular-entry-heal: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

Client is fio with the following options:

[global]
name=write
filename=testfile
ioengine=gfapi_async
volume=test0
brick=localhost
create_on_open=1
rw=randwrite
direct=1
numjobs=1
time_based=1
runtime=600

[test-4-kbytes]
bs=4k
size=1G
iodepth=128

How to reproduce:

0) start the volume;
1) run fio;
2) run 'gluster volume status', select 2 arbitrary brick processes
   and kill them;
3) make sure fio is OK;
4) wait a few seconds, then issue 'gluster volume start [VOL] force'
   to restart bricks, and finally issue 'gluster volume status' again
   to check whether all bricks are running;
5) restart from 2).

This is likely to work for a few times but, sooner or later, it breaks
at 3) and fio detects an I/O error, most probably EIO or ENOTCONN. Starting
from this point, killing and restarting fio yields in error in glfs_creat(),
and even the manual volume restart doesn't help.

NOTE: as of 7914c6147adaf3ef32804519ced850168fff1711, fio's gfapi_async
engine is still incomplete and _silently ignores I/O errors_. Currently
I'm using the following tweak to detect and report them (YMMV, consider
experimental):

diff --git a/engines/glusterfs_async.c b/engines/glusterfs_async.c
index 0392ad6e..27ebb6f1 100644
--- a/engines/glusterfs_async.c
+++ b/engines/glusterfs_async.c
@@ -7,6 +7,7 @@
 #include "gfapi.h"
 #define NOT_YET 1
 struct fio_gf_iou {
+	struct thread_data *td;
 	struct io_u *io_u;
 	int io_complete;
 };
@@ -80,6 +81,7 @@ static int fio_gf_io_u_init(struct thread_data *td, struct io_u *io_u)
     }
     io->io_complete = 0;
     io->io_u = io_u;
+    io->td = td;
     io_u->engine_data = io;
 	return 0;
 }
@@ -95,7 +97,20 @@ static void gf_async_cb(glfs_fd_t * fd, ssize_t ret, void *data)
 	struct fio_gf_iou *iou = io_u->engine_data;

 	dprint(FD_IO, "%s ret %zd\n", __FUNCTION__, ret);
-	iou->io_complete = 1;
+	if (ret != io_u->xfer_buflen) {
+		if (ret >= 0) {
+			io_u->resid = io_u->xfer_buflen - ret;
+			io_u->error = 0;
+			iou->io_complete = 1;
+		} else
+			io_u->error = errno;
+	}
+
+	if (io_u->error) {
+		log_err("IO failed (%s).\n", strerror(io_u->error));
+		td_verror(iou->td, io_u->error, "xfer");
+	} else
+		iou->io_complete = 1;
 }

 static enum fio_q_status fio_gf_async_queue(struct thread_data fio_unused * td,

--

Dmitry
_______________________________________________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968




Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-devel




[Index of Archives]     [Gluster Users]     [Ceph Users]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux