On 16/09/19 7:34 pm, Erik Jacobson wrote:
Example errors:

ex1
[2019-09-06 18:26:42.665050] E [MSGID: 108008] [afr-read-txn.c:123:afr_read_txn_refresh_done] 0-cm_shared-replicate-1: Failing ACCESS on gfid ee3f5646-9368-4151-92a3-5b8e7db1fbf9: split-brain observed. [Input/output error]
Okay so 0-cm_shared-replicate-1 means these 3 bricks:

Brick4: 172.23.0.6:/data/brick_cm_shared
Brick5: 172.23.0.7:/data/brick_cm_shared
Brick6: 172.23.0.8:/data/brick_cm_shared
ex2
[2019-09-06 18:26:55.359272] E [MSGID: 108008] [afr-read-txn.c:123:afr_read_txn_refresh_done] 0-cm_shared-replicate-1: Failing READLINK on gfid f2be38c2-1cd1-486b-acad-17f2321a18b3: split-brain observed. [Input/output error]
[2019-09-06 18:26:55.359367] W [MSGID: 112199] [nfs3-helpers.c:3435:nfs3_log_readlink_res] 0-nfs-nfsv3: /image/images_ro_nfs/toss-20190730/usr/lib64/libslurm.so.32 => (XID: 88651c80, READLINK: NFS: 5(I/O error), POSIX: 5(Input/output error)) target: (null)

The errors seem to happen only on the 'replicate' volume where one server is down in the subvolume (of course, any NFS server will trigger that when it accesses the files on the degraded volume).
Were there any pending self-heals for this volume? Is it possible that the server that is down (one of Brick 4, 5 or 6) had the only good copy, and the other two online bricks had bad copies (needing heal)? Clients can get EIO in that case.
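If you haven't already, it is worth looking at the heal queues on the online bricks. Something like the following (assuming the volume is named cm_shared, going by the brick paths) should list the pending entries and any files that Gluster itself considers split-brained:

    # gluster volume heal cm_shared info
    # gluster volume heal cm_shared info split-brain

If the gfids from your error logs show up in the split-brain listing, it is a genuine split-brain rather than a transient error during the brick outage.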
When you say accessing the file from the compute nodes afterwards works fine, is that still with that one server (brick) down?
There was a case of AFR reporting spurious split-brain errors, but that was fixed long back (http://review.gluster.org/16362) and the fix seems to be present in glusterfs-4.1.6.
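To rule out a genuine split-brain on one of the files from your logs, you could also compare the AFR changelog xattrs of that file directly on the two online bricks (path taken from your readlink error; the exact trusted.afr.cm_shared-client-* key names depend on the brick indices in your volume):

    # getfattr -d -m . -e hex /data/brick_cm_shared/image/images_ro_nfs/toss-20190730/usr/lib64/libslurm.so.32

If the two online bricks each carry non-zero trusted.afr.* counters blaming the other, that is typically when clients start getting EIO.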
Side note: why are you using replica 9 for the ctdb volume? Development and testing are usually done on a (distributed) replica 3 setup.
Thanks,
Ravi