Yes, this is a GlusterFS problem. Adding the gluster-users ML.
On Thu, Sep 29, 2016 at 5:11 PM, Davide Ferrari <davide@xxxxxxxxxxxx> wrote:
Hello,

maybe this is more GlusterFS- than oVirt-related, but since oVirt integrates Gluster management and I'm experiencing the problem in an oVirt cluster, I'm writing here.

The problem is simple: I have a data domain mapped on a replica 3 arbiter 1 Gluster volume with six bricks, like this:

Status of volume: data_ssd
Gluster process                                           TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------------------
Brick vm01.storage.billy:/gluster/ssd/data/brick          49153     0          Y       19298
Brick vm02.storage.billy:/gluster/ssd/data/brick          49153     0          Y       6146
Brick vm03.storage.billy:/gluster/ssd/data/arbiter_brick  49153     0          Y       6552
Brick vm03.storage.billy:/gluster/ssd/data/brick          49154     0          Y       6559
Brick vm04.storage.billy:/gluster/ssd/data/brick          49152     0          Y       6077
Brick vm02.storage.billy:/gluster/ssd/data/arbiter_brick  49154     0          Y       6153
Self-heal Daemon on localhost                             N/A       N/A        Y       30746
Self-heal Daemon on vm01.storage.billy                    N/A       N/A        Y       196058
Self-heal Daemon on vm03.storage.billy                    N/A       N/A        Y       23205
Self-heal Daemon on vm04.storage.billy                    N/A       N/A        Y       8246

I created the volume originally this way:

# gluster volume create data_ssd replica 3 arbiter 1 \
    vm01.storage.billy:/gluster/ssd/data/brick \
    vm02.storage.billy:/gluster/ssd/data/brick \
    vm03.storage.billy:/gluster/ssd/data/arbiter_brick \
    vm03.storage.billy:/gluster/ssd/data/brick \
    vm04.storage.billy:/gluster/ssd/data/brick \
    vm02.storage.billy:/gluster/ssd/data/arbiter_brick
# gluster volume set data_ssd group virt
# gluster volume set data_ssd storage.owner-uid 36 && gluster volume set data_ssd storage.owner-gid 36
# gluster volume start data_ssd

Now, I've put the vm04 host into maintenance from oVirt, ticking the "Stop gluster" checkbox, and oVirt didn't complain about anything. But when I tried to run a new VM it complained about a "storage I/O problem", while the storage data domain status was always UP.

Looking in the Gluster logs I can see this:

[2016-09-29 11:01:01.556908] I [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2016-09-29 11:02:28.124151] E [MSGID: 108008] [afr-read-txn.c:89:afr_read_txn_refresh_done] 0-data_ssd-replicate-1: Failing READ on gfid bf5922b7-19f3-4ce3-98df-71e981ecca8d: split-brain observed. [Input/output error]
[2016-09-29 11:02:28.126580] W [MSGID: 108008] [afr-read-txn.c:244:afr_read_txn] 0-data_ssd-replicate-1: Unreadable subvolume -1 found with event generation 6 for gfid bf5922b7-19f3-4ce3-98df-71e981ecca8d. (Possible split-brain)
[2016-09-29 11:02:28.127374] E [MSGID: 108008] [afr-read-txn.c:89:afr_read_txn_refresh_done] 0-data_ssd-replicate-1: Failing FGETXATTR on gfid bf5922b7-19f3-4ce3-98df-71e981ecca8d: split-brain observed. [Input/output error]
[2016-09-29 11:02:28.128130] W [MSGID: 108027] [afr-common.c:2403:afr_discover_done] 0-data_ssd-replicate-1: no read subvols for (null)
[2016-09-29 11:02:28.129890] W [fuse-bridge.c:2228:fuse_readv_cbk] 0-glusterfs-fuse: 8201: READ => -1 gfid=bf5922b7-19f3-4ce3-98df-71e981ecca8d fd=0x7f09b749d210 (Input/output error)
[2016-09-29 11:02:28.130824] E [MSGID: 108008] [afr-read-txn.c:89:afr_read_txn_refresh_done] 0-data_ssd-replicate-1: Failing FSTAT on gfid bf5922b7-19f3-4ce3-98df-71e981ecca8d: split-brain observed. [Input/output error]
[2016-09-29 11:02:28.133879] W [fuse-bridge.c:767:fuse_attr_cbk] 0-glusterfs-fuse: 8202: FSTAT() /ba2bd397-9222-424d-aecc-eb652c0169d9/images/f02ac1ce-52cd-4b81-8b29-f8006d0469e0/ff4e49c6-3084-4234-80a1-18a67615c527 => -1 (Input/output error)
The message "W [MSGID: 108008] [afr-read-txn.c:244:afr_read_txn] 0-data_ssd-replicate-1: Unreadable subvolume -1 found with event generation 6 for gfid bf5922b7-19f3-4ce3-98df-71e981ecca8d. (Possible split-brain)" repeated 11 times between [2016-09-29 11:02:28.126580] and [2016-09-29 11:02:28.517744]
[2016-09-29 11:02:28.518607] E [MSGID: 108008] [afr-read-txn.c:89:afr_read_txn_refresh_done] 0-data_ssd-replicate-1: Failing STAT on gfid bf5922b7-19f3-4ce3-98df-71e981ecca8d: split-brain observed. [Input/output error]

Now, how is it possible to have a split brain if I stopped just ONE server, which had just ONE of the six bricks, and it was cleanly shut down with maintenance mode from oVirt?
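[Editor's note: the brick order on the create command line determines the replica sets. With "replica 3 arbiter 1", each consecutive group of three bricks (two data bricks plus one arbiter) forms one subvolume. A minimal sketch of that grouping, in plain Python rather than anything from Gluster itself:]

```python
# Illustration only: how the bricks listed on "gluster volume create"
# are grouped into replica sets. With "replica 3 arbiter 1", every
# consecutive group of three bricks forms one subvolume, and the third
# brick in each group is the arbiter.
bricks = [
    "vm01.storage.billy:/gluster/ssd/data/brick",
    "vm02.storage.billy:/gluster/ssd/data/brick",
    "vm03.storage.billy:/gluster/ssd/data/arbiter_brick",
    "vm03.storage.billy:/gluster/ssd/data/brick",
    "vm04.storage.billy:/gluster/ssd/data/brick",
    "vm02.storage.billy:/gluster/ssd/data/arbiter_brick",
]

replica_count = 3
replica_sets = [bricks[i:i + replica_count]
                for i in range(0, len(bricks), replica_count)]

for n, rset in enumerate(replica_sets):
    # Gluster names these subvolumes data_ssd-replicate-0, -1, ...
    print(f"data_ssd-replicate-{n}: data bricks {rset[:-1]}, arbiter {rset[-1]}")
```

So data_ssd-replicate-1, the subvolume named in the split-brain log errors, is the set containing the vm03 brick, the stopped vm04 brick, and the vm02 arbiter brick.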
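[Editor's note: to inspect the file behind a gfid reported in the logs, note that each brick keeps a hard link to every file under its .glusterfs/ directory, indexed by the first two byte pairs of the gfid. A small helper sketch (not part of Gluster; the brick path below is from this volume's layout):]

```python
from pathlib import Path

def gfid_backend_path(brick_root: str, gfid: str) -> Path:
    # Gluster stores a hard link (a symlink for directories) to each file
    # under <brick>/.glusterfs/<aa>/<bb>/<gfid>, where aa and bb are the
    # first two hex-byte pairs of the gfid. Useful for finding which file
    # a split-brain log line refers to.
    return Path(brick_root) / ".glusterfs" / gfid[:2] / gfid[2:4] / gfid

print(gfid_backend_path("/gluster/ssd/data/brick",
                        "bf5922b7-19f3-4ce3-98df-71e981ecca8d"))
```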
--
Davide Ferrari
Senior Systems Engineer
_______________________________________________
Users mailing list
Users@xxxxxxxxx
http://lists.ovirt.org/mailman/listinfo/users
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users