It's strange: I've tried to trigger the error again by putting vm04 into maintenance and stopping the gluster service (from the ovirt GUI), and now the VM starts correctly. Maybe the arbiter did indeed blame the brick that was still up before, but how is that possible?
The only (maybe big) difference from the previous, erroneous situation is that back then I had done maintenance (+ reboot) of 3 of my 4 hosts. Maybe I should have left more time between one reboot and the next?

2016-09-29 14:16 GMT+02:00 Ravishankar N <ravishankar@xxxxxxxxxx>:
Does `gluster volume heal data_ssd info split-brain` report that the file is in split-brain, with vm04 still being down?

On 09/29/2016 05:18 PM, Sahina Bose wrote:
Yes, this is a GlusterFS problem. Adding gluster users ML
On Thu, Sep 29, 2016 at 5:11 PM, Davide Ferrari <davide@xxxxxxxxxxxx> wrote:
Hello,

maybe this is more glusterfs than ovirt related, but since oVirt integrates Gluster management and I'm experiencing the problem in an ovirt cluster, I'm writing here.

The problem is simple: I have a data domain mapped on a replica 3 arbiter 1 Gluster volume with 6 bricks, like this:

Status of volume: data_ssd
Gluster process                                           TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------------------
Brick vm01.storage.billy:/gluster/ssd/data/brick          49153     0          Y       19298
Brick vm02.storage.billy:/gluster/ssd/data/brick          49153     0          Y       6146
Brick vm03.storage.billy:/gluster/ssd/data/arbiter_brick  49153     0          Y       6552
Brick vm03.storage.billy:/gluster/ssd/data/brick          49154     0          Y       6559
Brick vm04.storage.billy:/gluster/ssd/data/brick          49152     0          Y       6077
Brick vm02.storage.billy:/gluster/ssd/data/arbiter_brick  49154     0          Y       6153
Self-heal Daemon on localhost                             N/A       N/A        Y       30746
Self-heal Daemon on vm01.storage.billy                    N/A       N/A        Y       196058
Self-heal Daemon on vm03.storage.billy                    N/A       N/A        Y       23205
Self-heal Daemon on vm04.storage.billy                    N/A       N/A        Y       8246

Now, I've put the vm04 host into maintenance from ovirt, ticking the "Stop gluster" checkbox, and ovirt didn't complain about anything. But when I tried to run a new VM it complained about a "storage I/O problem", while the storage data status was always UP.

Looking in the gluster logs I can see this:
[2016-09-29 11:01:01.556908] I [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2016-09-29 11:02:28.124151] E [MSGID: 108008] [afr-read-txn.c:89:afr_read_txn_refresh_done] 0-data_ssd-replicate-1: Failing READ on gfid bf5922b7-19f3-4ce3-98df-71e981ecca8d: split-brain observed. [Input/output error]
[2016-09-29 11:02:28.126580] W [MSGID: 108008] [afr-read-txn.c:244:afr_read_txn] 0-data_ssd-replicate-1: Unreadable subvolume -1 found with event generation 6 for gfid bf5922b7-19f3-4ce3-98df-71e981ecca8d. (Possible split-brain)
[2016-09-29 11:02:28.127374] E [MSGID: 108008] [afr-read-txn.c:89:afr_read_txn_refresh_done] 0-data_ssd-replicate-1: Failing FGETXATTR on gfid bf5922b7-19f3-4ce3-98df-71e981ecca8d: split-brain observed. [Input/output error]
[2016-09-29 11:02:28.128130] W [MSGID: 108027] [afr-common.c:2403:afr_discover_done] 0-data_ssd-replicate-1: no read subvols for (null)
[2016-09-29 11:02:28.129890] W [fuse-bridge.c:2228:fuse_readv_cbk] 0-glusterfs-fuse: 8201: READ => -1 gfid=bf5922b7-19f3-4ce3-98df-71e981ecca8d fd=0x7f09b749d210 (Input/output error)
[2016-09-29 11:02:28.130824] E [MSGID: 108008] [afr-read-txn.c:89:afr_read_txn_refresh_done] 0-data_ssd-replicate-1: Failing FSTAT on gfid bf5922b7-19f3-4ce3-98df-71e981ecca8d: split-brain observed. [Input/output error]
If yes, could you provide the extended attributes of this gfid from all 3 bricks:
getfattr -d -m . -e hex /path/to/brick/bf/59/bf5922b7-19f3-4ce3-98df-71e981ecca8d
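For reading the output of that command, here is a minimal Python sketch. It relies on the usual layout of the AFR changelog attributes: each `trusted.afr.*` value is 12 bytes, read as three big-endian 32-bit counters for pending data, metadata and entry operations. The function names and the sample values are made up for illustration and are not taken from this thread; which `client-N` index corresponds to which brick is also an assumption to check against your own volume.

```python
def decode_afr_xattr(hex_value):
    """Split one trusted.afr.* value into its three pending counters."""
    digits = hex_value.strip().lower()
    if digits.startswith("0x"):
        digits = digits[2:]
    if len(digits) != 24:
        raise ValueError("expected 12 bytes (24 hex digits), got %r" % hex_value)
    # Three big-endian 32-bit counters: data, metadata, entry.
    data, metadata, entry = (int(digits[i:i + 8], 16) for i in (0, 8, 16))
    return {"data": data, "metadata": metadata, "entry": entry}


def parse_getfattr(output):
    """Map every trusted.afr.* name in getfattr output to its decoded counters."""
    decoded = {}
    for line in output.splitlines():
        name, sep, value = line.strip().partition("=")
        if sep and name.startswith("trusted.afr."):
            decoded[name] = decode_afr_xattr(value)
    return decoded


# Invented sample, NOT output from the cluster discussed here.
SAMPLE = """\
trusted.afr.data_ssd-client-3=0x000000000000000000000000
trusted.afr.data_ssd-client-4=0x000000020000000000000000
trusted.gfid=0xbf5922b719f34ce398df71e981ecca8d
"""

for name, counters in parse_getfattr(SAMPLE).items():
    print(name, counters)
```

A non-zero counter on one brick means that brick holds pending operations against (i.e. "blames") the brick named in the attribute. Comparing the decoded values from all three bricks shows who blames whom.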
If no, then I'm guessing that it is not in actual split-brain (hence the 'Possible split-brain' message). If the node you brought down contains the only good copy of the file (i.e. the other data brick and arbiter are up, and the arbiter 'blames' this other brick), all I/O is failed with EIO to prevent the file from getting into actual split-brain. The heals will happen when the good node comes up, and I/O should be allowed again in that case.
-Ravi
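The scenario described above can be sketched as a toy model in Python. This is a deliberately simplified illustration and not the AFR implementation: it assumes a data brick is readable only if it is up and no brick that is currently up blames it, and that the arbiter never serves reads. The brick labels and the blame relation are hypothetical.

```python
def readable_sources(data_bricks, up, blames):
    """Return the data bricks that could serve a read in this toy model."""
    accused = set()
    for accuser, targets in blames.items():
        # Only bricks that are up can have their blame markers consulted.
        if accuser in up:
            accused.update(targets)
    return [b for b in data_bricks if b in up and b not in accused]


def read_result(data_bricks, up, blames):
    """'OK' if at least one readable source exists, otherwise 'EIO'."""
    return "OK" if readable_sources(data_bricks, up, blames) else "EIO"


DATA = ["vm03", "vm04"]          # data bricks of the affected replica set
ARBITER = "vm02-arbiter"         # arbiter holds no file data, only metadata
# Hypothetical state: the arbiter still blames vm03 (heal not yet finished).
blames = {ARBITER: {"vm03"}}

# vm04 (the only good copy) is down: no readable source is left.
print(read_result(DATA, up={"vm03", ARBITER}, blames=blames))          # EIO
# vm04 is back: reads can be served from it while vm03 heals.
print(read_result(DATA, up={"vm03", "vm04", ARBITER}, blames=blames))  # OK
# No pending blame at all: taking vm04 down is harmless.
print(read_result(DATA, up={"vm03", ARBITER}, blames={}))              # OK
```

In this model the outcome of stopping vm04 depends entirely on whether earlier heals onto the other data brick had completed, which would be consistent with the failure appearing only after several hosts were rebooted in quick succession.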
[2016-09-29 11:02:28.133879] W [fuse-bridge.c:767:fuse_attr_cbk] 0-glusterfs-fuse: 8202: FSTAT() /ba2bd397-9222-424d-aecc-eb652c0169d9/images/f02ac1ce-52cd-4b81-8b29-f8006d0469e0/ff4e49c6-3084-4234-80a1-18a67615c527 => -1 (Input/output error)
The message "W [MSGID: 108008] [afr-read-txn.c:244:afr_read_txn] 0-data_ssd-replicate-1: Unreadable subvolume -1 found with event generation 6 for gfid bf5922b7-19f3-4ce3-98df-71e981ecca8d. (Possible split-brain)" repeated 11 times between [2016-09-29 11:02:28.126580] and [2016-09-29 11:02:28.517744]
[2016-09-29 11:02:28.518607] E [MSGID: 108008] [afr-read-txn.c:89:afr_read_txn_refresh_done] 0-data_ssd-replicate-1: Failing STAT on gfid bf5922b7-19f3-4ce3-98df-71e981ecca8d: split-brain observed. [Input/output error]

Now, how is it possible to have a split brain if I stopped just ONE server which had just ONE of six bricks, and it was cleanly shut down with maintenance mode from ovirt?

I created the volume originally this way:
# gluster volume create data_ssd replica 3 arbiter 1 vm01.storage.billy:/gluster/ssd/data/brick vm02.storage.billy:/gluster/ssd/data/brick vm03.storage.billy:/gluster/ssd/data/arbiter_brick vm03.storage.billy:/gluster/ssd/data/brick vm04.storage.billy:/gluster/ssd/data/brick vm02.storage.billy:/gluster/ssd/data/arbiter_brick
# gluster volume set data_ssd group virt
# gluster volume set data_ssd storage.owner-uid 36 && gluster volume set data_ssd storage.owner-gid 36
# gluster volume start data_ssd
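
To see which bricks sit behind the `data_ssd-replicate-1` subvolume named in the logs, the brick order of the create command can be grouped as in the following Python sketch. It assumes the documented behaviour that bricks are grouped into replica sets in the order listed, with the last brick of each set acting as arbiter, and that subvolumes are numbered from 0. The helper name `replica_sets` is made up for illustration.

```python
def replica_sets(volume, bricks, replica=3, arbiter=1):
    """Group bricks into replica sets in command-line order."""
    if len(bricks) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    sets = {}
    for start in range(0, len(bricks), replica):
        group = bricks[start:start + replica]
        name = "%s-replicate-%d" % (volume, start // replica)
        sets[name] = {
            "data": group[:replica - arbiter],     # bricks holding file data
            "arbiter": group[replica - arbiter:],  # metadata-only brick(s)
        }
    return sets


# Brick list in the same order as the create command above.
BRICKS = [
    "vm01.storage.billy:/gluster/ssd/data/brick",
    "vm02.storage.billy:/gluster/ssd/data/brick",
    "vm03.storage.billy:/gluster/ssd/data/arbiter_brick",
    "vm03.storage.billy:/gluster/ssd/data/brick",
    "vm04.storage.billy:/gluster/ssd/data/brick",
    "vm02.storage.billy:/gluster/ssd/data/arbiter_brick",
]

for name, members in replica_sets("data_ssd", BRICKS).items():
    print(name, members)
```

Under these assumptions, `data_ssd-replicate-1` consists of the vm03 and vm04 data bricks plus the arbiter on vm02, so stopping vm04 leaves that set with a single data copy and the arbiter.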
--
Davide Ferrari
Senior Systems Engineer
_______________________________________________
Users mailing list
Users@xxxxxxxxx
http://lists.ovirt.org/mailman/listinfo/users