Back to replica 3 w/o arbiter. Two fio jobs are running (direct=1 and
direct=0); after rebooting one node, the VM dmesg looks like this:

[  483.862664] blk_update_request: I/O error, dev vda, sector 23125016
[  483.898034] blk_update_request: I/O error, dev vda, sector 2161832
[  483.901103] blk_update_request: I/O error, dev vda, sector 2161832
[  483.904045] Aborting journal on device vda1-8.
[  483.906959] blk_update_request: I/O error, dev vda, sector 2099200
[  483.908306] blk_update_request: I/O error, dev vda, sector 2099200
[  483.909585] Buffer I/O error on dev vda1, logical block 262144, lost sync page write
[  483.911121] blk_update_request: I/O error, dev vda, sector 2048
[  483.912192] blk_update_request: I/O error, dev vda, sector 2048
[  483.913221] Buffer I/O error on dev vda1, logical block 0, lost sync page write
[  483.914546] EXT4-fs error (device vda1): ext4_journal_check_start:56: Detected aborted journal
[  483.916230] EXT4-fs (vda1): Remounting filesystem read-only
[  483.917231] EXT4-fs (vda1): previous I/O error to superblock detected
[  483.917353] JBD2: Error -5 detected when updating journal superblock for vda1-8.
[  483.921106] blk_update_request: I/O error, dev vda, sector 2048
[  483.922147] blk_update_request: I/O error, dev vda, sector 2048
[  483.923107] Buffer I/O error on dev vda1, logical block 0, lost sync page write

The root fs is read-only even with a 1s ping-timeout... I really hope I
have been an idiot for almost a year now and that someone will show me what
I am doing completely wrong, because I dream of joining the hordes of
fellow colleagues who store multiple VMs on Gluster and have never had a
problem with it. I also suspect the CentOS libvirt version could be the
cause.

-ps

On Fri, Sep 8, 2017 at 10:50 AM, Pavel Szalbot <pavel.szalbot@xxxxxxxxx> wrote:
> FYI I set up replica 3 (no arbiter this time) and did the same thing -
> rebooted one node during heavy file I/O on the VM, and I/O stopped.
>
> As I mentioned either here or in another thread, this behavior is
> caused by the high default of network.ping-timeout. My main problem used
> to be that setting it to low values like 3s or even 2s did not prevent
> the FS from being mounted read-only in the past (at least with arbiter),
> and the docs describe a reconnect as very costly. If I set ping-timeout
> to 1s, the disaster of a read-only mount is now prevented.
>
> However, I find it very strange, because in the past I actually did end
> up with a read-only filesystem despite the low ping-timeout.
>
> With replica 3, after a node reboot iftop shows data flowing to only
> one of the remaining two nodes, and there is no entry in heal info for
> the volume. An explanation would be very much appreciated ;-)
>
> A few minutes later I reverted back to replica 3 with arbiter (group
> virt, ping-timeout 1). All nodes are up. During the first fio run, the
> VM disconnected my ssh session, so I reconnected and saw ext4 problems
> in dmesg. I deleted the VM and started a new one. Glustershd.log fills
> with metadata heals shortly after the fio job starts, but this time the
> system is stable.
> Rebooting one of the nodes does not cause any problems (watching the
> heal log and I/O on the VM).
>
> So I decided to put more stress on the VM's disk - I added a second job
> with direct=1 and started it (now both are running) while one gluster
> node was still booting. What happened? One fio job reports "Bus error"
> and the VM segfaults when trying to run dmesg...
>
> Is this gfapi related? Is this a bug in arbiter?

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users
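[For anyone trying to reproduce the workload from this thread: the two fio
jobs can be approximated with a job file like the one below. This is only a
sketch - the direct=0 and direct=1 settings come from the thread, while the
filenames, block size, I/O pattern, runtime and queue depth are assumptions.]

```ini
; Sketch of the two concurrent fio jobs from the thread (run inside the VM
; with "fio thisfile.fio"). Only direct=0/1 is from the thread; all other
; parameters are guesses.
[global]
size=1G
rw=randwrite
bs=4k
runtime=120
time_based=1
ioengine=libaio
iodepth=16

[buffered-writer]
filename=/var/tmp/fio-buffered.dat
direct=0

[direct-writer]
filename=/var/tmp/fio-direct.dat
direct=1
```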