File Corruption when adding bricks to live replica volumes

gluster 3.7.6

I seem to be able to reliably reproduce this. I have a replica 2 volume with one test VM image. While the VM is running with heavy disk reads/writes (a disk benchmark) I add a 3rd brick to make it replica 3:

gluster volume add-brick datastore1 replica 3  vng.proxmox.softlog:/vmdata/datastore1
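
For context, the volume is a plain replica 2 with sharding enabled, created roughly along these lines (shard block size and other tuning options omitted):

gluster volume create datastore1 replica 2 vna.proxmox.softlog:/vmdata/datastore1 vnb.proxmox.softlog:/vmdata/datastore1
gluster volume set datastore1 features.shard on
gluster volume start datastore1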

I pretty much immediately get this after the add-brick:

gluster volume heal datastore1 info
Brick vna.proxmox.softlog:/vmdata/datastore1
/.shard/d6aad699-d71d-4b35-b021-d35e5ff297c4.20
/.shard/d6aad699-d71d-4b35-b021-d35e5ff297c4.22
/.shard/d6aad699-d71d-4b35-b021-d35e5ff297c4.55 - Possibly undergoing heal

/images/301/vm-301-disk-1.qcow2 - Possibly undergoing heal

Number of entries: 4

Brick vnb.proxmox.softlog:/vmdata/datastore1
/images/301/vm-301-disk-1.qcow2 - Possibly undergoing heal

/.shard/d6aad699-d71d-4b35-b021-d35e5ff297c4.55 - Possibly undergoing heal

/.shard/d6aad699-d71d-4b35-b021-d35e5ff297c4.20
/.shard/d6aad699-d71d-4b35-b021-d35e5ff297c4.22
Number of entries: 4

Brick vng.proxmox.softlog:/vmdata/datastore1
/.shard/d6aad699-d71d-4b35-b021-d35e5ff297c4.16
/.shard/d6aad699-d71d-4b35-b021-d35e5ff297c4.28
/.shard/d6aad699-d71d-4b35-b021-d35e5ff297c4.1
/.shard/d6aad699-d71d-4b35-b021-d35e5ff297c4.22
/.shard/d6aad699-d71d-4b35-b021-d35e5ff297c4.77
/.shard/d6aad699-d71d-4b35-b021-d35e5ff297c4.9
/.shard/d6aad699-d71d-4b35-b021-d35e5ff297c4.5
/.shard/d6aad699-d71d-4b35-b021-d35e5ff297c4.2
/.shard/d6aad699-d71d-4b35-b021-d35e5ff297c4.26
/.shard/d6aad699-d71d-4b35-b021-d35e5ff297c4.15
/.shard/d6aad699-d71d-4b35-b021-d35e5ff297c4.13
/.shard/d6aad699-d71d-4b35-b021-d35e5ff297c4.3
/.shard/d6aad699-d71d-4b35-b021-d35e5ff297c4.18
Number of entries: 13

The brick on vng is the new, empty brick, yet heal info shows 13 of its shards being healed back to vna & vnb. That can't be right, and if I leave the brick in place the VM becomes hopelessly corrupted. Also, the file has 81 shards in total; they should all be queued for healing to vng.
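
The total shard count can be checked directly on one of the original bricks (vna or vnb) with something like this, using the GFID from the heal output above:

ls /vmdata/datastore1/.shard | grep d6aad699-d71d-4b35-b021-d35e5ff297c4 | wc -l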

Additionally, I get read errors when I run qemu-img check on the VM image. If I remove the vng brick the problems are resolved.
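
For reference, the check and the brick removal were done with commands along these lines (the client mount path here is just illustrative):

qemu-img check /mnt/datastore1/images/301/vm-301-disk-1.qcow2
gluster volume remove-brick datastore1 replica 2 vng.proxmox.softlog:/vmdata/datastore1 force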


If I do the same process while the VM is not running - i.e. no files are being accessed - everything proceeds as expected: all shards on vna & vnb are healed to vng.
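
In that case the heal can simply be watched until the entry counts on all bricks drop to zero, e.g. with something like:

watch -n 5 'gluster volume heal datastore1 info | grep "Number of entries"'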

-- 
Lindsay Mathieson
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
