Re: [Possibile SPAM] Re: Problem with Gluster 3.12.4, VM and sharding

An update:

For my tests, I tried creating the VM disk image with:

qemu-img create -f qcow2 -o preallocation=full gluster://gluster1/Test/Test-vda.img 20G

et voilà!

No errors at all, neither in the bricks' log files (the "link failed" message disappeared) nor in the VM (no corruption, and the installation completed successfully).

I'll do another test with a fully preallocated raw image.



On 16/01/2018 16:31, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:

I've just done all the steps to reproduce the problem.

The VM disk image was created via "qemu-img create -f qcow2 Test-vda2.qcow2 20G" on the gluster volume mounted via FUSE. I also tried creating the image with preallocated metadata, which only delays the problem (it shows up later). The volume is a replica 3 arbiter 1 volume hosted on XFS bricks.

Here is the information:

[root@ovh-ov1 bricks]# gluster volume info gv2a2
 
Volume Name: gv2a2
Type: Replicate
Volume ID: 83c84774-2068-4bfc-b0b9-3e6b93705b9f
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: gluster1:/bricks/brick2/gv2a2
Brick2: gluster3:/bricks/brick3/gv2a2
Brick3: gluster2:/bricks/arbiter_brick_gv2a2/gv2a2 (arbiter)
Options Reconfigured:
storage.owner-gid: 107
storage.owner-uid: 107
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: enable
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
nfs.disable: off
performance.client-io-threads: off
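
A small sketch for checking the two settings relevant to isolating the problem (sharding and arbiter) from saved "gluster volume info" output. The function names parse_volume_info and describe are hypothetical helpers written for this note, not part of any Gluster tooling, and the parser assumes the plain "key: value" layout shown above.

```python
def parse_volume_info(text):
    """Parse `gluster volume info` text into a flat dict of key -> value."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so brick values such as
        # "gluster1:/bricks/brick2/gv2a2" stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def describe(info):
    """Report whether sharding is on and whether any brick is an arbiter."""
    bricks = [v for k, v in info.items()
              if k.startswith("Brick") and k[5:].isdigit()]
    return {
        "sharding": info.get("features.shard") == "on",
        "arbiter": any(b.endswith("(arbiter)") for b in bricks),
        "brick_count": len(bricks),
    }


# Excerpt of the output quoted above.
SAMPLE = """\
Volume Name: gv2a2
Type: Replicate
Number of Bricks: 1 x (2 + 1) = 3
Bricks:
Brick1: gluster1:/bricks/brick2/gv2a2
Brick2: gluster3:/bricks/brick3/gv2a2
Brick3: gluster2:/bricks/arbiter_brick_gv2a2/gv2a2 (arbiter)
Options Reconfigured:
features.shard: on
"""

if __name__ == "__main__":
    print(describe(parse_volume_info(SAMPLE)))
```

Running the same check against the output from each test volume makes it easy to confirm which combination (shard on/off, arbiter or plain replica 3) a given test actually used.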

/var/log/glusterfs/glusterd.log:

[2018-01-15 14:17:50.196228] I [MSGID: 106488] [glusterd-handler.c:1548:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
[2018-01-15 14:25:09.555214] I [MSGID: 106488] [glusterd-handler.c:1548:__glusterd_handle_cli_get_volume] 0-management: Received get vol req

(no entries for today, 2018-01-16)

/var/log/glusterfs/glustershd.log:

[2018-01-14 02:23:02.731245] I [glusterfsd-mgmt.c:1821:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing

(empty too)

/var/log/glusterfs/bricks/brick-brick2-gv2a2.log (the affected volume):

[2018-01-16 15:14:37.809965] I [MSGID: 115029] [server-handshake.c:793:server_setvolume] 0-gv2a2-server: accepted client from ovh-ov1-10302-2018/01/16-15:14:37:790306-gv2a2-client-0-0-0 (version: 3.12.4)
[2018-01-16 15:16:41.471751] E [MSGID: 113020] [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.4 failed
[2018-01-16 15:16:41.471745] W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.4 ->  /bricks/brick2/gv2a2/.glusterfs/a0/14/a0144df3-8d89-4aed-872e-5fef141e9e1efailed  [File exists]
[2018-01-16 15:16:42.593392] W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.5 -> /bricks/brick2/gv2a2/.glusterfs/eb/04/eb044e6e-3a23-40a4-9ce1-f13af148eb67failed  [File exists]
[2018-01-16 15:16:42.593426] E [MSGID: 113020] [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.5 failed
[2018-01-16 15:17:04.129593] W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.8 -> /bricks/brick2/gv2a2/.glusterfs/dc/92/dc92bd0a-0d46-4826-a4c9-d073a924dd8dfailed  [File exists]
The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.8 -> /bricks/brick2/gv2a2/.glusterfs/dc/92/dc92bd0a-0d46-4826-a4c9-d073a924dd8dfailed  [File exists]" repeated 5 times between [2018-01-16 15:17:04.129593] and [2018-01-16 15:17:04.129593]
[2018-01-16 15:17:04.129661] E [MSGID: 113020] [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.8 failed
[2018-01-16 15:17:08.279162] W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.9 -> /bricks/brick2/gv2a2/.glusterfs/c9/b7/c9b71b00-a09f-4df1-b874-041820ca8241failed  [File exists]
[2018-01-16 15:17:08.279162] W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.9 -> /bricks/brick2/gv2a2/.glusterfs/c9/b7/c9b71b00-a09f-4df1-b874-041820ca8241failed  [File exists]
The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.9 -> /bricks/brick2/gv2a2/.glusterfs/c9/b7/c9b71b00-a09f-4df1-b874-041820ca8241failed  [File exists]" repeated 2 times between [2018-01-16 15:17:08.279162] and [2018-01-16 15:17:08.279162]

[2018-01-16 15:17:08.279177] E [MSGID: 113020] [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.9 failed
The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.4 -> /bricks/brick2/gv2a2/.glusterfs/a0/14/a0144df3-8d89-4aed-872e-5fef141e9e1efailed  [File exists]" repeated 6 times between [2018-01-16 15:16:41.471745] and [2018-01-16 15:16:41.471807]
The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.5 -> /bricks/brick2/gv2a2/.glusterfs/eb/04/eb044e6e-3a23-40a4-9ce1-f13af148eb67failed  [File exists]" repeated 2 times between [2018-01-16 15:16:42.593392] and [2018-01-16 15:16:42.593430]
[2018-01-16 15:17:32.229689] W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.14 -> /bricks/brick2/gv2a2/.glusterfs/53/04/530449fa-d698-4928-a262-9a0234232323failed  [File exists]
[2018-01-16 15:17:32.229720] E [MSGID: 113020] [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.14 failed
[2018-01-16 15:18:07.154330] W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.17 -> /bricks/brick2/gv2a2/.glusterfs/81/96/8196dd19-84bc-4c3d-909f-8792e9b4929dfailed  [File exists]
[2018-01-16 15:18:07.154375] E [MSGID: 113020] [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.17 failed
The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.14 -> /bricks/brick2/gv2a2/.glusterfs/53/04/530449fa-d698-4928-a262-9a0234232323failed  [File exists]" repeated 7 times between [2018-01-16 15:17:32.229689] and [2018-01-16 15:17:32.229806]
The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.17 -> /bricks/brick2/gv2a2/.glusterfs/81/96/8196dd19-84bc-4c3d-909f-8792e9b4929dfailed  [File exists]" repeated 3 times between [2018-01-16 15:18:07.154330] and [2018-01-16 15:18:07.154357]
[2018-01-16 15:19:23.618794] W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.21 -> /bricks/brick2/gv2a2/.glusterfs/6d/02/6d02bd98-83de-43e8-a7af-b1d5f5160403failed  [File exists]
[2018-01-16 15:19:23.618827] E [MSGID: 113020] [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.21 failed
The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.21 -> /bricks/brick2/gv2a2/.glusterfs/6d/02/6d02bd98-83de-43e8-a7af-b1d5f5160403failed  [File exists]" repeated 3 times between [2018-01-16 15:19:23.618794] and [2018-01-16 15:19:23.618794]
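
To see at a glance which shards are affected, the brick log can be summarized with a short script. This is only a sketch: the regular expressions are written against the line format shown above (MSGID 113096 for the failed hard link, MSGID 113020 for the failed gfid set) and may need adjusting for other Gluster versions. The function name summarize is hypothetical, and the "The message ... repeated N times" summary lines are deliberately skipped rather than added to the counts.

```python
import re
from collections import defaultdict

# W lines: "link <shard> -> <gfid path>failed  [File exists]"
LINK_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] W \[MSGID: 113096\].*?"
    r"link (?P<shard>\S+) ->\s+(?P<gfid_path>\S+?)failed\s+\[File exists\]"
)
# E lines: "setting gfid on <shard> failed"
MKNOD_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] E \[MSGID: 113020\].*?"
    r"setting gfid on (?P<shard>\S+) failed"
)


def summarize(lines):
    """Count link/mknod failures per shard path; keep the gfid seen for it."""
    summary = defaultdict(
        lambda: {"link_exists": 0, "mknod_failed": 0, "gfid": None}
    )
    for line in lines:
        m = LINK_RE.match(line)
        if m:
            entry = summary[m.group("shard")]
            entry["link_exists"] += 1
            entry["gfid"] = m.group("gfid_path").rsplit("/", 1)[-1]
            continue
        m = MKNOD_RE.match(line)
        if m:
            summary[m.group("shard")]["mknod_failed"] += 1
    return dict(summary)


# Three lines taken from the log above (wrapped here for readability).
SAMPLE = [
    "[2018-01-16 15:16:41.471751] E [MSGID: 113020] "
    "[posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on "
    "/bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.4 failed",
    "[2018-01-16 15:16:41.471745] W [MSGID: 113096] "
    "[posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link "
    "/bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.4 ->  "
    "/bricks/brick2/gv2a2/.glusterfs/a0/14/"
    "a0144df3-8d89-4aed-872e-5fef141e9e1efailed  [File exists]",
    'The message "W [MSGID: 113096] ..." repeated 6 times between '
    "[2018-01-16 15:16:41.471745] and [2018-01-16 15:16:41.471807]",
]

if __name__ == "__main__":
    for shard, entry in summarize(SAMPLE).items():
        print(shard, entry)
```

To run it on the real log, pass the open file instead of SAMPLE, for example summarize(open(path)) with path pointing at the brick log.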

Thank you,


On 16/01/2018 11:40, Krutika Dhananjay wrote:
Also to help isolate the component, could you answer these:

1. On a different volume with shard not enabled, do you see this issue?
2. On a plain 3-way replicated volume (no arbiter), do you see this issue?



On Tue, Jan 16, 2018 at 4:03 PM, Krutika Dhananjay <kdhananj@xxxxxxxxxx> wrote:
Please share the volume-info output and the logs under /var/log/glusterfs/ from all your nodes so we can investigate the issue.

-Krutika

On Tue, Jan 16, 2018 at 1:30 PM, Ing. Luca Lazzeroni - Trend Servizi Srl <luca@xxxxxxxx> wrote:
Hi everyone.

I've got a strange problem with a gluster setup: 3 nodes with CentOS 7.4, Gluster 3.12.4 from the CentOS/Gluster repositories, QEMU-KVM version 2.9.0 (compiled from RHEL sources).

I'm running volumes in replica 3 arbiter 1 mode (but I've got a volume in "pure" replica 3 mode too). I've applied the "virt" group settings to my volumes since they host VM images.

If I try to install something (e.g. Ubuntu Server 16.04.3) on a VM (and so generate a bit of I/O inside it) with KVM configured to access the gluster volume directly (via libvirt), the install fails after a while because the disk content is corrupted. If I inspect the blocks inside the disk (by accessing the image directly from outside), I find many files filled with "^@".

Also, what exactly do you mean by accessing the image directly from outside? Was it from the brick directories directly? Was it from the mount point of the volume? Could you elaborate? Which files exactly did you check?

-Krutika
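
For anyone wanting to quantify the "^@" symptom: "^@" is how many tools display a NUL byte (0x00), so a file "filled with ^@" is a file whose content reads back as zeros. A minimal sketch for measuring this on a file copied out of the guest filesystem follows. The helper name nul_fraction is hypothetical, and running it on the disk image itself would not be meaningful, since unallocated regions of an image legitimately read as zeros.

```python
import os
import tempfile


def nul_fraction(path, chunk_size=1 << 20):
    """Return the fraction of bytes in `path` that are NUL (0x00)."""
    total = 0
    nul = 0
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)  # read in 1 MiB pieces by default
            if not chunk:
                break
            total += len(chunk)
            nul += chunk.count(b"\x00")
    # An empty file has no NUL bytes; avoid dividing by zero.
    return nul / total if total else 0.0


if __name__ == "__main__":
    # Self-contained demo on a temporary file made entirely of NUL bytes.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\x00" * 4096)
        name = tmp.name
    try:
        print(nul_fraction(name))
    finally:
        os.remove(name)
```

A value close to 1.0 for a file that should contain text or binaries (a package, a config file) matches the corruption described above.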


If, instead, I configure KVM to access VM images via a FUSE mount, everything seems to work correctly.

Note that the install problem occurs 100% of the time with QCOW2 images, while with RAW disk images it appears only later.

Has anyone experienced the same problem?

Thank you,


--
Ing. Luca Lazzeroni
Responsabile Ricerca e Sviluppo
Trend Servizi Srl
Tel: 0376/631761
Web: https://www.trendservizi.it

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users


