Re: [Possibile SPAM] Re: Problem with Gluster 3.12.4, VM and sharding

Thanks for that input. Adding Niels since the issue is reproducible only with libgfapi.

-Krutika

On Thu, Jan 18, 2018 at 1:39 PM, Ing. Luca Lazzeroni - Trend Servizi Srl <luca@xxxxxxxx> wrote:

Another update.

I've set up a replica 3 volume without sharding and tried to install a VM on a qcow2 image on that volume; however, the result is the same and the VM image was corrupted at exactly the same point.

Here's the volume info of the created volume:

Volume Name: gvtest
Type: Replicate
Volume ID: e2ddf694-ba46-4bc7-bc9c-e30803374e9d
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: gluster1:/bricks/brick1/gvtest
Brick2: gluster2:/bricks/brick1/gvtest
Brick3: gluster3:/bricks/brick1/gvtest
Options Reconfigured:
user.cifs: off
features.shard: off
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: enable
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off


On 17/01/2018 14:51, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:

Hi,

After our IRC chat I rebuilt a virtual machine with a FUSE-based virtual disk. Everything worked flawlessly.

Now I'm sending you the output of the requested getfattr command on the disk image:

# file: TestFUSE-vda.qcow2
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0x40ffafbbe987445692bb31295fa40105
trusted.gfid2path.dc9dde61f0b77eab=0x31326533323631662d373839332d346262302d383738632d3966623765306232336263652f54657374465553452d7664612e71636f7732
trusted.glusterfs.shard.block-size=0x0000000004000000
trusted.glusterfs.shard.file-size=0x00000000c15300000000000000000000000000000060be900000000000000000
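
For readers following along, two of the hex values above can be decoded by hand. The sketch below is not part of the original exchange, and the helper names are made up for illustration. It assumes that the block-size xattr is a big-endian integer and that the gfid2path value is ASCII text of the form <parent-gfid>/<basename>.

```python
# Illustrative helpers (hypothetical names) for reading `getfattr -e hex` output.

def _strip_0x(hex_value: str) -> str:
    """Drop the leading 0x that getfattr prints in hex mode, if present."""
    return hex_value[2:] if hex_value.startswith("0x") else hex_value

def xattr_to_int(hex_value: str) -> int:
    """Interpret an xattr value as a big-endian unsigned integer (assumption)."""
    return int(_strip_0x(hex_value), 16)

def xattr_to_text(hex_value: str) -> str:
    """Interpret an xattr value as ASCII text (assumption)."""
    return bytes.fromhex(_strip_0x(hex_value)).decode("ascii")

# trusted.glusterfs.shard.block-size from the output above
block_size = xattr_to_int("0x0000000004000000")
print(block_size // (1024 * 1024), "MiB")

# trusted.gfid2path.* from the output above
gfid2path = xattr_to_text(
    "0x31326533323631662d373839332d346262302d383738632d"
    "3966623765306232336263652f54657374465553452d7664612e71636f7732"
)
print(gfid2path)
```

With the values above this reports a 64 MiB shard block size and the path 12e3261f-7893-4bb0-878c-9fb7e0b23bce/TestFUSE-vda.qcow2, that is, the GFID of the image's parent directory followed by the file name.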

Hope this helps.



On 17/01/2018 11:37, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:

I actually use FUSE and it works. If I try to use the "libgfapi" direct interface to Gluster in qemu-kvm, the problem appears.



On 17/01/2018 11:35, Krutika Dhananjay wrote:
Really? Then which protocol exactly do you see this issue with? libgfapi? NFS?

-Krutika

On Wed, Jan 17, 2018 at 3:59 PM, Ing. Luca Lazzeroni - Trend Servizi Srl <luca@xxxxxxxx> wrote:

Of course. Here's the full log. Please note that in FUSE mode everything appears to work without problems. I've installed 4 VMs and updated them without problems.



On 17/01/2018 11:00, Krutika Dhananjay wrote:


On Tue, Jan 16, 2018 at 10:47 PM, Ing. Luca Lazzeroni - Trend Servizi Srl <luca@xxxxxxxx> wrote:

I ran the test with a raw image (preallocated too) and the corruption problem is still there (but without errors in the bricks' log files).

What does the "link" error in the brick log files mean?

I looked through the source code for the lines where it is logged, and it appears to be a warning (it doesn't imply a failure).


Indeed, it only represents a transient state when the shards are created for the first time and does not indicate a failure.
Could you also get the logs of the gluster fuse mount process? It should be under /var/log/glusterfs of your client machine with the filename as a hyphenated mount point path.

For example, if your volume was mounted at /mnt/glusterfs, then your log file would be named mnt-glusterfs.log.
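
The naming rule described above can be sketched as a small helper. This is only an illustration of the rule as stated in this message (slashes in the mount point become hyphens, with ".log" appended); the function name is hypothetical.

```python
def fuse_mount_log_name(mount_point: str) -> str:
    """Return the client log file name expected for a FUSE mount point.

    Follows the rule described above: drop the leading (and any trailing)
    slash, replace the remaining slashes with hyphens, append ".log".
    """
    return mount_point.strip("/").replace("/", "-") + ".log"

print(fuse_mount_log_name("/mnt/glusterfs"))  # mnt-glusterfs.log
```

The resulting file would then be looked up under /var/log/glusterfs on the client machine.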

-Krutika



On 16/01/2018 17:39, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:

An update:

I've tried, for my tests, to create the vm volume as

qemu-img create -f qcow2 -o preallocation=full gluster://gluster1/Test/Test-vda.img 20G

et voila !

No errors at all, neither in the bricks' log file (the "link failed" message disappeared) nor in the VM (no corruption, and it installed successfully).

I'll do another test with a fully preallocated raw image.



On 16/01/2018 16:31, Ing. Luca Lazzeroni - Trend Servizi Srl wrote:

I've just done all the steps to reproduce the problem.

The VM volume was created via "qemu-img create -f qcow2 Test-vda2.qcow2 20G" on the Gluster volume mounted via FUSE. I've also tried creating the volume with preallocated metadata, which only delays the problem. The volume is a replica 3 arbiter 1 volume hosted on XFS bricks.

Here is the information:

[root@ovh-ov1 bricks]# gluster volume info gv2a2
 
Volume Name: gv2a2
Type: Replicate
Volume ID: 83c84774-2068-4bfc-b0b9-3e6b93705b9f
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: gluster1:/bricks/brick2/gv2a2
Brick2: gluster3:/bricks/brick3/gv2a2
Brick3: gluster2:/bricks/arbiter_brick_gv2a2/gv2a2 (arbiter)
Options Reconfigured:
storage.owner-gid: 107
storage.owner-uid: 107
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: enable
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
nfs.disable: off
performance.client-io-threads: off
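
When comparing this volume with the gvtest volume from the later message, it can help to diff the "Options Reconfigured" sections mechanically. The following is a minimal sketch, assuming the plain-text layout of "gluster volume info" shown in this thread; the function names are hypothetical and the two strings are short excerpts rather than the full listings.

```python
def parse_reconfigured_options(volume_info: str) -> dict:
    """Return the key/value pairs listed after 'Options Reconfigured:'."""
    options = {}
    in_options = False
    for raw in volume_info.splitlines():
        line = raw.strip()
        if line == "Options Reconfigured:":
            in_options = True
        elif in_options and ": " in line:
            key, value = line.split(": ", 1)
            options[key] = value
    return options

def diff_options(a: dict, b: dict) -> dict:
    """Map each option whose value differs to its (a, b) pair; None means unset."""
    keys = sorted(a.keys() | b.keys())
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}

# Excerpts only; paste the full `gluster volume info` output in practice.
gv2a2 = """Volume Name: gv2a2
Options Reconfigured:
storage.owner-gid: 107
features.shard: on
cluster.quorum-type: auto
nfs.disable: off
"""
gvtest = """Volume Name: gvtest
Options Reconfigured:
features.shard: off
cluster.quorum-type: auto
nfs.disable: on
"""

for option, (left, right) in diff_options(
    parse_reconfigured_options(gv2a2), parse_reconfigured_options(gvtest)
).items():
    print(f"{option}: gv2a2={left} gvtest={right}")
```

For the two full listings in this thread, the only differing options are features.shard, nfs.disable, and the two storage.owner-* settings.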

/var/log/glusterfs/glusterd.log:

[2018-01-15 14:17:50.196228] I [MSGID: 106488] [glusterd-handler.c:1548:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
[2018-01-15 14:25:09.555214] I [MSGID: 106488] [glusterd-handler.c:1548:__glusterd_handle_cli_get_volume] 0-management: Received get vol req

(empty because today it's 2018-01-16)

/var/log/glusterfs/glustershd.log:

[2018-01-14 02:23:02.731245] I [glusterfsd-mgmt.c:1821:mgmt_getspec_cbk] 0-glusterfs: No change in volfile,continuing

(empty too)

/var/log/glusterfs/bricks/brick-brick2-gv2a2.log (the affected volume):

[2018-01-16 15:14:37.809965] I [MSGID: 115029] [server-handshake.c:793:server_setvolume] 0-gv2a2-server: accepted client from ovh-ov1-10302-2018/01/16-15:14:37:790306-gv2a2-client-0-0-0 (version: 3.12.4)
[2018-01-16 15:16:41.471751] E [MSGID: 113020] [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.4 failed
[2018-01-16 15:16:41.471745] W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.4 ->  /bricks/brick2/gv2a2/.glusterfs/a0/14/a0144df3-8d89-4aed-872e-5fef141e9e1efailed  [File exists]
[2018-01-16 15:16:42.593392] W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.5 -> /bricks/brick2/gv2a2/.glusterfs/eb/04/eb044e6e-3a23-40a4-9ce1-f13af148eb67failed  [File exists]
[2018-01-16 15:16:42.593426] E [MSGID: 113020] [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.5 failed
[2018-01-16 15:17:04.129593] W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.8 -> /bricks/brick2/gv2a2/.glusterfs/dc/92/dc92bd0a-0d46-4826-a4c9-d073a924dd8dfailed  [File exists]
The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.8 -> /bricks/brick2/gv2a2/.glusterfs/dc/92/dc92bd0a-0d46-4826-a4c9-d073a924dd8dfailed  [File exists]" repeated 5 times between [2018-01-16 15:17:04.129593] and [2018-01-16 15:17:04.129593]
[2018-01-16 15:17:04.129661] E [MSGID: 113020] [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.8 failed
[2018-01-16 15:17:08.279162] W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.9 -> /bricks/brick2/gv2a2/.glusterfs/c9/b7/c9b71b00-a09f-4df1-b874-041820ca8241failed  [File exists]
[2018-01-16 15:17:08.279162] W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.9 -> /bricks/brick2/gv2a2/.glusterfs/c9/b7/c9b71b00-a09f-4df1-b874-041820ca8241failed  [File exists]
The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.9 -> /bricks/brick2/gv2a2/.glusterfs/c9/b7/c9b71b00-a09f-4df1-b874-041820ca8241failed  [File exists]" repeated 2 times between [2018-01-16 15:17:08.279162] and [2018-01-16 15:17:08.279162]

[2018-01-16 15:17:08.279177] E [MSGID: 113020] [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.9 failed
The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.4 -> /bricks/brick2/gv2a2/.glusterfs/a0/14/a0144df3-8d89-4aed-872e-5fef141e9e1efailed  [File exists]" repeated 6 times between [2018-01-16 15:16:41.471745] and [2018-01-16 15:16:41.471807]
The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.5 -> /bricks/brick2/gv2a2/.glusterfs/eb/04/eb044e6e-3a23-40a4-9ce1-f13af148eb67failed  [File exists]" repeated 2 times between [2018-01-16 15:16:42.593392] and [2018-01-16 15:16:42.593430]
[2018-01-16 15:17:32.229689] W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.14 -> /bricks/brick2/gv2a2/.glusterfs/53/04/530449fa-d698-4928-a262-9a0234232323failed  [File exists]
[2018-01-16 15:17:32.229720] E [MSGID: 113020] [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.14 failed
[2018-01-16 15:18:07.154330] W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.17 -> /bricks/brick2/gv2a2/.glusterfs/81/96/8196dd19-84bc-4c3d-909f-8792e9b4929dfailed  [File exists]
[2018-01-16 15:18:07.154375] E [MSGID: 113020] [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.17 failed
The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.14 -> /bricks/brick2/gv2a2/.glusterfs/53/04/530449fa-d698-4928-a262-9a0234232323failed  [File exists]" repeated 7 times between [2018-01-16 15:17:32.229689] and [2018-01-16 15:17:32.229806]
The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.17 -> /bricks/brick2/gv2a2/.glusterfs/81/96/8196dd19-84bc-4c3d-909f-8792e9b4929dfailed  [File exists]" repeated 3 times between [2018-01-16 15:18:07.154330] and [2018-01-16 15:18:07.154357]
[2018-01-16 15:19:23.618794] W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.21 -> /bricks/brick2/gv2a2/.glusterfs/6d/02/6d02bd98-83de-43e8-a7af-b1d5f5160403failed  [File exists]
[2018-01-16 15:19:23.618827] E [MSGID: 113020] [posix.c:1485:posix_mknod] 0-gv2a2-posix: setting gfid on /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.21 failed
The message "W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] 0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/62335cb9-c7b5-4735-a879-59cff93fe622.21 -> /bricks/brick2/gv2a2/.glusterfs/6d/02/6d02bd98-83de-43e8-a7af-b1d5f5160403failed  [File exists]" repeated 3 times between [2018-01-16 15:19:23.618794] and [2018-01-16 15:19:23.618794]
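
To see at a glance which shards are involved, the log excerpt above can be summarized with a short script. This is a sketch only: it assumes the line layout shown here (timestamp, level, MSGID, then a path under .shard/) and skips the "The message ... repeated N times" summary lines. The function name is made up for illustration.

```python
import re
from collections import defaultdict

# Matches e.g. "[2018-01-16 15:16:41.471751] E [MSGID: 113020] ... /.shard/<gfid>.<index>"
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) \[MSGID: (?P<msgid>\d+)\].*?"
    r"/\.shard/(?P<gfid>[0-9a-f-]{36})\.(?P<index>\d+)"
)

def shards_by_msgid(lines):
    """Map each MSGID to the sorted shard indices it was logged for."""
    seen = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line)
        if match:  # summary lines ("The message ... repeated") do not match
            seen[int(match.group("msgid"))].add(int(match.group("index")))
    return {msgid: sorted(indices) for msgid, indices in seen.items()}

sample = [
    "[2018-01-16 15:16:41.471751] E [MSGID: 113020] [posix.c:1485:posix_mknod] "
    "0-gv2a2-posix: setting gfid on /bricks/brick2/gv2a2/.shard/"
    "62335cb9-c7b5-4735-a879-59cff93fe622.4 failed",
    "[2018-01-16 15:16:42.593392] W [MSGID: 113096] [posix-handle.c:770:posix_handle_hard] "
    "0-gv2a2-posix: link /bricks/brick2/gv2a2/.shard/"
    "62335cb9-c7b5-4735-a879-59cff93fe622.5 -> /bricks/brick2/gv2a2/.glusterfs/"
    "eb/04/eb044e6e-3a23-40a4-9ce1-f13af148eb67failed  [File exists]",
]
print(shards_by_msgid(sample))
```

Run over the whole brick log (for example with open(path) as the lines argument), this gives a quick view of which shard indices produced each message.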

Thank you,


On 16/01/2018 11:40, Krutika Dhananjay wrote:
Also to help isolate the component, could you answer these:

1. on a different volume with shard not enabled, do you see this issue?
2. on a plain 3-way replicated volume (no arbiter), do you see this issue?



On Tue, Jan 16, 2018 at 4:03 PM, Krutika Dhananjay <kdhananj@xxxxxxxxxx> wrote:
Please share the volume-info output and the logs under /var/log/glusterfs/ from all your nodes so that we can investigate the issue.

-Krutika

On Tue, Jan 16, 2018 at 1:30 PM, Ing. Luca Lazzeroni - Trend Servizi Srl <luca@xxxxxxxx> wrote:
Hi to everyone.

I've got a strange problem with a Gluster setup: 3 nodes with CentOS 7.4, Gluster 3.12.4 from the CentOS/Gluster repositories, and QEMU-KVM version 2.9.0 (compiled from RHEL sources).

I'm running volumes in replica 3 arbiter 1 mode (but I've got a volume in "pure" replica 3 mode too). I've applied the "virt" group settings to my volumes since they host VM images.

If I try to install something (e.g. Ubuntu Server 16.04.3) on a VM (and so generate a bit of I/O inside it) and configure KVM to access the Gluster volume directly (via libvirt), the install fails after a while because the disk content is corrupted. If I inspect the blocks inside the disk (by accessing the image directly from outside), I find many files filled with "^@".

Also, what exactly do you mean by accessing the image directly from outside? Was it from the brick directories directly? Was it from the mount point of the volume? Could you elaborate? Which files exactly did you check?

-Krutika


If, instead, I configure KVM to access VM images via a FUSE mount, everything seems to work correctly.

Note that the installation problem occurs 100% of the time with QCOW2 images, while with RAW disk images it appears only later.

Has anyone experienced the same problem?

Thank you,


--
Ing. Luca Lazzeroni
Head of Research and Development
Trend Servizi Srl
Tel: 0376/631761
Web: https://www.trendservizi.it

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users



