Re: Deleting large files on sharded volume hangs and doesn't delete shards

Sorry for the slow response. Your hunch was right. It seems to be a bad interaction between tiering and sharding.

I untiered the volume and the symptom vanished. I then deleted and recreated the volume entirely (without tiering) in order to clean up the orphaned shards.
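For anyone following along, untiering was roughly the standard detach sequence from the gluster 3.x CLI (a sketch, not the exact commands I ran; check that `detach status` shows migration complete before committing):

```shell
# Detach the hot tier: start migration off the hot bricks, watch it, then commit.
gluster volume tier gv0 detach start
gluster volume tier gv0 detach status   # wait until the rebalance reports completed
gluster volume tier gv0 detach commit
```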

-Walter Deignan
-Uline IT, Systems Architect




From:        Nithya Balachandran <nbalacha@xxxxxxxxxx>
To:        Walter Deignan <WDeignan@xxxxxxxxx>
Cc:        gluster-users <gluster-users@xxxxxxxxxxx>
Date:        05/17/2017 10:17 PM
Subject:        Re: Deleting large files on sharded volume hangs and doesn't delete shards




I don't think we have tested shards with a tiered volume.  Do you see such issues on non-tiered sharded volumes?

Regards,
Nithya

On 18 May 2017 at 00:51, Walter Deignan <WDeignan@xxxxxxxxx> wrote:
I have a reproducible issue where attempting to delete a file large enough to have been sharded hangs. I can't kill the 'rm' command and am eventually forced to reboot the client (which in this case is also part of the gluster cluster). After the node finishes rebooting, I can see that while the file's front end is gone, the back-end shards are still present.

Is this a known issue? Any way to get around it?


----------------------------------------------


[root@dc-vihi19 ~]# gluster volume info gv0

Volume Name: gv0
Type: Tier
Volume ID: d42e366f-381d-4787-bcc5-cb6770cb7d58
Status: Started
Snapshot Count: 0
Number of Bricks: 24
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 4 x 2 = 8
Brick1: dc-vihi71:/gluster/bricks/brick4/data
Brick2: dc-vihi19:/gluster/bricks/brick4/data
Brick3: dc-vihi70:/gluster/bricks/brick4/data
Brick4: dc-vihi19:/gluster/bricks/brick3/data
Brick5: dc-vihi71:/gluster/bricks/brick3/data
Brick6: dc-vihi19:/gluster/bricks/brick2/data
Brick7: dc-vihi70:/gluster/bricks/brick3/data
Brick8: dc-vihi19:/gluster/bricks/brick1/data
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 8 x 2 = 16
Brick9: dc-vihi19:/gluster/bricks/brick5/data
Brick10: dc-vihi70:/gluster/bricks/brick1/data
Brick11: dc-vihi19:/gluster/bricks/brick6/data
Brick12: dc-vihi71:/gluster/bricks/brick1/data
Brick13: dc-vihi19:/gluster/bricks/brick7/data
Brick14: dc-vihi70:/gluster/bricks/brick2/data
Brick15: dc-vihi19:/gluster/bricks/brick8/data
Brick16: dc-vihi71:/gluster/bricks/brick2/data
Brick17: dc-vihi19:/gluster/bricks/brick9/data
Brick18: dc-vihi70:/gluster/bricks/brick5/data
Brick19: dc-vihi19:/gluster/bricks/brick10/data
Brick20: dc-vihi71:/gluster/bricks/brick5/data
Brick21: dc-vihi19:/gluster/bricks/brick11/data
Brick22: dc-vihi70:/gluster/bricks/brick6/data
Brick23: dc-vihi19:/gluster/bricks/brick12/data
Brick24: dc-vihi71:/gluster/bricks/brick6/data
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
features.ctr-enabled: on
cluster.tier-mode: cache
features.shard: on
features.shard-block-size: 512MB
network.ping-timeout: 5
cluster.server-quorum-ratio: 51%


[root@dc-vihi19 temp]# ls -lh
total 26G
-rw-rw-rw-. 1 root root 31G May 17 10:38 win7.qcow2
[root@dc-vihi19 temp]# getfattr -n glusterfs.gfid.string win7.qcow2
# file: win7.qcow2
glusterfs.gfid.string="7f4a0fea-72c0-41e4-97a5-6297be0a9142"
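With features.shard-block-size at 512MB, a file this size is stored as a base file plus pieces .shard/<gfid>.1 through .shard/<gfid>.N. A quick back-of-the-envelope check (my arithmetic, not gluster output):

```shell
# Expected highest shard index for a 31G file with 512MB shard blocks.
# The base file holds block 0; .shard/<gfid>.1 .. .shard/<gfid>.N hold the rest.
size_bytes=$((31 * 1024 * 1024 * 1024))    # win7.qcow2 is 31G
block_bytes=$((512 * 1024 * 1024))         # features.shard-block-size: 512MB
last_index=$(( (size_bytes + block_bytes - 1) / block_bytes - 1 ))
echo ".shard pieces: <gfid>.1 through <gfid>.${last_index}"
```

So roughly 61 extra pieces on the back end for this one file, consistent with indexes like .52 turning up in the find output further down.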


[root@dc-vihi19 temp]# rm win7.qcow2
rm: remove regular file ‘win7.qcow2’? y


*Process hangs and can't be killed. A reboot later...*


login as: root
Authenticating with public key "rsa-key-20170510"
Last login: Wed May 17 14:04:29 2017 from ******
[root@dc-vihi19 ~]# find /gluster/bricks -name "7f4a0fea-72c0-41e4-97a5-6297be0a9142*"
/gluster/bricks/brick1/data/.shard/7f4a0fea-72c0-41e4-97a5-6297be0a9142.23
/gluster/bricks/brick1/data/.shard/7f4a0fea-72c0-41e4-97a5-6297be0a9142.35
/gluster/bricks/brick2/data/.shard/7f4a0fea-72c0-41e4-97a5-6297be0a9142.52
/gluster/bricks/brick2/data/.shard/7f4a0fea-72c0-41e4-97a5-6297be0a9142.29
/gluster/bricks/brick2/data/.shard/7f4a0fea-72c0-41e4-97a5-6297be0a9142.22
/gluster/bricks/brick2/data/.shard/7f4a0fea-72c0-41e4-97a5-6297be0a9142.24

and so on...
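For anyone who wants to enumerate the leftovers per node before deciding how to clean them, a sketch (`list_orphan_shards` is my own illustrative helper, not a gluster command; only ever delete after confirming the front-end file is gone from every client):

```shell
# List leftover shard pieces for one gfid under a bricks root.
# Run on each server; append -delete to the find only once the
# dry-run listing looks right.
list_orphan_shards() {
    local bricks_root="$1" gfid="$2"
    find "$bricks_root" -path "*/.shard/${gfid}.*" -print
}
# Example (dry run):
# list_orphan_shards /gluster/bricks 7f4a0fea-72c0-41e4-97a5-6297be0a9142
```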



-Walter Deignan
-Uline IT, Systems Architect

_______________________________________________
Gluster-users mailing list

Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users


