Someone on the #gluster-users IRC channel said the following: "Decreasing features.locks-revocation-max-blocked to an absurdly low number is letting our distributed-disperse set heal again."

Is this something to consider? Does anyone else have experience with tweaking this to speed up healing?

Sent from my iPhone

> On 9 Nov 2017, at 18:00, Serkan Çoban <cobanserkan@xxxxxxxxx> wrote:
>
> Hi,
>
> You can set disperse.shd-max-threads to 2 or 4 in order to make heal
> faster. This makes my heal times 2-3x faster.
> Also, you can play with disperse.self-heal-window-size to read more
> bytes at a time, but I did not test it.
>
>> On Thu, Nov 9, 2017 at 4:47 PM, Xavi Hernandez <jahernan@xxxxxxxxxx> wrote:
>> Hi Rolf,
>>
>> answers follow inline...
>>
>>> On Thu, Nov 9, 2017 at 3:20 PM, Rolf Larsen <rolf@xxxxxxxx> wrote:
>>>
>>> Hi,
>>>
>>> We ran a test on GlusterFS 3.12.1 with erasure-coded 8+2 volumes on 10
>>> bricks (default config, tested with 100 GB, 200 GB and 400 GB brick
>>> sizes, 10 Gbit NICs).
>>>
>>> 1.
>>> Tests show that healing takes about double the time for 200 GB bricks
>>> vs 100 GB, and a bit under double for 400 GB vs 200 GB. Is this
>>> expected behaviour? In light of this, a 6.4 TB brick would take
>>> ~377 hours to heal.
>>>
>>> 100 GB brick heal: 18 hours (8+2)
>>> 200 GB brick heal: 37 hours (8+2), 205% of the 100 GB time
>>> 400 GB brick heal: 59 hours (8+2), 159% of the 200 GB time
>>>
>>> Each 100 GB is filled with 80,000 x 10 MB files (200 GB is 2x and
>>> 400 GB is 4x that).
>>
>>
>> If I understand it correctly, you are storing 80,000 files of 10 MB each
>> when you are using 100 GB bricks, but you double this value for 200 GB
>> bricks (160,000 files of 10 MB each), and for 400 GB bricks you create
>> 320,000 files. Have I understood it correctly?
>>
>> If this is true, it's normal that twice the data requires approximately
>> twice the heal time. The healing time depends on the contents of the
>> brick, not the brick size. The same amount of files should take the same
>> healing time, whatever the brick size is.
>>
>>>
>>> 2.
>>> Is there any way to show the progress of a heal? Right now we run
>>> 'gluster volume heal <volname> info', but this exits when a brick is
>>> done healing, and when we run heal info again the command continues
>>> showing gfids until the brick is done again. This gives quite a poor
>>> picture of the status of a heal.
>>
>>
>> The output of 'gluster volume heal <volname> info' shows the list of
>> files pending to be healed on each brick. The heal is complete when the
>> list is empty. A faster alternative, if you don't want to see the whole
>> list of files, is to use 'gluster volume heal <volname> statistics
>> heal-count'. This will only show the number of pending files on each
>> brick.
>>
>> I don't know any other way to track the progress of self-heal.
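
A rough way to track heal progress over time, building only on the
heal-count command mentioned just above (an untested sketch; <volname> is a
placeholder for the real volume name):

    # Poll the number of pending heal entries once a minute and keep a
    # timestamped log, so the heal rate can be estimated afterwards.
    while true; do
        date
        gluster volume heal <volname> statistics heal-count
        sleep 60
    done | tee -a heal-progress.log
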
>>>
>>> 3.
>>> What kind of config tweaks are recommended for these kinds of EC volumes?
>>
>>
>> I usually use the following values (specific only for ec):
>>
>> client.event-threads 4
>> server.event-threads 4
>> performance.client-io-threads on
>>
>> Regards,
>>
>> Xavi
>>
>>>
>>> $ gluster volume info
>>> Volume Name: test-ec-100g
>>> Type: Disperse
>>> Volume ID: 0254281d-2f6e-4ac4-a773-2b8e0eb8ab27
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x (8 + 2) = 10
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: dn-304:/mnt/test-ec-100/brick
>>> Brick2: dn-305:/mnt/test-ec-100/brick
>>> Brick3: dn-306:/mnt/test-ec-100/brick
>>> Brick4: dn-307:/mnt/test-ec-100/brick
>>> Brick5: dn-308:/mnt/test-ec-100/brick
>>> Brick6: dn-309:/mnt/test-ec-100/brick
>>> Brick7: dn-310:/mnt/test-ec-100/brick
>>> Brick8: dn-311:/mnt/test-ec-2/brick
>>> Brick9: dn-312:/mnt/test-ec-100/brick
>>> Brick10: dn-313:/mnt/test-ec-100/brick
>>> Options Reconfigured:
>>> nfs.disable: on
>>> transport.address-family: inet
>>>
>>> Volume Name: test-ec-200
>>> Type: Disperse
>>> Volume ID: 2ce23e32-7086-49c5-bf0c-7612fd7b3d5d
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x (8 + 2) = 10
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: dn-304:/mnt/test-ec-200/brick
>>> Brick2: dn-305:/mnt/test-ec-200/brick
>>> Brick3: dn-306:/mnt/test-ec-200/brick
>>> Brick4: dn-307:/mnt/test-ec-200/brick
>>> Brick5: dn-308:/mnt/test-ec-200/brick
>>> Brick6: dn-309:/mnt/test-ec-200/brick
>>> Brick7: dn-310:/mnt/test-ec-200/brick
>>> Brick8: dn-311:/mnt/test-ec-200_2/brick
>>> Brick9: dn-312:/mnt/test-ec-200/brick
>>> Brick10: dn-313:/mnt/test-ec-200/brick
>>> Options Reconfigured:
>>> nfs.disable: on
>>> transport.address-family: inet
>>>
>>> Volume Name: test-ec-400
>>> Type: Disperse
>>> Volume ID: fe00713a-7099-404d-ba52-46c6b4b6ecc0
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x (8 + 2) = 10
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: dn-304:/mnt/test-ec-400/brick
>>> Brick2: dn-305:/mnt/test-ec-400/brick
>>> Brick3: dn-306:/mnt/test-ec-400/brick
>>> Brick4: dn-307:/mnt/test-ec-400/brick
>>> Brick5: dn-308:/mnt/test-ec-400/brick
>>> Brick6: dn-309:/mnt/test-ec-400/brick
>>> Brick7: dn-310:/mnt/test-ec-400/brick
>>> Brick8: dn-311:/mnt/test-ec-400_2/brick
>>> Brick9: dn-312:/mnt/test-ec-400/brick
>>> Brick10: dn-313:/mnt/test-ec-400/brick
>>> Options Reconfigured:
>>> nfs.disable: on
>>> transport.address-family: inet
>>>
>>> --
>>>
>>> Regards
>>> Rolf Arne Larsen
>>> Ops Engineer
>>> rolf@xxxxxxxxxxxxxx
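
For reference, the options mentioned in this thread could be applied with
commands along the following lines (a sketch only, not tested here;
'test-ec-400' is simply one of the volumes above, and the values for
disperse.self-heal-window-size and features.locks-revocation-max-blocked
are placeholders, since no concrete numbers were suggested):

    # Tunables suggested above (Serkan: shd-max-threads; Xavi: event
    # threads and client-io-threads)
    gluster volume set test-ec-400 disperse.shd-max-threads 4
    gluster volume set test-ec-400 client.event-threads 4
    gluster volume set test-ec-400 server.event-threads 4
    gluster volume set test-ec-400 performance.client-io-threads on

    # Mentioned without concrete values -- experiment on a test volume first
    gluster volume set test-ec-400 disperse.self-heal-window-size 2
    gluster volume set test-ec-400 features.locks-revocation-max-blocked 8

    # Verify what is actually set
    gluster volume get test-ec-400 all | grep -E 'shd-max-threads|event-threads|client-io-threads|self-heal-window-size|locks-revocation'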