Hi,

You can set disperse.shd-max-threads to 2 or 4 in order to make heal
faster. This makes my heal times 2-3x faster. You can also experiment
with disperse.self-heal-window-size to read more bytes at one time, but
I did not test it.

On Thu, Nov 9, 2017 at 4:47 PM, Xavi Hernandez <jahernan@xxxxxxxxxx> wrote:
> Hi Rolf,
>
> answers follow inline...
>
> On Thu, Nov 9, 2017 at 3:20 PM, Rolf Larsen <rolf@xxxxxxxx> wrote:
>>
>> Hi,
>>
>> We ran a test on GlusterFS 3.12.1 with erasure-coded volumes, 8+2 with
>> 10 bricks (default config, tested with 100 GB, 200 GB and 400 GB brick
>> sizes, 10 Gbit NICs).
>>
>> 1.
>> Tests show that healing takes about double the time on 200 GB vs
>> 100 GB bricks, and a bit under double on 400 GB vs 200 GB bricks. Is
>> this expected behaviour? In light of this, 6.4 TB brick sizes would
>> take ~377 hours to heal.
>>
>> 100 GB brick heal: 18 hours (8+2)
>> 200 GB brick heal: 37 hours (8+2) +205%
>> 400 GB brick heal: 59 hours (8+2) +159%
>>
>> Each 100 GB is filled with 80,000 x 10 MB files (200 GB is 2x and
>> 400 GB is 4x).
>
> If I understand it correctly, you are storing 80,000 files of 10 MB
> each when you are using 100 GB bricks, but you double this value for
> 200 GB bricks (160,000 files of 10 MB each), and for 400 GB bricks you
> create 320,000 files. Have I understood that correctly?
>
> If this is true, it's normal that twice the data requires approximately
> twice the heal time. The healing time depends on the contents of the
> brick, not the brick size. The same amount of files should take the
> same healing time, whatever the brick size is.
>
>>
>> 2.
>> Is there any way to show the progress of a heal? As of now we run
>> 'gluster volume heal <volname> info', but this exits when a brick is
>> done healing, and when we run heal info again the command continues
>> showing gfids until the brick is done again. This gives quite a bad
>> picture of the status of a heal.
>
> The output of 'gluster volume heal <volname> info' shows the list of
> files pending heal on each brick. The heal is complete when the list is
> empty. If you don't want to see the whole list of files, a faster
> alternative is 'gluster volume heal <volname> statistics heal-count',
> which only shows the number of pending files on each brick.
>
> I don't know of any other way to track the progress of self-heal.
>
>>
>> 3.
>> What kind of config tweaks are recommended for this kind of EC volume?
>
> I usually use the following values (specific only to ec):
>
> client.event-threads 4
> server.event-threads 4
> performance.client-io-threads on
>
> Regards,
>
> Xavi
>
>>
>> $ gluster volume info
>>
>> Volume Name: test-ec-100g
>> Type: Disperse
>> Volume ID: 0254281d-2f6e-4ac4-a773-2b8e0eb8ab27
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (8 + 2) = 10
>> Transport-type: tcp
>> Bricks:
>> Brick1: dn-304:/mnt/test-ec-100/brick
>> Brick2: dn-305:/mnt/test-ec-100/brick
>> Brick3: dn-306:/mnt/test-ec-100/brick
>> Brick4: dn-307:/mnt/test-ec-100/brick
>> Brick5: dn-308:/mnt/test-ec-100/brick
>> Brick6: dn-309:/mnt/test-ec-100/brick
>> Brick7: dn-310:/mnt/test-ec-100/brick
>> Brick8: dn-311:/mnt/test-ec-2/brick
>> Brick9: dn-312:/mnt/test-ec-100/brick
>> Brick10: dn-313:/mnt/test-ec-100/brick
>> Options Reconfigured:
>> nfs.disable: on
>> transport.address-family: inet
>>
>> Volume Name: test-ec-200
>> Type: Disperse
>> Volume ID: 2ce23e32-7086-49c5-bf0c-7612fd7b3d5d
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (8 + 2) = 10
>> Transport-type: tcp
>> Bricks:
>> Brick1: dn-304:/mnt/test-ec-200/brick
>> Brick2: dn-305:/mnt/test-ec-200/brick
>> Brick3: dn-306:/mnt/test-ec-200/brick
>> Brick4: dn-307:/mnt/test-ec-200/brick
>> Brick5: dn-308:/mnt/test-ec-200/brick
>> Brick6: dn-309:/mnt/test-ec-200/brick
>> Brick7: dn-310:/mnt/test-ec-200/brick
>> Brick8: dn-311:/mnt/test-ec-200_2/brick
>> Brick9: dn-312:/mnt/test-ec-200/brick
>> Brick10: dn-313:/mnt/test-ec-200/brick
>> Options Reconfigured:
>> nfs.disable: on
>> transport.address-family: inet
>>
>> Volume Name: test-ec-400
>> Type: Disperse
>> Volume ID: fe00713a-7099-404d-ba52-46c6b4b6ecc0
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (8 + 2) = 10
>> Transport-type: tcp
>> Bricks:
>> Brick1: dn-304:/mnt/test-ec-400/brick
>> Brick2: dn-305:/mnt/test-ec-400/brick
>> Brick3: dn-306:/mnt/test-ec-400/brick
>> Brick4: dn-307:/mnt/test-ec-400/brick
>> Brick5: dn-308:/mnt/test-ec-400/brick
>> Brick6: dn-309:/mnt/test-ec-400/brick
>> Brick7: dn-310:/mnt/test-ec-400/brick
>> Brick8: dn-311:/mnt/test-ec-400_2/brick
>> Brick9: dn-312:/mnt/test-ec-400/brick
>> Brick10: dn-313:/mnt/test-ec-400/brick
>> Options Reconfigured:
>> nfs.disable: on
>> transport.address-family: inet
>>
>> --
>>
>> Regards
>> Rolf Arne Larsen
>> Ops Engineer
>> rolf@xxxxxxxxxxxxxx
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users@xxxxxxxxxxx
>> http://lists.gluster.org/mailman/listinfo/gluster-users
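[Editor's note: pulling the suggestions from this thread together, the tuning and monitoring commands look roughly like this. This is a sketch, not a tested recipe: "test-ec-100g" is the volume name from the listing above, the option values are the ones suggested in the thread, and you should confirm the option names and valid ranges for your GlusterFS version with 'gluster volume set help'.]

```shell
# Speed up self-heal on a disperse (EC) volume by allowing more
# self-heal daemon threads (the thread reports 2-3x faster heals).
gluster volume set test-ec-100g disperse.shd-max-threads 4

# Optionally enlarge the self-heal read window so each heal iteration
# reads more blocks at once (mentioned in the thread, but untested there;
# the value 2 here is an illustrative assumption, not a recommendation).
gluster volume set test-ec-100g disperse.self-heal-window-size 2

# General EC tuning values Xavi suggests.
gluster volume set test-ec-100g client.event-threads 4
gluster volume set test-ec-100g server.event-threads 4
gluster volume set test-ec-100g performance.client-io-threads on

# Track heal progress without listing every file: show only the count
# of entries still pending heal on each brick.
gluster volume heal test-ec-100g statistics heal-count
```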