Re: How to Speed UP heal process in Glusterfs 3.10.1

>I was asking about reading data in the same disperse set (e.g., an 8+2 config): if one disk is replaced and heal is in progress, what happens when a client reads data that is available on the remaining 9 disks?

My use case is write heavy and we barely read data, so I do not know whether read speed degrades during heal. But I do know that write speed does not change during heal.

How big are your files? How many files are in each directory, on average?

On Tue, Apr 18, 2017 at 11:36 AM, Amudhan P <amudhan83@xxxxxxxxx> wrote:

I actually ran this command (find /mnt/gluster -d -exec getfattr -h -n trusted.ec.heal {} \; > /dev/null) on a specific folder to trigger heal, but it also showed no difference in speed.

I was asking about reading data in the same disperse set (e.g., an 8+2 config): if one disk is replaced and heal is in progress, what happens when a client reads data that is available on the remaining 9 disks?

I am sure there was no bottleneck on network/disk IO in my case. 

I have tested 3.10.1 heal with disperse.shd-max-threads = 4; it healed 27GB of data in 13m15s. So it works well in a test environment, but in production it behaves differently.
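For reference, the option can be set and checked per volume from any server node ("vol0" below is a placeholder volume name):

    gluster volume set vol0 disperse.shd-max-threads 4
    gluster volume get vol0 disperse.shd-max-threads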



On Tue, Apr 18, 2017 at 12:47 PM, Serkan Çoban <cobanserkan@xxxxxxxxx> wrote:
You can increase heal speed by running the command below from a client:
find /mnt/gluster -d -exec getfattr -h -n trusted.ec.heal {} \; > /dev/null

You can write a script that runs it on different folders to make it parallel; a minimal sketch follows.
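For example, assuming the volume is mounted at /mnt/gluster and each top-level directory gets its own background job (paths and degree of parallelism are placeholders to adjust):

    #!/bin/bash
    # Run the heal-trigger find on every top-level directory in parallel.
    for dir in /mnt/gluster/*/; do
        find "$dir" -d -exec getfattr -h -n trusted.ec.heal {} \; > /dev/null 2>&1 &
    done
    wait   # return once every background job has finished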

In my case, 6TB of data was healed within 7-8 days with the above command running.
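If you want to track progress while that runs, heal info lists the entries still pending per brick ("vol0" is a placeholder volume name):

    gluster volume heal vol0 info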
>did you face any issue reading data from the rest of the good bricks in the set, like slow reads (< KB/s)?
No, the nodes generally have balanced network/disk IO during heal.

You should run detailed tests on a non-prod cluster and try to find the
optimum heal configuration for your use case.
Our new servers are on the way; in a couple of months I will also run
detailed tests with 3.10.x and parallel disperse heal, and will post the
results here.


On Tue, Apr 18, 2017 at 9:51 AM, Amudhan P <amudhan83@xxxxxxxxx> wrote:
> Serkan,
>
> I initially changed shd-max-threads from 1 to 2 and saw a little difference;
> changing it to 4 and 8 did not make any further difference.
> Disk write speed was about <1MB/s, and the data passing through the network to
> the healing node from the other nodes was 4MB/s combined.
>
> Also, I tried ls -l from the mount point on the folders and files which need
> to be healed, but have not seen any difference in performance.
>
> But after 3 days of the heal process running, disk write speed increased to
> 9-11MB/s, and the data passing through the network to the healing node from
> the other nodes was 40MB/s combined.
>
> There is still 14GB of data to be healed, compared with the other disks in the set.
>
> I saw in another thread that you also had an issue with heal speed; did you
> face any issue reading data from the rest of the good bricks in the set, like
> slow reads (< KB/s)?
>
> On Mon, Apr 17, 2017 at 2:05 PM, Serkan Çoban <cobanserkan@xxxxxxxxx> wrote:
>>
>> Normally I see 8-10MB/sec/brick heal speed with gluster 3.7.11.
>> I tested parallel heal for disperse with version 3.9.0 and saw that it
>> increases the heal speed to 20-40MB/sec.
>> I tested with shd-max-threads 2, 4, and 8 and saw that the best performance
>> was achieved with 2 or 4 threads.
>> You can try starting with 2, then test with 4 and 8 and compare the results.
>
>


_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users
