Re: How to Speed UP heal process in Glusterfs 3.10.1

On Thu, Apr 20, 2017 at 1:24 PM, Amudhan P <amudhan83@xxxxxxxxx> wrote:
Hi Pranith,

> 1) At the moment heals happen in parallel only for files, not directories, i.e. the same shd process doesn't heal 2 directories at a time. But it
> can do as many file heals as the shd-max-threads option allows. That could be the reason why Amudhan saw better performance after a while, but
> it is a bit difficult to confirm without data.

Yes, you're right. The disk has about 56153 files, each under its own subdirectory, so there will be an equal or higher number of folders.

I have a doubt: when the heal process creates a folder on the disk, does it also check with the rest of the bricks in the same disperse set in order to process and update the xattrs of the folders and files being healed?

Yes, in general most of the heal process involves contacting other bricks, not just for creating directories but for other things as well, like setting inode attributes/xattrs, data, etc.
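
For anyone who wants to experiment with the shd-max-threads option from point 1, here is a minimal sketch that only builds the `gluster volume set` command line and prints it. The volume name `myvol` and the value 8 are placeholders, and the full option name `disperse.shd-max-threads` is my assumption for a disperse (EC) volume; please check `gluster volume set help` on your version before running anything.

```python
import shlex


def volume_set_command(volume, option, value):
    """Build (but do not run) the argv for `gluster volume set`."""
    return ["gluster", "volume", "set", volume, option, str(value)]


# "myvol" and 8 are hypothetical; the option name is assumed for EC volumes.
cmd = volume_set_command("myvol", "disperse.shd-max-threads", 8)
print(shlex.join(cmd))
```

The command is printed rather than executed so that it can be reviewed first; passing the same list to `subprocess.run(cmd, check=True)` on a server node would apply it.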
 

> 2) When a file is undergoing I/O, both shd and the mount will contend for locks to do I/O from the bricks; this is probably the reason for the
> slowness in I/O. It will last only until the file is healed in parallel with the I/O from users.

I suggest there should be a mechanism for the above case that pauses the heal process, fulfills the read request first, and then continues with the heal, so that the user doesn't notice any difference in read speed.

But the read request can come at any point. If a READ request comes after the heal process has taken its locks, the logic needed to give priority to I/O becomes very convoluted. I think a better way would be to disable I/O from triggering heals in your case. This doesn't really fix the problem, but it would reduce the probability of seeing this issue.
 

> 3) Serkan, Amudhan, it would be nice to have feedback about what you feel are the bottlenecks so that we can come up with the next set
> of performance improvements. One of the newer enhancements Sunil is working on is to be able to heal larger chunks in one go rather
> than ~128KB chunks. It will be configurable up to 128MB I think; this will improve throughput. The next set of enhancements would
> concentrate on reducing network round trips in doing heal and doing parallel heals of directories.

I don't see any bottlenecks other than the ones we discussed in this thread. Heal should be faster when we have sufficient hardware power for it. I hope the newer enhancements will deliver that.


Coming to the original thread:

I think the heal process has completed, but there is still a size difference of 14GB between the healed disk and the other good disks in the same set.
So I compared the files on the healed disk with those on a good disk: 3 files are missing, but they are only KB-sized files that were deleted in 3.7 and are still present on the bricks.

Oh, you have 3 files missing but no xattrs to indicate this? Could you let us know the parent directory xattrs on all the bricks where the files are missing?
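
The usual way to collect this is `getfattr -d -m . -e hex <directory>` on each brick. If a scripted version is easier, the following is a rough sketch using Python's `os.listxattr` and `os.getxattr`, which are Linux-only. The function name `dump_xattrs` and the example path are made up for illustration, and reading the `trusted.*` attributes on a brick needs root.

```python
import os


def dump_xattrs(path):
    """Return {attribute name: hex string} for every xattr on path (Linux only)."""
    result = {}
    for name in os.listxattr(path, follow_symlinks=False):
        value = os.getxattr(path, name, follow_symlinks=False)
        result[name] = "0x" + value.hex()
    return result


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python3 dump_xattrs.py /bricks/brick1/path/to/parent-dir
    for target in sys.argv[1:]:
        print("# file:", target)
        for name, value in sorted(dump_xattrs(target).items()):
            print(f"{name}={value}")
```

Running it against the same parent directory on every brick of the disperse set makes it easy to compare the values side by side.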
 

Why is there a size difference?

Could you find which files/directories correspond to the size difference? Also include .glusterfs when you run your commands.
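
One possible way to narrow this down is sketched below: walk both brick directories, sum the file sizes per directory, and list the directories whose totals differ. This is only an illustration under a few assumptions: it uses apparent file sizes rather than allocated blocks, it counts each inode once (entries under .glusterfs are hard links to the real files, so .glusterfs is walked last and only entries with no other path are charged to it), and the function names are made up.

```python
import os
import stat


def dir_sizes(root):
    """Map each directory (relative to root) to the bytes of files directly in it."""
    sizes = {}
    seen = set()  # (device, inode) pairs, so each hard-linked file is counted once
    for dirpath, dirnames, filenames in os.walk(root):
        # Visit .glusterfs after the regular directories at the same level.
        dirnames.sort(key=lambda d: (d == ".glusterfs", d))
        total = 0
        for name in sorted(filenames):
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue
            seen.add(key)
            total += st.st_size
        sizes[os.path.relpath(dirpath, root)] = total
    return sizes


def size_differences(healed_root, good_root):
    """Return (directory, healed_bytes, good_bytes) for every directory that differs."""
    healed = dir_sizes(healed_root)
    good = dir_sizes(good_root)
    rows = []
    for d in sorted(set(healed) | set(good)):
        h, g = healed.get(d, 0), good.get(d, 0)
        if h != g:
            rows.append((d, h, g))
    return rows
```

Because the two bricks normally live on different servers, in practice one would run `dir_sizes` on each server, save the result, and compare the two outputs. Rows that show up only under .glusterfs would point at entries that no longer have a regular path on that brick.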
 

regards
Amudhan P



On Wed, Apr 19, 2017 at 4:05 PM, Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx> wrote:
Some thoughts based on this mail thread:
1) At the moment heals happen in parallel only for files, not directories, i.e. the same shd process doesn't heal 2 directories at a time. But it can do as many file heals as the shd-max-threads option allows. That could be the reason why Amudhan saw better performance after a while, but it is a bit difficult to confirm without data.

2) When a file is undergoing I/O, both shd and the mount will contend for locks to do I/O from the bricks; this is probably the reason for the slowness in I/O. It will last only until the file is healed in parallel with the I/O from users.

3) Serkan, Amudhan, it would be nice to have feedback about what you feel are the bottlenecks so that we can come up with the next set of performance improvements. One of the newer enhancements Sunil is working on is to be able to heal larger chunks in one go rather than ~128KB chunks. It will be configurable up to 128MB I think; this will improve throughput. The next set of enhancements would concentrate on reducing network round trips in doing heal and doing parallel heals of directories.
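
To give a feel for why the chunk size in point 3 matters, here is a small back-of-the-envelope sketch that counts how many chunks are needed to cover a file. The 1 GiB file size is an arbitrary example, and treating every chunk as at least one round of work against the bricks is a simplifying assumption, not a description of the actual heal code.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def heal_chunks(file_size, chunk_size):
    """Number of chunks needed to cover file_size bytes (integer ceiling division)."""
    return (file_size + chunk_size - 1) // chunk_size


# Hypothetical 1 GiB file healed with the two chunk sizes mentioned above.
print(heal_chunks(GIB, 128 * KIB))  # 8192
print(heal_chunks(GIB, 128 * MIB))  # 8
```

Under that assumption, a 1 GiB file goes from 8192 rounds to 8 when the chunk size grows from 128KB to 128MB.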


On Tue, Apr 18, 2017 at 6:22 PM, Serkan Çoban <cobanserkan@xxxxxxxxx> wrote:
>Is this by design ? Is it tuneable ? 10MB/s/brick is too low for us.
>We will use 10GB ethernet, healing 10MB/s/brick would be a bottleneck.

That is the maximum if you are using EC volumes, I don't know about
other volume configurations.
With 3.9.0 parallel self heal of EC volumes should be faster though.
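
For a sense of scale, the 10MB/s/brick figure can be turned into a rough heal-time estimate. The sketch below assumes a brick holding 6 TB of data, which is my own example number and not a figure from this thread, and it assumes the rate stays constant for the whole heal.

```python
MB = 10**6
TB = 10**12
SECONDS_PER_DAY = 86400


def heal_days(brick_bytes, rate_bytes_per_sec):
    """Days needed to heal brick_bytes at a constant rate."""
    return brick_bytes / rate_bytes_per_sec / SECONDS_PER_DAY


# Assumed 6 TB brick at the 10 MB/s per-brick heal rate quoted above.
print(round(heal_days(6 * TB, 10 * MB), 2))  # 6.94
```

That comes to about a week for one brick of that size, which matches the order of magnitude mentioned earlier in the thread.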



On Tue, Apr 18, 2017 at 1:38 PM, Gandalf Corvotempesta
<gandalf.corvotempesta@gmail.com> wrote:
> 2017-04-18 9:36 GMT+02:00 Serkan Çoban <cobanserkan@xxxxxxxxx>:
>> Nope, healing speed is 10MB/sec/brick, each brick heals with this
>> speed, so one brick or one server each will heal in one week...
>
> Is this by design ? Is it tuneable ? 10MB/s/brick is too low for us.
> We will use 10GB ethernet, healing 10MB/s/brick would be a bottleneck.
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users



--
Pranith
