On Fri, Dec 21, 2018 at 6:30 PM RAFI KC <rkavunga@xxxxxxxxxx> wrote:
>
> Hi All,
>
> What is the problem?
>
> As of now the self-heal client runs as one daemon per node, which means
> that even if there are multiple volumes, there will only be one self-heal
> daemon. So for any configuration change in the cluster to take effect,
> the self-heal daemon has to be reconfigured, but it does not have the
> ability to reconfigure dynamically. This means that when you have a lot
> of volumes in the cluster, every management operation that involves a
> configuration change, such as volume start/stop, add/remove-brick, etc.,
> will result in a self-heal daemon restart. If such operations are
> executed frequently, this not only slows down self-heal for a volume but
> also grows the self-heal logs substantially.

What number of volumes do you have in mind when you write "lot of
volumes"? 1000 volumes, more, etc.?

> How to fix it?
>
> We are planning to attach/detach graphs dynamically, following a
> procedure similar to brick multiplexing. The detailed steps are as below:
>
> 1) The first step is to make shd a per-volume daemon, i.e. to
> generate/reconfigure volfiles on a per-volume basis.
>
>    1.1) This will help to attach the volfiles easily to an existing shd
>    daemon.
>
>    1.2) This will help to send notifications to the shd daemon, as each
>    volinfo keeps the daemon object.
>
>    1.3) Reconfiguring a particular subvolume is easier, as we can check
>    the topology better.
>
>    1.4) With this change the volfiles will be moved to the workdir/vols/
>    directory.
>
> 2) Write new RPC requests such as attach/detach_client_graph to support
> client attach/detach.
>
>    2.1) Functions such as graph reconfigure and mgmt_getspec_cbk also
>    have to be modified.
>
> 3) Safely detach a subvolume when there are pending frames to unwind.
>
>    3.1) We can mark the client as disconnected and unwind all the
>    pending frames with ENOTCONN.
>
>    3.2) Or we can wait for all the I/O to unwind until the new, updated
>    subvolume attaches.
>
> 4) Handle scenarios like glusterd restart, node reboot, etc.
>
> At the moment we are not planning to limit the number of heal subvolumes
> per process, because with the current approach heal for every volume is
> already done from a single process, and we have not heard any major
> complaints about this.

Is the plan to never limit it, or to have a throttle set to a high(er)
default value? How would system resources be impacted if the proposed
design is implemented?
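
To make sure I understand steps 3.1 and 3.2, below is a minimal sketch of
the detach flow as I read it. The types and names here (heal_client,
detach_mark_disconnected, detach_wait_for_drain, etc.) are hypothetical
stand-ins for illustration only, not the actual GlusterFS call-frame or
client-xlator structures:

/*
 * Hypothetical sketch of the detach flow in 3.1/3.2. The structs below
 * are illustrative stand-ins for the real GlusterFS call frames and
 * client state, not the actual internals.
 */
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct pending_frame {
    int op;                        /* the fop this frame carries          */
    struct pending_frame *next;
};

struct heal_client {
    bool connected;
    int inflight;                  /* frames created but not yet unwound  */
    struct pending_frame *queued;  /* frames not yet wound to the graph   */
    pthread_mutex_t lock;
    pthread_cond_t drained;        /* signalled when inflight hits zero   */
};

/* Unwind one frame back to its caller with the given errno. */
static void
frame_unwind(struct heal_client *c, struct pending_frame *f, int err)
{
    fprintf(stderr, "unwound frame op=%d with error=%d\n", f->op, err);
    free(f);
    pthread_mutex_lock(&c->lock);
    if (--c->inflight == 0)
        pthread_cond_signal(&c->drained);
    pthread_mutex_unlock(&c->lock);
}

/* 3.1: mark the client disconnected and fail every queued frame with
 * ENOTCONN, so nothing new is wound down to the old graph. Frames
 * already wound down would unwind later via their normal callbacks. */
static void
detach_mark_disconnected(struct heal_client *c)
{
    pthread_mutex_lock(&c->lock);
    c->connected = false;
    struct pending_frame *f = c->queued;
    c->queued = NULL;
    pthread_mutex_unlock(&c->lock);

    while (f) {
        struct pending_frame *next = f->next;
        frame_unwind(c, f, ENOTCONN);
        f = next;
    }
}

/* 3.2: block until every in-flight frame has unwound; only then is it
 * safe to attach the new/updated subvolume graph. */
static void
detach_wait_for_drain(struct heal_client *c)
{
    pthread_mutex_lock(&c->lock);
    while (c->inflight > 0)
        pthread_cond_wait(&c->drained, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

int
main(void)
{
    struct heal_client c = { .connected = true };
    pthread_mutex_init(&c.lock, NULL);
    pthread_cond_init(&c.drained, NULL);

    /* Queue two frames as if the heal crawler had issued them. */
    for (int i = 0; i < 2; i++) {
        struct pending_frame *f = calloc(1, sizeof(*f));
        f->op = i;
        f->next = c.queued;
        c.queued = f;
        c.inflight++;
    }

    detach_mark_disconnected(&c);  /* 3.1: fail the queued frames    */
    detach_wait_for_drain(&c);     /* 3.2: wait before re-attaching  */
    return 0;
}

Whether the default behaviour ends up being 3.1 (fail fast with ENOTCONN)
or 3.2 (wait for the drain) decides how long a volume's heal is
effectively paused during a reconfiguration, which feeds into my question
above about resource impact.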