I have completed the patches and pushed them for review. Please feel
free to raise your concerns/suggestions on the reviews.
https://review.gluster.org/#/c/glusterfs/+/21868
https://review.gluster.org/#/c/glusterfs/+/21907
https://review.gluster.org/#/c/glusterfs/+/21960
https://review.gluster.org/#/c/glusterfs/+/21989/
Regards
Rafi KC
On 12/24/18 3:58 PM, RAFI KC wrote:
On 12/21/18 6:56 PM, Sankarshan Mukhopadhyay wrote:
On Fri, Dec 21, 2018 at 6:30 PM RAFI KC
<rkavunga@xxxxxxxxxx> wrote:
Hi All,
What is the problem?
As of now the self-heal client runs as one daemon per node; even if
there are multiple volumes, there will only be one self-heal daemon.
So for each configuration change in the cluster to take effect, the
self-heal daemon has to be reconfigured, but it does not have the
ability to reconfigure dynamically. This means that when you have a
lot of volumes in the cluster, every management operation that
involves configuration changes, like volume start/stop, add/remove
brick, etc., will result in a self-heal daemon restart. If such
operations are executed often, this not only slows down self-heal for
a volume, but also grows the self-heal logs substantially.
What is the value of the number of volumes when you write "a lot of
volumes"? 1000 volumes, or more?
Yes, more than 1000 volumes. It also depends on how often you execute
the glusterd management operations mentioned above. Each time the
self-heal daemon is restarted, it prints the entire graph, and these
graph traces contribute the majority of the log's size.
How to fix it?
We are planning to follow a procedure of attaching/detaching graphs
dynamically, similar to what is done for brick multiplexing. The
detailed steps are as below:
1) The first step is to make shd a per-volume daemon, so that volfiles
are generated/reconfigured on a per-volume basis.
    1.1) This will help to attach the volfiles easily to the existing
shd daemon.
    1.2) This will help to send notifications to the shd daemon, as
each volinfo keeps the daemon object.
    1.3) Reconfiguring a particular subvolume is easier, as we can
check the topology better.
    1.4) With this change the volfiles will be moved to the
workdir/vols/ directory (a rough sketch of the path layout is given
after this list).
2) Write new rpc requests like attach/detach_client_graph to support
client attach/detach.
    2.1) Functions like graph reconfigure and mgmt_getspec_cbk also
have to be modified.
3) Safely detach a subvolume when there are pending frames to unwind
(see the second sketch after this list).
    3.1) We can mark the client disconnected and make all the pending
frames unwind with ENOTCONN.
    3.2) We can wait for all the i/o to unwind until the new, updated
subvol attaches.
4) Handle scenarios like glusterd restart, node reboot, etc.
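To make the per-volume volfile idea in step 1 a bit more concrete,
here is a minimal sketch. The helper name build_shd_volfile_path(),
the WORKDIR value and the <volname>-shd.vol naming are assumptions
made for illustration only, not the actual glusterd layout:

/* Sketch: derive a per-volume shd volfile path once volfiles live
 * under workdir/vols/.  All names here are illustrative assumptions. */
#include <stdio.h>

#define WORKDIR "/var/lib/glusterd"   /* assumed default glusterd workdir */

static void
build_shd_volfile_path(const char *volname, char *path, size_t len)
{
        /* one self-heal volfile per volume, e.g.
         * /var/lib/glusterd/vols/<volname>/<volname>-shd.vol */
        snprintf(path, len, "%s/vols/%s/%s-shd.vol", WORKDIR, volname, volname);
}

int
main(void)
{
        char path[4096];

        build_shd_volfile_path("testvol", path, sizeof(path));
        printf("%s\n", path);  /* prints the assumed per-volume shd volfile path */
        return 0;
}

With a layout like this, each volume's heal graph can be generated and
attached independently instead of regenerating one node-wide volfile.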
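For step 3.1, here is a rough sketch of the safe-detach idea: mark the
client disconnected and fail every pending frame with ENOTCONN before
the updated subvolume is attached. The structures and function names
(shd_client, pending_frame, shd_client_detach) are made up for
illustration and are not GlusterFS internals:

#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* hypothetical pending-frame record */
struct pending_frame {
        struct pending_frame *next;
        void (*unwind)(struct pending_frame *frame, int op_ret, int op_errno);
};

/* hypothetical per-subvolume client state inside shd */
struct shd_client {
        bool connected;
        struct pending_frame *pending;   /* frames not yet unwound */
};

static void
shd_client_detach(struct shd_client *client)
{
        struct pending_frame *frame = client->pending;

        /* 3.1: mark the client disconnected first, so no new frames are
         * wound on this subvolume while the old ones are drained */
        client->connected = false;

        /* fail every outstanding frame with ENOTCONN */
        while (frame) {
                struct pending_frame *next = frame->next;

                frame->unwind(frame, -1, ENOTCONN);
                frame = next;
        }
        client->pending = NULL;
}

static void
noop_unwind(struct pending_frame *frame, int op_ret, int op_errno)
{
        (void)frame;
        (void)op_ret;
        (void)op_errno;   /* a real xlator would STACK_UNWIND the fop here */
}

int
main(void)
{
        struct pending_frame f = { .next = NULL, .unwind = noop_unwind };
        struct shd_client client = { .connected = true, .pending = &f };

        shd_client_detach(&client);   /* f is failed with ENOTCONN */
        return 0;
}

Step 3.2 would then simply wait for this drain to finish before
attaching the updated subvolume graph.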
At the moment we are not planning to limit the number of heal
subvolumes per process, because with the current approach as well,
heal for every volume was already being done from a single process,
and we have not heard any major complaints about this.
Is the plan to never limit it, or to have a throttle set to a default
high(er) value? How would system resources be impacted if the proposed
design is implemented?
The plan is to implement it in a way that can support more than one
multiplexed self-heal daemon. The throttling function as of now
returns the same process to multiplex onto, but it can easily be
modified to create a new process. This multiplexing logic won't
utilize any more resources than it currently does.
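To illustrate what "the throttling function returns the same process"
could look like, here is a simplified sketch; pick_shd_process(),
shd_proc, spawn_new_shd() and MAX_VOLS_PER_SHD are assumed names for
illustration, not the actual glusterd code:

#include <stdlib.h>

#define MAX_VOLS_PER_SHD 0   /* 0 == no limit, i.e. the current behaviour */

struct shd_proc {
        int pid;
        int nvolumes;   /* heal graphs currently attached to this process */
};

static struct shd_proc *
spawn_new_shd(void)
{
        /* placeholder: a real implementation would fork/exec glusterfs
         * with the shd volfile and register the process with glusterd */
        return calloc(1, sizeof(struct shd_proc));
}

/* Decide which shd process a volume's heal graph should attach to.
 * The caller attaches the graph and bumps nvolumes afterwards. */
static struct shd_proc *
pick_shd_process(struct shd_proc *existing)
{
        if (!existing)
                return spawn_new_shd();

        /* current behaviour: always reuse the same process ... */
        if (MAX_VOLS_PER_SHD == 0 || existing->nvolumes < MAX_VOLS_PER_SHD)
                return existing;

        /* ... but the throttle can be changed to create a new one */
        return spawn_new_shd();
}

int
main(void)
{
        struct shd_proc *first = pick_shd_process(NULL);    /* spawns one */
        struct shd_proc *again = pick_shd_process(first);   /* reuses it */

        (void)again;
        free(first);
        return 0;
}

Changing the throttle then only means changing this one decision
point, without touching the attach/detach machinery itself.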
Rafi KC
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-devel