On Tue, Jul 24, 2018 at 9:48 PM, Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx> wrote:
> hi,
>       Quite a few commands to monitor gluster at the moment take almost a
> second to give output.

Is this at the (most) minimum recommended cluster size?

> Some categories of these commands:
> 1) Any command that needs to do some sort of mount/glfs_init.
>      Examples: 1) heal info family of commands 2) statfs to find
> space-availability etc (On my laptop replica 3 volume with all local bricks,
> glfs_init takes 0.3 seconds on average)
> 2) glusterd commands that need to wait for the previous command to unlock.
> If the previous command is something related to lvm snapshot which takes
> quite a few seconds, it would be even more time consuming.
>
> Nowadays container workloads have hundreds of volumes if not thousands. If
> we want to serve any monitoring solution at this scale (I have seen
> customers use up to 600 volumes at a time, it will only get bigger) and let's
> say collecting metrics per volume takes 2 seconds per volume (let us take the
> worst example which has all major features enabled like
> snapshot/geo-rep/quota etc.), that will mean that it will take 20 minutes
> to collect metrics of the cluster with 600 volumes. What are the ways in
> which we can make this number more manageable? I was initially thinking
> maybe it is possible to get gd2 to execute commands in parallel on different
> volumes, so potentially we could get this done in ~2 seconds. But quite a
> few of the metrics need a mount or equivalent of a mount (glfs_init) to
> collect different information like statfs, number of pending heals, quota
> usage etc. This may lead to high memory usage as the size of the mounts tends
> to be high.
>

I am not sure that the "worst example" (it certainly is not) is a good
place to start from. That said, for any environment with that number of
disposable volumes, what kind of metrics actually make any sense/impact?
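For reference, the arithmetic in the quoted paragraph and the
parallel-collection idea can be sketched roughly as below. This is only an
illustrative Python sketch, not gd2 code: `collect_one` is a hypothetical
per-volume collector, and the timing model assumes every volume costs the
same and that collectors never wait on each other (for example on the
glusterd lock), which the quoted text says is not always true.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def serial_seconds(volumes: int, seconds_per_volume: float) -> float:
    """Wall-clock time when volumes are polled one after another."""
    return volumes * seconds_per_volume


def parallel_seconds(volumes: int, seconds_per_volume: float, workers: int) -> float:
    """Idealized wall-clock time with `workers` collectors running at once.

    Assumption: equal cost per volume and no contention between collectors.
    """
    return math.ceil(volumes / workers) * seconds_per_volume


def collect_all(volumes, collect_one, workers):
    """Call the hypothetical `collect_one(volume)` with at most `workers` in flight.

    Capping `workers` also caps how many mounts/glfs_init instances would be
    alive at the same time, which is the memory concern raised above.
    """
    names = list(volumes)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with names.
        return dict(zip(names, pool.map(collect_one, names)))


if __name__ == "__main__":
    print(serial_seconds(600, 2) / 60)      # 20.0 minutes, one volume at a time
    print(parallel_seconds(600, 2, 600))    # 2 seconds, fully parallel
    print(parallel_seconds(600, 2, 20))     # 60 seconds with 20 workers
```

Under these assumptions the trade-off is between wall-clock time and the
number of simultaneous mounts: 600 workers gives the ~2 second figure but
needs 600 mounts at once, while 20 workers takes about a minute with 20.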

> I wanted to seek suggestions from others on how to come to a conclusion
> about which path to take and what problems to solve.
>
> I will be happy to raise github issues based on our conclusions on this mail
> thread.
>
> --
> Pranith
>

-- 
sankarshan mukhopadhyay
<https://about.me/sankarshan.mukhopadhyay>
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-devel