On Tue, Jan 26, 2016 at 03:11:50AM +0000, Richard Wareing wrote:
> > If there is one bucket per client and one thread per bucket, it would be
> > difficult to scale as the number of clients increases. How can we do
> > this better?
>
> On this note... consider that tens of thousands of clients are not
> unrealistic in production :). Using a thread per bucket would also be
> unwise.
>
> On the idea in general, I'm just wondering if there are specific
> (real-world) cases where this has even been an issue that least-prio
> queuing hasn't been able to handle, or is this more of a theoretical
> concern? I ask as I've not really encountered situations where I wished
> I could give more FOPs to the SHD vs. rebalance and such.
>
> In any event, it might be worth having Shreyas detail his throttling
> feature (which can throttle any directory hierarchy, no less) to
> illustrate how a simpler design can achieve results similar to these
> more complicated (and, it follows, more bug-prone) approaches.

TBF isn't complicated at all - it's widely used for traffic shaping, in
cgroups, and in UML to rate-limit disk I/O. But I won't rush things; I'll
wait to hear from Shreyas about his throttling design.
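To show how little machinery a single bucket needs, here is a minimal,
self-contained sketch in C (hypothetical names, not the bitd code or the
proposed xlator; it assumes a filler thread that tops the bucket up once
a second):

    /* tbf_sketch.c - minimal token bucket; build with: cc -pthread */
    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    static pthread_mutex_t lock   = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  filled = PTHREAD_COND_INITIALIZER;
    static double tokens   = 0;    /* current bucket contents           */
    static double rate     = 10;   /* tokens added per second           */
    static double capacity = 50;   /* bucket never holds more than this */

    /* Filler thread: top the bucket up once a second, clamp at
     * capacity, and wake anyone waiting for tokens. */
    static void *filler(void *arg)
    {
        (void)arg;
        for (;;) {
            sleep(1);
            pthread_mutex_lock(&lock);
            tokens += rate;
            if (tokens > capacity)
                tokens = capacity;
            pthread_cond_broadcast(&filled);
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    /* Block until 'cost' tokens are available, then consume them. */
    static void throttle(double cost)
    {
        pthread_mutex_lock(&lock);
        while (tokens < cost)
            pthread_cond_wait(&filled, &lock);
        tokens -= cost;
        pthread_mutex_unlock(&lock);
    }

    int main(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, filler, NULL);
        for (int i = 0; i < 5; i++) {
            throttle(8);           /* pretend each FOP costs 8 tokens */
            printf("fop %d serviced\n", i);
        }
        return 0;
    }

(A production version would more likely compute the token count lazily
from a monotonic timestamp on each request, which avoids dedicating a
filler thread to every bucket in the first place.)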
> Richard
>
> ________________________________________
> From: gluster-devel-bounces@xxxxxxxxxxx [gluster-devel-bounces@xxxxxxxxxxx] on behalf of Vijay Bellur [vbellur@xxxxxxxxxx]
> Sent: Monday, January 25, 2016 6:44 PM
> To: Ravishankar N; Gluster Devel
> Subject: Re: Throttling xlator on the bricks
>
> On 01/25/2016 12:36 AM, Ravishankar N wrote:
> > Hi,
> >
> > We are planning to introduce a throttling xlator on the server (brick)
> > process to regulate FOPs. The main motivation is to address complaints
> > about AFR self-heal consuming too much CPU (due to too many FOPs for
> > entry self-heal, rchecksums for data self-heal, etc.).
>
> I am wondering if we can re-use the same xlator for throttling
> bandwidth, IOPS, etc. in addition to FOPs. Based on admin-configured
> policies we could provide different upper thresholds to different
> clients/tenants, and this could prove to be a useful feature in
> multi-tenant deployments to avoid starvation/noisy-neighbor classes of
> problems. Has any thought gone in this direction?
>
> > The throttling is achieved using the Token Bucket Filter (TBF)
> > algorithm. TBF is already used in gluster by bitrot's bitd signer
> > (which is a client process) to regulate the CPU-intensive checksum
> > calculation. By putting the logic on the brick side, multiple
> > clients - self-heal, bitrot, rebalance, or even the mounts
> > themselves - can avail themselves of the benefits of throttling.
> >
> > The TBF algorithm in a nutshell is as follows: there is a bucket
> > which is filled at a steady (configurable) rate with tokens. Each FOP
> > needs a fixed number of tokens to be processed. If the bucket has
> > that many tokens, the FOP is allowed and that many tokens are removed
> > from the bucket. If not, the FOP is queued until the bucket fills up
> > enough.
> >
> > The xlator will need to reside above io-threads and can have
> > different buckets, one per client. There has to be a communication
> > mechanism between the client and the brick (IPC?) to tell which FOPs
> > need to be regulated, the number of tokens needed, etc. These need to
> > be reconfigurable via appropriate mechanisms. Each bucket will have a
> > token-filler thread which fills the tokens in it.
>
> If there is one bucket per client and one thread per bucket, it would
> be difficult to scale as the number of clients increases. How can we do
> this better?
>
> > The main thread will enqueue heals in a list in the bucket if there
> > aren't enough tokens. Once the token filler detects that some FOPs
> > can be serviced, it will send a cond-broadcast to a dequeue thread,
> > which will process (stack-wind) all the FOPs that have the required
> > number of tokens, from all buckets.
> >
> > This is just a high-level abstraction; we are requesting feedback on
> > any aspect of this feature. What kind of mechanism is best between
> > the clients and bricks for tuning the various parameters? What other
> > requirements do you foresee?
>
> I am in favor of having administrator-defined policies or templates
> (collections of policies) being used to provide the tuning parameters
> per client or per set of clients. We could even have a default template
> per use case, etc. Is there a specific need to have this negotiation
> between clients and servers?
>
> Thanks,
> Vijay
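As for the enqueue/dequeue part of the proposal, the dequeue side could
look roughly like this (hypothetical structures and a made-up wind_fop()
helper, not actual xlator code; it assumes the filler thread broadcasts
'refill' after topping the buckets up):

    /* Park FOPs that can't pay for themselves; after each refill
     * broadcast, wind whatever each bucket can now afford. */
    #include <pthread.h>
    #include <stddef.h>

    typedef struct fop {
        struct fop *next;
        double      cost;            /* tokens this FOP needs       */
    } fop_t;

    typedef struct bucket {
        struct bucket *next;         /* all buckets, one per client */
        fop_t         *head, *tail;  /* FIFO of throttled FOPs      */
        double         tokens;
    } bucket_t;

    static pthread_mutex_t lock   = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  refill = PTHREAD_COND_INITIALIZER;
    static bucket_t *buckets;        /* list of per-client buckets  */

    void wind_fop(fop_t *fop);       /* made-up: resume (stack-wind)
                                        the parked FOP              */

    /* Main thread: consume tokens if possible, else queue the FOP. */
    int throttle_or_queue(bucket_t *b, fop_t *fop)
    {
        int queued = 0;
        pthread_mutex_lock(&lock);
        if (b->tokens >= fop->cost) {
            b->tokens -= fop->cost;  /* enough tokens: pass through */
        } else {
            fop->next = NULL;        /* park in the bucket's FIFO   */
            if (b->tail)
                b->tail->next = fop;
            else
                b->head = fop;
            b->tail = fop;
            queued = 1;
        }
        pthread_mutex_unlock(&lock);
        return queued;
    }

    /* Dequeue thread: wind all FOPs whose bucket can now pay. */
    void *dequeue_thread(void *arg)
    {
        (void)arg;
        pthread_mutex_lock(&lock);
        for (;;) {
            pthread_cond_wait(&refill, &lock);
            for (bucket_t *b = buckets; b; b = b->next) {
                while (b->head && b->tokens >= b->head->cost) {
                    fop_t *f = b->head;
                    b->head = f->next;
                    if (!b->head)
                        b->tail = NULL;
                    b->tokens -= f->cost;
                    wind_fop(f);
                }
            }
        }
        return NULL;
    }

Winding under the lock keeps the sketch short; a real implementation
would move the ready FOPs to a local list and wind them after dropping
the lock.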