On Tue, 2019-07-02 at 17:24 +0200, Dan van der Ster wrote:
> Hi,
>
> Are there any plans to implement a per-client throttle on mds client requests?
>
> We just had an interesting case where a new cephfs user was hammering
> an mds from several hosts. In the end we found that their code was
> doing:
>
>   while d=getafewbytesofdata():
>     f=open(file.dat)
>     f.append(d)
>     f.close()
>
> By changing their code to:
>
>   f=open(file.dat)
>   while d=getafewbytesofdata():
>     f.append(d)
>   f.close()
>
> it completely removes their load on the mds (for obvious reasons).
>
> In a multi-user environment it's hard to scrutinize every user's
> application, so we'd prefer to just throttle down the client req rates
> (and let them suffer from the poor performance).
>
> Thoughts?
>
> (cc'ing Xuehan)

It sounds like a reasonable thing to do at first glance. There was a
patchset recently by Xuehan Xu to add a new io controller policy for
cephfs, but fwiw that was more focused on OSD ops issued on behalf of
cephfs clients, which is not quite what you're asking about.

The challenge with all of these sorts of throttling schemes is how to
parcel things out to individual clients. MDS/OSD ops are not a discrete
resource, and it's difficult to gauge how much to allocate to each
client.

I think if we were going to do something along these lines, it'd be
good to work out how you'd throttle both MDS and OSD ops to keep a lid
on things. That said, this is not a trivial problem to tackle, IMO.

Some questions to get you started should you choose to pursue this:

- Will you throttle these ops at the MDS or on the clients? Ditto for
  the OSDs...

- How will it work? Will there be a fixed cap of some sort for a given
  amount of time, or are you more looking to just delay processing ops
  for a single client when it's "too busy"?

- If you're thinking of something more like a cgroup, how will you
  determine how large a pool of operations you have to parcel out to
  each client?
  If you've parceled out 100% of your MDS ops budget, how will you
  rebalance things when clients are added to or removed from the
  cluster?

- If a client is holding a file lock, then throttling it could delay
  it releasing locks, and that could slow down other (mostly idle)
  clients that are contending for it. Do we care? How will we deal
  with that situation if so?

- Would you need separate tunables for OSD and MDS ops, or is there
  some way to tune both under a single knob?

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
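
One way to picture the "fixed cap vs. delay" question above is a
per-client token bucket. The sketch below is illustrative only: it is
not Ceph code, TokenBucket, ClientThrottle and admit() are hypothetical
names, and the rate/burst values are made up. Each client gets its own
bucket, and a request that finds the bucket empty is not rejected but
told how long it should wait.

```python
import time
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Budget for one client: `rate` requests/second, bursts up to `burst`."""
    rate: float
    burst: float
    tokens: float
    last: float

    def delay_for(self, now: float) -> float:
        """Charge one request; return seconds it should wait (0.0 = run now)."""
        # Refill for the time elapsed since the previous request, capped at burst.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        self.tokens -= 1.0
        # A negative balance is debt: wait until the refill would cover it.
        return max(0.0, -self.tokens / self.rate)


class ClientThrottle:
    """Hypothetical server-side throttle keyed by client id (assumption)."""

    def __init__(self, rate: float, burst: float, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.buckets = {}

    def admit(self, client_id: str) -> float:
        now = self.clock()
        bucket = self.buckets.get(client_id)
        if bucket is None:
            # New clients start with a full bucket.
            bucket = TokenBucket(self.rate, self.burst, tokens=self.burst, last=now)
            self.buckets[client_id] = bucket
        return bucket.delay_for(now)
```

With rate=2 and burst=2, a client that fires three requests at the same
instant gets the first two admitted immediately and the third delayed
by 0.5 s, while a different client is unaffected. Giving every client
the same fixed rate sidesteps the rebalancing question rather than
answering it, and delaying a client that holds a file lock still has
the knock-on effect described above.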