Hello, On Wed, Feb 09, 2022 at 05:14:27PM +0800, Ming Lei wrote: > Chunguang Xu has already observed this issue and fixed it in commit > 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO scenarios"). > However, that patch only covers bio splitting in __blk_queue_split(), and > we have other kind of bio splitting, such as bio_split() & > submit_bio_noacct() and other ways. I see. So, we can go for adding split handling to all other places or try to find a common spot. > This patch tries to fix the issue in one generic way by always charging > the bio for iops limit in blk_throtl_bio(). This way is reasonable: > re-submission & fast-cloned bio is charged if it is submitted to same > disk/queue, and BIO_THROTTLED will be cleared if bio->bi_bdev is changed. > > This new approach can get much more smooth/stable iops limit compared with > commit 4f1e9630afe6 ("blk-throtl: optimize IOPS throttle for large IO > scenarios") since that commit can't throttle current split bios actually. > > Also this way won't cause new double bio iops charge in > blk_throtl_dispatch_work_fn() in which blk_throtl_bio() won't be called > any more. But yeah, this should work too. This is simpler but more fragile given the twisted history around BIO_THROTTLED. I think the root cause of the convolution is that it's hooked at the wrong spot - it's sitting on top of the queue trying to guess what actually happens to the bios it sent down. Ideally, we probably wanna move this to rq-qos hooks which sit on the actual issue-to-the-device path. For now, I don't have a strong preference. This looks fine to me too. Please feel free to add Acked-by: Tejun Heo <tj@xxxxxxxxxx> for the blk-throtl patches. Thanks. -- tejun