On Wed, Nov 20, 2019 at 06:50:02PM -0700, Sumanesh Samanta wrote:
> >>You just see large IO size from driver side or device side, and do you
> >>know why the big size IO is submitted to driver? Block layer's IO merge
> >>contributes a lot for that, and IO merge usually starts to work
>
> May be it contribute to some extent, but I do not think streaming
> applications have any incentive/reason to give small IO. An
> application like Netflix need to read as much data as soon as possible
> and serve to customers, they have no reason to read in small chunks.
> In fact, they read in huge chunks.
> That is why sequential IO is normally large chunks and random IO (
> which is more DB kind of operations ) is small IO.
> Only exception I know of is database REDO logs, that are small
> sequential IO, because there the DB is logging small transactions --
> but they go to SSDs.

We can't cover all typical workloads here, and I can easily write an
application that generates such sequential IO.

Even if it is an unusual workload or application, someone may still
report it as a regression, so we can't risk bypassing .device_busy for HDD.

>
> >>Yeah, that is why my patches just bypass sdev->device_busy for SSD, and
> >>looks you misunderstood the idea behind the patches, right?
>
> No, I got the idea, I am just saying most high end controllers have an
> IO size limit , and even if the block layer merges IO, it does not
> help, since they have to be broken to the max size the controller
> supports. Also, most high end controllers have their own merging
> logic, and hence not too much dependent on upper layer merging for
> them

If the controller's max size is exposed to the block layer, the block layer
will build IOs of a proper size for the controller. I believe all sane
drivers do that.

Anyway, using the per-LUN NONROT flag is flexible and reasonable, and it
won't need any driver changes.

Thanks,
Ming