On 12/2/20 1:03 PM, Mike Snitzer wrote:
> What you've done here is fairly chaotic/disruptive:
> 1) you emailed a patch out that isn't needed or ideal, I'd already
> staged a DM fix in linux-next for 5.10-rcX, see:
> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.10-rcX&id=f28de262ddf09b635095bdeaf0e07ff507b3c41b

Fine. I indeed didn't follow linux-dm.git. Sorry, that's my fault.

> 2) you replied to your patch and started referencing snippets of this
> other patch's header (now staged for 5.10-rcX via Jens' block tree):
> https://git.kernel.dk/cgit/linux-block/commit/?h=block-5.10&id=7e7986f9d3ba69a7375a41080a1f8c8012cb0923
> - why not reply to _that_ patch in response to something stated in it?

I just wanted to say everything in a single email, which in retrospect
was obviously improper.

> 3) you started telling me, and others on these lists, why you think I
> used lcm_not_zero().
> - reality is I wanted gcd() behavior, I just didn't reason through
> the math to know it.. it was a stupid oversight on my part. Not
> designed with precision.
> 4) Why not check with me before you craft a patch like others reported
> the problem to you? I know it's logical to follow the chain of
> implications based on one commit and see where else there might be
> gaps but... it is strange to just pick up someone else's work like
> that.
>
> All just _seems_ weird and overdone. This isn't the kind of help I
> need. That said, I _do_ appreciate you looking at making blk IO polling
> work with bio-based (and DM's bio splitting in particular), but the
> lack of importance you put on DM's splitting below makes me concerned.

Though I noticed the discussion of this series yesterday, I didn't read
it thoroughly until today. When I noticed what looked like one remaining
issue (I now know it is not one), the patch in question, commit
22ada802ede8, had already been taken by Jens, so I sent out a patch of
my own. If Jens hadn't replied, I would just have replied under your
mail. That's it.

I have to admit that I got excited when I realized I could send a patch,
but doing so was improper and caused a misunderstanding. I apologize if
I did wrong.

> On Tue, Dec 01 2020 at 10:57pm -0500,
> JeffleXu <jefflexu@xxxxxxxxxxxxxxxxx> wrote:
>
>> Actually, in terms of this issue, I think the dilemma here is that
>> @chunk_sectors of a dm device mainly comes from two sources.
>>
>> One is the underlying devices, whose limits are combined into one
>> composed limit in blk_stack_limits().
>>
>>> commit 22ada802ede8 ("block: use lcm_not_zero() when stacking
>>> chunk_sectors") broke chunk_sectors limit stacking. chunk_sectors must
>>> reflect the most limited of all devices in the IO stack.
>>>
>>> Otherwise malformed IO may result. E.g.: prior to this fix,
>>> ->chunk_sectors = lcm_not_zero(8, 128) would result in
>>> blk_max_size_offset() splitting IO at 128 sectors rather than the
>>> required more restrictive 8 sectors.
>>
>> For this part, technically I can't agree that 'chunk_sectors must
>> reflect the most limited of all devices in the IO stack'. Even if the
>> dm device advertises a chunk_sectors of 128K when the limits of the two
>> underlying devices are 8K and 128K, and thus splitting is not done in
>> the dm device phase, the underlying devices will split by themselves.
>
> DM targets themselves _do_ require their own splitting. You cannot just
> assume all IO that passes through DM targets doesn't need to be properly
> sized on entry. Sure, underlying devices will do their own splitting,
> but those splits are based on their requirements. DM targets have their
> own IO size limits too. Each layer needs to enforce and respect the
> constraints of its layer while also factoring in those of the underlying
> devices.

Got it. Thanks.
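To convince myself that I now have the boundary math right, I wrote a
throwaway userspace sketch (my own illustration, not kernel code; gcd(),
lcm() and max_io_at() below are local helpers, with the modulo math only
mimicking what blk_max_size_offset() does with chunk_sectors):

/*
 * Userspace sketch, not kernel code: chunk_sectors == C means "no bio
 * may cross a C-aligned boundary".  A stacked stride T satisfies an
 * underlying device with stride c only if T divides c, so the combined
 * value must be the gcd() of all strides -- lcm() is exactly backwards.
 */
#include <stdio.h>

static unsigned int gcd(unsigned int a, unsigned int b)
{
	while (b) {
		unsigned int t = a % b;
		a = b;
		b = t;
	}
	return a;
}

static unsigned int lcm(unsigned int a, unsigned int b)
{
	return a / gcd(a, b) * b;
}

/* Largest IO (in sectors) allowed at @offset for a given chunk stride. */
static unsigned int max_io_at(unsigned int chunk_sectors, unsigned int offset)
{
	return chunk_sectors - (offset % chunk_sectors);
}

int main(void)
{
	unsigned int a = 8, b = 128;	/* strides of two underlying devices */

	/*
	 * lcm stacking gives stride 128: a 128-sector bio at offset 0 is
	 * passed down whole and crosses fifteen 8-sector boundaries of
	 * the first device -- malformed IO.
	 */
	printf("lcm: stride %3u, max IO at offset 0 = %3u sectors\n",
	       lcm(a, b), max_io_at(lcm(a, b), 0));

	/*
	 * gcd stacking gives stride 8: every split lands on an 8-sector
	 * boundary, and every 128-sector boundary is also an 8-sector
	 * boundary, so both devices' constraints hold.
	 */
	printf("gcd: stride %3u, max IO at offset 0 = %3u sectors\n",
	       gcd(a, b), max_io_at(gcd(a, b), 0));

	return 0;
}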
>>> @@ -547,7 +547,10 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
>>>
>>> 	t->io_min = max(t->io_min, b->io_min);
>>> 	t->io_opt = lcm_not_zero(t->io_opt, b->io_opt);
>>> -	t->chunk_sectors = lcm_not_zero(t->chunk_sectors, b->chunk_sectors);
>>> +
>>> +	/* Set non-power-of-2 compatible chunk_sectors boundary */
>>> +	if (b->chunk_sectors)
>>> +		t->chunk_sectors = gcd(t->chunk_sectors, b->chunk_sectors);
>>
>> This may introduce a regression.
>
> Regression relative to what? 5.10 was the regression point. The commit
> header you pasted into your reply clearly conveys that commit
> 22ada802ede8 caused the regression. It makes no sense to try to create
> some other regression point. You cannot have both from a single commit
> in the most recent Linux 5.10 release.
>
> And so I have no idea why you think that restoring DM's _required_
> splitting constraints is somehow a regression.

I mistakenly missed that all these changes were introduced in v5.10.
Sorry for that.

>> Suppose the @chunk_sectors limits of
>> two underlying devices are 8K and 128K; then @chunk_sectors of the dm
>> device is 8K after the fix. So even when a 128K sized bio is actually
>> redirected to the underlying device with the 128K @chunk_sectors limit,
>> this 128K sized bio will actually be split into 16 split bios, each 8K
>> sized. Obviously it is excessive splitting. And I think this is actually
>> why lcm_not_zero(a, b) was used originally.
>
> No. Not excessive splitting, required splitting. And as I explained in
> point 2) above, avoiding "excessive splits" isn't why lcm_not_zero() was
> improperly used to stack chunk_sectors.

This is indeed a difference between 5.9 and 5.10. In 5.10 there may be
more small split bios, since the smaller chunk_sectors is also applied
to the underlying device with the larger chunk_sectors (that is, the
underlying device with 128K chunk_sectors). I cannot say that more small
split bios will cause worse performance, since I have not tested it.

> Some DM targets really do require the IO be split on specific boundaries
> -- however inconvenient for the underlying layers that DM splitting
> might be.
>
>> The other source is the dm device itself. A DM device can set
>> @max_io_len through ->io_hints(), and then set @chunk_sectors from
>> @max_io_len.
>
> ti->max_io_len should always be set in the DM target's .ctr

Yes, I misremembered it.
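To check my understanding, here is a minimal sketch of a target
constructor doing that (illustrative only, not a real target:
example_ctr and EXAMPLE_CHUNK_SECTORS are made-up names; as far as I
can tell dm_set_target_max_io_len() is the actual DM core helper):

/*
 * Minimal sketch, not a real target: the constructor advertises the
 * largest IO (in sectors) the target can handle per map operation.
 */
#include <linux/device-mapper.h>

#define EXAMPLE_CHUNK_SECTORS 16	/* 8K in 512-byte sectors */

static int example_ctr(struct dm_target *ti, unsigned int argc, char **argv)
{
	int r;

	/* DM core folds ti->max_io_len into the queue's chunk_sectors. */
	r = dm_set_target_max_io_len(ti, EXAMPLE_CHUNK_SECTORS);
	if (r) {
		ti->error = "chunk size too large";
		return r;
	}

	return 0;
}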
> DM core takes care of applying max_io_len to chunk_sectors since 5.10;
> you should know that, given your patch is meant to fix commit
> 882ec4e609c1
>
> And for 5.11 I've staged a change to have it impose max_io_len in terms
> of ->max_sectors too, see:
> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.11&id=41dcb8f21a86edbe409b2bef9bb1df4cb9d66858

Thanks.

> One thing I clearly need to do moving forward is: always post my changes
> to dm-devel; just so someone like yourself can follow along via email
> client. I just assumed others who care about DM changes also track the
> linux-dm.git tree's branches. Clearly not the best assumption or
> practice on my part.

I used Linus' tree as my code base, which seems improper...

>> This part is actually where 'chunk_sectors must reflect the most
>> limited of all devices in the IO stack' is true, and we have to apply
>> the strictest limitation here. This is actually what the following
>> patch does.
>
> There is a very consistent and deliberate way that device limits must
> be handled; sometimes I too have missteps, but that doesn't change the
> fact that there is a deliberate evenness to how limits are stacked.
> blk_stack_limits() needs to be the authority on how these limits stack
> up. So all of DM's limits stacking wraps calls to it. My fix, shared in
> point 1) above, restores that design pattern by _not_ having DM
> duplicate a subset of how blk_stack_limits() does its stacking.
>
> Mike

-- 
Thanks,
Jeffle