On Tue, Jun 28, 2022 at 12:13:06PM -0600, Jens Axboe wrote: > On 6/27/22 10:20 PM, Kent Overstreet wrote: > > On Mon, Jun 27, 2022 at 03:36:22PM +0800, Ming Lei wrote: > >> On Sun, Jun 26, 2022 at 04:14:58PM -0400, Kent Overstreet wrote: > >>> On Fri, Jun 24, 2022 at 10:12:52PM +0800, Ming Lei wrote: > >>>> Commit 7759eb23fd98 ("block: remove bio_rewind_iter()") removes > >>>> the similar API because the following reasons: > >>>> > >>>> ``` > >>>> It is pointed that bio_rewind_iter() is one very bad API[1]: > >>>> > >>>> 1) bio size may not be restored after rewinding > >>>> > >>>> 2) it causes some bogus change, such as 5151842b9d8732 (block: reset > >>>> bi_iter.bi_done after splitting bio) > >>>> > >>>> 3) rewinding really makes things complicated wrt. bio splitting > >>>> > >>>> 4) unnecessary updating of .bi_done in fast path > >>>> > >>>> [1] https://marc.info/?t=153549924200005&r=1&w=2 > >>>> > >>>> So this patch takes Kent's suggestion to restore one bio into its original > >>>> state via saving bio iterator(struct bvec_iter) in bio_integrity_prep(), > >>>> given now bio_rewind_iter() is only used by bio integrity code. > >>>> ``` > >>>> > >>>> However, it isn't easy to restore bio by saving 32 bytes bio->bi_iter, and saving > >>>> it only can't restore crypto and integrity info. > >>>> > >>>> Add bio_rewind() back for some use cases which may not be same with > >>>> previous generic case: > >>>> > >>>> 1) most of bio has fixed end sector since bio split is done from front of the bio, > >>>> if driver just records how many sectors between current bio's start sector and > >>>> the bio's end sector, the original position can be restored > >>>> > >>>> 2) if one bio's end sector won't change, usually bio_trim() isn't called, user can > >>>> restore original position by storing sectors from current ->bi_iter.bi_sector to > >>>> bio's end sector; together by saving bio size, 8 bytes can restore to > >>>> original bio. > >>>> > >>>> 3) dm's requeue use case: when BLK_STS_DM_REQUEUE happens, dm core needs to > >>>> restore to the original bio which represents current dm io to be requeued. > >>>> By storing sectors to the bio's end sector and dm io's size, > >>>> bio_rewind() can restore such original bio, then dm core code needn't to > >>>> allocate one bio beforehand just for handling BLK_STS_DM_REQUEUE which > >>>> is actually one unusual event. > >>>> > >>>> 4) Not like original rewind API, this one needn't to add .bi_done, and no any > >>>> effect on fast path > >>> > >>> It seems like perhaps the real issue here is that we need a real bio_iter, > >>> separate from bvec_iter, that also encapsulates iterating over integrity & > >>> fscrypt. > >> > >> Not mention bio_iter, bvec_iter has been 32 bytes, which is too big to > >> hold in per-io data structure. With this patch, 8bytes is enough > >> to rewind one bio if the end sector is fixed. > > > > Hold on though, does that check out? Why is that too big for per IO data > > structures? > > > > By definition these structures are only for IOs in flight, and we don't _want_ > > there to ever be very many of these or we're going to run into latency issues > > due to queue depth. > > It's much less about using whatever amount of memory for inflight IO, > and much more about not bloating fast path structures (of which the bio > is certainly one). All of this gunk has to be initialized for each IO, > and that's the real issue. Can't agree more, especially the initialization is just for the unusual DM_REQUEUE event(bio rewind is needed). Thanks, Ming