On 6/27/22 10:20 PM, Kent Overstreet wrote: > On Mon, Jun 27, 2022 at 03:36:22PM +0800, Ming Lei wrote: >> On Sun, Jun 26, 2022 at 04:14:58PM -0400, Kent Overstreet wrote: >>> On Fri, Jun 24, 2022 at 10:12:52PM +0800, Ming Lei wrote: >>>> Commit 7759eb23fd98 ("block: remove bio_rewind_iter()") removes >>>> the similar API because the following reasons: >>>> >>>> ``` >>>> It is pointed that bio_rewind_iter() is one very bad API[1]: >>>> >>>> 1) bio size may not be restored after rewinding >>>> >>>> 2) it causes some bogus change, such as 5151842b9d8732 (block: reset >>>> bi_iter.bi_done after splitting bio) >>>> >>>> 3) rewinding really makes things complicated wrt. bio splitting >>>> >>>> 4) unnecessary updating of .bi_done in fast path >>>> >>>> [1] https://marc.info/?t=153549924200005&r=1&w=2 >>>> >>>> So this patch takes Kent's suggestion to restore one bio into its original >>>> state via saving bio iterator(struct bvec_iter) in bio_integrity_prep(), >>>> given now bio_rewind_iter() is only used by bio integrity code. >>>> ``` >>>> >>>> However, it isn't easy to restore bio by saving 32 bytes bio->bi_iter, and saving >>>> it only can't restore crypto and integrity info. >>>> >>>> Add bio_rewind() back for some use cases which may not be same with >>>> previous generic case: >>>> >>>> 1) most of bio has fixed end sector since bio split is done from front of the bio, >>>> if driver just records how many sectors between current bio's start sector and >>>> the bio's end sector, the original position can be restored >>>> >>>> 2) if one bio's end sector won't change, usually bio_trim() isn't called, user can >>>> restore original position by storing sectors from current ->bi_iter.bi_sector to >>>> bio's end sector; together by saving bio size, 8 bytes can restore to >>>> original bio. >>>> >>>> 3) dm's requeue use case: when BLK_STS_DM_REQUEUE happens, dm core needs to >>>> restore to the original bio which represents current dm io to be requeued. >>>> By storing sectors to the bio's end sector and dm io's size, >>>> bio_rewind() can restore such original bio, then dm core code needn't to >>>> allocate one bio beforehand just for handling BLK_STS_DM_REQUEUE which >>>> is actually one unusual event. >>>> >>>> 4) Not like original rewind API, this one needn't to add .bi_done, and no any >>>> effect on fast path >>> >>> It seems like perhaps the real issue here is that we need a real bio_iter, >>> separate from bvec_iter, that also encapsulates iterating over integrity & >>> fscrypt. >> >> Not mention bio_iter, bvec_iter has been 32 bytes, which is too big to >> hold in per-io data structure. With this patch, 8bytes is enough >> to rewind one bio if the end sector is fixed. > > Hold on though, does that check out? Why is that too big for per IO data > structures? > > By definition these structures are only for IOs in flight, and we don't _want_ > there to ever be very many of these or we're going to run into latency issues > due to queue depth. It's much less about using whatever amount of memory for inflight IO, and much more about not bloating fast path structures (of which the bio is certainly one). All of this gunk has to be initialized for each IO, and that's the real issue. Just look at the recent work for iov_iter and why ITER_UBUF makes sense to do. This is not a commentary on this patchset, just a general observation. Sizes of hot data structures DO matter, and quite a bit so. -- Jens Axboe