On Fri, Sep 23, 2022 at 05:22:17AM +0100, Al Viro wrote:
> > Add an iov_iter_unpin_pages that does the right thing based on the
> > type.  (block will need a modified copy of it as it doesn't keep
> > the pages array around, but logic will be the same).
>
> Huh?  You want to keep the type (+ direction) of iov_iter in any structure
> a page reference coming from iov_iter_get_pages might end up in?  IDGI...

Why would I?  We generally do have or should have the iov_iter around.
And for the common case where we don't (bios) we can carry that
information in the bio as it needs a special unmap helper anyway.
But if you don't want to use the iov_iter for some reason, we'll just
need to condense the information to a flags variable and then pass that.

>
> BTW, speaking of lifetime rules - am I right assuming that fd_execute_rw()
> does IO on pages of the scatterlist passed to it?

Yes.

> Where are they getting
> dropped and what guarantees that IO is complete by that point?

The exact place depends on the exact target frontend, of which we have a
few.  But it happens from the end_io callback that is triggered through
a call to target_complete_cmd.

> The reason I'm asking is that here you have an ITER_BVEC possibly fed to
> __blkdev_direct_IO_async(), with its
>         if (iov_iter_is_bvec(iter)) {
>                 /*
>                  * Users don't rely on the iterator being in any particular
>                  * state for async I/O returning -EIOCBQUEUED, hence we can
>                  * avoid expensive iov_iter_advance(). Bypass
>                  * bio_iov_iter_get_pages() and set the bvec directly.
>                  */
>                 bio_iov_bvec_set(bio, iter);
> which does *not* grab the page references.  Sure, bio_release_pages() knows
> to leave those alone and doesn't drop anything.  However, what is the
> mechanism preventing the pages getting freed before the IO completion
> in this case?

The contract is that callers of bvec iters need to hold their own
references, as without that doing I/O to them would be unsafe.
If they did not hold references, the pages could go away before even
calling bio_iov_iter_get_pages (or this open coded bio_iov_bvec_set).
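
To make the contract concrete, here is a rough userspace model of the
two submission paths.  It is only a sketch and not kernel code: every
model_* name is made up for illustration, the "page" is just a counter
plus a freed flag, and completion merely reports whether the page was
still alive where real code would hit a use-after-free.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: a refcount and a freed flag. */
struct model_page {
	int refcount;
	bool freed;
};

/* Hypothetical stand-in for a bio carrying a single page. */
struct model_bio {
	struct model_page *page;
	bool owns_ref;	/* false on the bvec shortcut path */
};

static void model_get_page(struct model_page *p)
{
	assert(!p->freed);
	p->refcount++;
}

static void model_put_page(struct model_page *p)
{
	assert(!p->freed && p->refcount > 0);
	if (--p->refcount == 0)
		p->freed = true;
}

/* Models the bvec shortcut: record the page, take no reference. */
static void model_submit_bvec(struct model_bio *bio, struct model_page *p)
{
	bio->page = p;
	bio->owns_ref = false;
}

/* Models the get-pages path: the bio takes its own reference. */
static void model_submit_pinned(struct model_bio *bio, struct model_page *p)
{
	model_get_page(p);
	bio->page = p;
	bio->owns_ref = true;
}

/*
 * Models I/O completion.  Returns whether the page was still alive, and
 * drops only a reference that the bio itself took.
 */
static bool model_complete(struct model_bio *bio)
{
	bool alive = !bio->page->freed;

	if (alive && bio->owns_ref)
		model_put_page(bio->page);
	return alive;
}

/* bvec path, caller keeps its reference until after completion: safe. */
bool model_io_caller_holds_ref(void)
{
	struct model_page page = { .refcount = 1, .freed = false };
	struct model_bio bio;
	bool alive;

	model_submit_bvec(&bio, &page);
	alive = model_complete(&bio);
	model_put_page(&page);		/* caller drops its ref afterwards */
	return alive;
}

/* bvec path, caller drops its reference before completion: unsafe. */
bool model_io_caller_drops_early(void)
{
	struct model_page page = { .refcount = 1, .freed = false };
	struct model_bio bio;

	model_submit_bvec(&bio, &page);
	model_put_page(&page);		/* page is gone before the I/O ends */
	return model_complete(&bio);
}

/* pinned path: the bio's own reference keeps the page alive. */
bool model_io_pinned_caller_drops_early(void)
{
	struct model_page page = { .refcount = 1, .freed = false };
	struct model_bio bio;

	model_submit_pinned(&bio, &page);
	model_put_page(&page);
	return model_complete(&bio);
}
```

In this model only the second scenario completes against a freed page,
which is the case the contract rules out: on the bvec path nothing but
the caller's own reference keeps the page around until completion.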