On Tue, Sep 20, 2022 at 02:08:02PM +0200, Christoph Hellwig wrote:
>> -static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
>> +static struct bio *bio_map_get(struct request *rq, unsigned int nr_vecs,
>> 		gfp_t gfp_mask)
>
> bio_map_get is a very confusing name.
I chose that name because its functionality is the opposite of what the
existing bio_map_put helper does; that way the two names are symmetric.
> And I also still think this is
> the wrong way to go. If plain slab allocations don't use proper
> per-cpu caches we have a MM problem and need to talk to the slab
> maintainers and not use the overkill bio_set here.
This series is not about using (or not using) a bio_set. The attempt
here is to reuse the pre-mapped buffers (and bvecs) that we get from
io_uring.
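(For context, here is roughly the allocation choice under debate,
modeled on the existing bio_map_user_iov pattern; the REQ_POLLED
condition and the exact flags are my reconstruction, not necessarily
the patch verbatim:)

	static struct bio *bio_map_get(struct request *rq, unsigned int nr_vecs,
				       gfp_t gfp_mask)
	{
		struct bio *bio;

		if (rq->cmd_flags & REQ_POLLED) {
			/* bio_set path: per-cpu cached allocation */
			blk_opf_t opf = rq->cmd_flags | REQ_ALLOC_CACHE;

			bio = bio_alloc_bioset(NULL, nr_vecs, opf, gfp_mask,
					       &fs_bio_set);
			if (!bio)
				return NULL;
		} else {
			/*
			 * Plain slab path, which the objection above says
			 * should be fast enough on its own.
			 */
			bio = bio_kmalloc(nr_vecs, gfp_mask);
			if (!bio)
				return NULL;
			bio_init(bio, NULL, bio->bi_inline_vecs, nr_vecs,
				 req_op(rq));
		}
		return bio;
	}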
>> +/* Prepare bio for passthrough IO given an existing bvec iter */
>> +int blk_rq_map_user_bvec(struct request *rq, struct iov_iter *iter)
>
> I'm a little confused about the interface we're trying to present from
> the block layer to the driver here.
>
> blk_rq_map_user_iov really should be able to detect that it is called
> on a bvec iter and just do the right thing rather than needing different
> helpers.
I explored that possibility too, but found that it does not. It maps
user pages into the bio either directly or via a copy (under certain
odd conditions), but it does not know how to deal with an existing
bvec. The reason, I guess, is that no one felt the need to try
passthrough with bvecs before; it makes sense only in the context of
io_uring passthrough.
And it felt cleaner to me to write a new function rather than overload
blk_rq_map_user_iov with multiple if/else branches. I tried that again
after your comment, but it did not seem to produce any good-looking
code.
The other factor is that this route seemed safer: I am more confident
that I will not break existing users of blk_rq_map_user_iov.
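For illustration, the kind of early dispatch being suggested might look
like the sketch below (untested; the placement inside
blk_rq_map_user_iov and the early return are my assumption):

	/*
	 * Sketch: let blk_rq_map_user_iov hand bvec iters to the new
	 * helper, so callers don't need to pick a function themselves.
	 * A bvec iter already carries mapped pages, so neither pinning
	 * nor the copy fallback applies.
	 */
	if (iov_iter_is_bvec(iter))
		return blk_rq_map_user_bvec(rq, iter);
	/* ... fall through to the existing direct-map / copy paths ... */

This much is easy; the ugliness shows up in the virt-boundary/copy
handling discussed below.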
>> +	/*
>> +	 * If the queue doesn't support SG gaps and adding this
>> +	 * offset would create a gap, disallow it.
>> +	 */
>> +	if (bvprvp && bvec_gap_to_prev(lim, bvprvp, bv->bv_offset))
>> +		goto out_err;
>
> So now you limit the input that is accepted? That's not really how
> iov_iters are used. We can either try to reshuffle the bvecs, or
> just fall back to the copy data version as blk_rq_map_user_iov does
> for 'weird' iters.
Since I was writing a 'new' helper for passthrough only, I thought it
would not be too bad to just bail out (rather than handle it with a
copy) if we hit this queue_virt_boundary situation.
To handle it the 'copy data' way, we would need this:
 585	else if (queue_virt_boundary(q))
 586		copy = queue_virt_boundary(q) & iov_iter_gap_alignment(iter);
 587
But iov_iter_gap_alignment does not work on bvec iters; see line #1274 below:
1264 unsigned long iov_iter_gap_alignment(const struct iov_iter *i)
1265 {
1266 	unsigned long res = 0;
1267 	unsigned long v = 0;
1268 	size_t size = i->count;
1269 	unsigned k;
1270
1271 	if (iter_is_ubuf(i))
1272 		return 0;
1273
1274 	if (WARN_ON(!iter_is_iovec(i)))
1275 		return ~0U;
Do you see a way to overcome this? Or maybe this can be revisited
later, as we are not missing a lot?
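(To make the discussion concrete, a hypothetical bvec counterpart of
the loop above could look like the following; bvec_iter_gap_alignment
is a made-up name, and whether bv_offset is an adequate stand-in for
the segment's virtual base is exactly the open question:)

	/* hypothetical: gap alignment for bvec iters, mirroring the iovec loop */
	static unsigned long bvec_iter_gap_alignment(const struct iov_iter *i)
	{
		const struct bio_vec *bvec = i->bvec;
		unsigned long res = 0;
		unsigned long v = 0;
		size_t size = i->count;
		unsigned k;

		for (k = 0; k < i->nr_segs; k++) {
			if (!bvec[k].bv_len)
				continue;
			/* in-page offset stands in for the segment base */
			if (v)
				res |= bvec[k].bv_offset | v;
			v = bvec[k].bv_offset + bvec[k].bv_len;
			if (size <= bvec[k].bv_len)
				break;
			size -= bvec[k].bv_len;
		}
		return res;
	}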
>> +
>> +	/* check full condition */
>> +	if (nsegs >= nr_segs || bytes > UINT_MAX - bv->bv_len)
>> +		goto out_err;
>> +
>> +	if (bytes + bv->bv_len <= nr_iter &&
>> +	    bv->bv_offset + bv->bv_len <= PAGE_SIZE) {
>> +		nsegs++;
>> +		bytes += bv->bv_len;
>> +	} else
>> +		goto out_err;
> Nit: This would read much better as:
>
> 	if (bytes + bv->bv_len > nr_iter)
> 		goto out_err;
> 	if (bv->bv_offset + bv->bv_len > PAGE_SIZE)
> 		goto out_err;
>
> 	bytes += bv->bv_len;
> 	nsegs++;
Indeed, cleaner. Thanks.
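(Folded together, the per-bvec checks would then read as follows; this
just combines the gap check and overflow check quoted above with your
restructuring:)

	if (bvprvp && bvec_gap_to_prev(lim, bvprvp, bv->bv_offset))
		goto out_err;
	if (nsegs >= nr_segs || bytes > UINT_MAX - bv->bv_len)
		goto out_err;
	if (bytes + bv->bv_len > nr_iter)
		goto out_err;
	if (bv->bv_offset + bv->bv_len > PAGE_SIZE)
		goto out_err;

	bytes += bv->bv_len;
	nsegs++;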