On Mon, Sep 13, 2021 at 3:43 PM Jens Axboe <axboe@xxxxxxxxx> wrote: > > Al, Linus, are you OK with this? I think we should get this in for 5.15. > I didn't resend the whole series, just a v2 of patch 1/3 to fix that bvec > vs iovec issue. Let me know if you want the while thing resent. So I'm ok with the iov_iter side, but the io_uring side seems still positively buggy, and very confused. It also messes with the state in bad ways and has internal knowledge. And some of it looks completely bogus. For example, I see state->count -= ret; rw->bytes_done += ret; and I go "that's BS". There's no way it's ok to start messing with the byte count inside the state like that. That just means that the state is now no longer the saved state, and it's some random garbage. I also think that the "bytes_done += ret" is a big hint there: any time you restore the iovec state, you should then forward it by "bytes_done". But that's not what the code does. Instead, it will now restore the iovec styate with the *wrong* number of bytes remaining, but will start from the beginning of the iovec. So I think the fs/io_uring.c use of this state buffer is completely wrong. What *may* be the right thing to do is to (a) not mess with state->count (b) when you restore the state you always use iov_iter_restore(iter, state, bytes_done); to actually restore the *correct* state. Because modifying the iovec save state like that cannot be right, and if it's right it's still too ugly and fragile for words. That save state should be treated as a snapshot, not as a random buffer that you can make arbitrary changes to. See what I'm saying? I'd like Al to take a look at the io_uring.c usage too, since this was just my reaction from looking at that diff a bit more. Linus