On 12/30/24 4:02 PM, Gabriel Krisman Bertazi wrote: > Jens Axboe <axboe@xxxxxxxxx> writes: > >> On 12/30/24 9:08 AM, Gabriel Krisman Bertazi wrote: >>> Jens Axboe <axboe@xxxxxxxxx> writes: >>> >>>> A previous commit mistakenly moved the clearing of the in-progress byte >>>> count into the section that's dependent on having a cached iovec or not, >>>> but it should be cleared for any IO. If not, then extra bytes may be >>>> added at IO completion time, causing potentially weird behavior like >>>> over-reporting the amount of IO done. >>> >>> Hi Jens, >>> >>> Sorry for the delay. I went completely offline during the christmas >>> week. >> >> No worries, sounds like a good plan! >> >>> Did this solve the sysbot report? I'm failing to understand how it can >>> happen. This could only be hit if the allocation returned a cached >>> object that doesn't have a free_iov, since any newly kmalloc'ed object >>> will have this field cleaned inside the io_rw_async_data_init callback. >>> But I don't understand where we can cache the rw object without having a >>> valid free_iov - it didn't seem possible to me before or now. >> >> Not sure I follow - you may never have a valid free_iov, it completely >> depends on whether or not the existing rw user needed to allocate an iov >> or not. > >> Hence it's indeed possible that there's a free_iov and the user >> doesn't need or use it, or the opposite of there not being one and the >> user then allocating one that persists. >> >> In any case, it's of course orthogonal to the issue here, which is that >> ->bytes_done must _always_ be initialized, it has no dependency on a >> free_iovec or not. Whenever someone gets an 'rw', it should be pristine >> in that sense. > > I see. In addition, I was actually confusing rw->free_iov_nr with > rw->bytes_done when writing my previous message. The first needs to > have a valid value if ->free_iov is valid. Thanks for the explanation > and making me review this code. Ah right, yes free_iov_nr would obviously need to be valid if free_iov is set. > The fix looks good to me now, obviously. Thanks for checking! -- Jens Axboe