On Thu, Aug 25, 2022 at 2:32 AM Stefan Hajnoczi <stefanha@xxxxxxxxxx> wrote: > > On Wed, Aug 24, 2022 at 10:16:10PM +0900, Kim Suwan wrote: > > Hi Stefan, > > > > On Wed, Aug 24, 2022 at 5:56 AM Stefan Hajnoczi <stefanha@xxxxxxxxxx> wrote: > > > > > > On Tue, Aug 23, 2022 at 11:50:05PM +0900, Suwan Kim wrote: > > > > @@ -409,6 +409,8 @@ static bool virtblk_add_req_batch(struct virtio_blk_vq *vq, > > > > virtblk_unmap_data(req, vbr); > > > > virtblk_cleanup_cmd(req); > > > > rq_list_add(requeue_list, req); > > > > + } else { > > > > + blk_mq_start_request(req); > > > > } > > > > > > The device may see new requests as soon as virtblk_add_req() is called > > > above. Therefore the device may complete the request before > > > blk_mq_start_request() is called. > > > > > > rq->io_start_time_ns = ktime_get_ns() will be after the request was > > > actually submitted. > > > > > > I think blk_mq_start_request() needs to be called before > > > virtblk_add_req(). > > > > > > > But if blk_mq_start_request() is called before virtblk_add_req() > > and virtblk_add_req() fails, it can trigger WARN_ON_ONCE() at > > virtio_queue_rq(). > > > > With regard to the race condition between virtblk_add_req() and > > completion, I think that the race condition can not happen because > > virtblk_add_req() holds vq lock with irq saving and completion side > > (virtblk_done, virtblk_poll) need to acquire the vq lock also. > > Moreover, virtblk_done() is irq context so I think it can not be > > executed until virtblk_add_req() releases the lock. > > I agree there is no race condition regarding the ordering of > blk_mq_start_request() and request completion. The spinlock prevents > that and I wasn't concerned about that part. > > The issue is that the timestamp will be garbage. If we capture the > timestamp during/after the request is executing, then the collected > statistics will be wrong. > > Can you look for another solution that doesn't break the timestamp? > > FWIW I see that the rq->state state machine allows returning to the idle > state once the request has been started: __blk_mq_requeue_request(). I considered blk_mq_requeue_request() to handle error cases but I didn't use it because I think it can make the error path request processing slower than requeuing an error request to plug list again. But there doesn't seem to be any other option that doesn't break the timestamp. As you said, I will use __blk_mq_requeue_request() and send new patch soon. To Alexandre, I will share new diff soon. Could you please test one more time? Regards, Suwan Kim