RE: [PATCH 2/5] ioengines: don't record issue_time if ioengines already do it

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> -----Original Message-----
> From: Nick Neumann [mailto:nick@xxxxxxxxxxxxxxxx]
> Sent: Wednesday, June 15, 2022 11:27 AM
> To: Vincent Fu <vincent.fu@xxxxxxxxxxx>
> Cc: axboe@xxxxxxxxx; fio@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH 2/5] ioengines: don't record issue_time if ioengines
> already do it
> 
> On Tue, Jun 14, 2022 at 10:59 AM Vincent Fu <vincent.fu@xxxxxxxxxxx>
> wrote:
> >
> > io_uring, io_uring_cmd, and libaio record issue_time inside the
> ioengine
> > code when their commit functions are called. So we don't need to
> record
> > issue_time again for these ioengines in td_io_queue.
> >
> > If we do fill issue_time twice, then mean(slat) + mean(clat) !=
> > mean(lat):
> 
> I'm a little bit confused though why not recording issue_time again in
> td_io_queue is considered the right fix. By having some ioengines that
> record issue_time in their commit functions, and others that do not
> (and thus have it recorded in td_io_queue), comparing clat between two
> different engines isn't an apples-to-apples comparison. An ioengine
> that does not record issue_time in its commit gets a "discount" of
> sorts in its measured clat, since its issue_time is not recorded until
> later. Does that make sense, or am I missing something? (Has been a
> few months since I've been in the code in detail, so that's possible.)

You make a good point and I'm glad you brought this up for discussion.

We currently have three latency pathways for async ioengines:

(1) async w/o commit: record issue_time in td_io_queue after queue
(2) async w/commit that *does not* record issue_time: records issue_time in
td_io_queue after queue (slat not reported)
(3) async w/commit that *does* record issue_time: issue_time recorded in
both td_io_queue and commit

I think the right long-term solution is to make the ioengines in (2) conform to
(1) or (3). Then we can have fair comparisons. But what 'queue' and 'commit'
mean for them doesn't completely match what the two mean for libaio and
io_uring.

For libaio and io_uring 'queue' prepares an IO and 'commit' actually sends it
to the OS. So it's appropriate to record issue_time after commit. But, for
example, take a look at the nfs ioengine's commit:

https://github.com/axboe/fio/blob/master/engines/nfs.c#L285

nfs isn't doing what libaio and io_uring are doing with commit.

There are four ioengines that fall into category (2): ime_aio, nfs, null, and
cpp_null. It would take some time to test the changes but probably it would be
straightforward to make nfs conform to (1) and {null, cpp_null} conform to (3)
but I have no idea about ime_aio.

I think the current patch is an improvement over what we currently have.  Plus
it really bothers me that mean(slat) + mean(clat) != mean(lat) and this
resolves that issue.  If someone comes along to fix everything up then we can
get rid of FIO_ASYNCIO_SETS_ISSUE_TIME and instead of td_io_queue checking for
that flag before setting issue_time it can instead check for the presence of a
commit hook.





[Index of Archives]     [Linux Kernel]     [Linux SCSI]     [Linux IDE]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux SCSI]

  Powered by Linux