Re: [PATCH 2/5] ioengines: don't record issue_time if ioengines already do it

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jun 15, 2022 at 12:06 PM Vincent Fu <vincent.fu@xxxxxxxxxxx> wrote:
>
> > -----Original Message-----
> > From: Nick Neumann [mailto:nick@xxxxxxxxxxxxxxxx]
> > Sent: Wednesday, June 15, 2022 11:27 AM
> > To: Vincent Fu <vincent.fu@xxxxxxxxxxx>
> > Cc: axboe@xxxxxxxxx; fio@xxxxxxxxxxxxxxx
> > Subject: Re: [PATCH 2/5] ioengines: don't record issue_time if ioengines
> > already do it
> >
> > On Tue, Jun 14, 2022 at 10:59 AM Vincent Fu <vincent.fu@xxxxxxxxxxx>
> > wrote:
> > >
> > > io_uring, io_uring_cmd, and libaio record issue_time inside the
> > ioengine
> > > code when their commit functions are called. So we don't need to
> > record
> > > issue_time again for these ioengines in td_io_queue.
> > >
> > > If we do fill issue_time twice, then mean(slat) + mean(clat) !=
> > > mean(lat):
> >
> > I'm a little bit confused though why not recording issue_time again in
> > td_io_queue is considered the right fix. By having some ioengines that
> > record issue_time in their commit functions, and others that do not
> > (and thus have it recorded in td_io_queue), comparing clat between two
> > different engines isn't an apples-to-apples comparison. An ioengine
> > that does not record issue_time in its commit gets a "discount" of
> > sorts in its measured clat, since its issue_time is not recorded until
> > later. Does that make sense, or am I missing something? (Has been a
> > few months since I've been in the code in detail, so that's possible.)
>
> You make a good point and I'm glad you brought this up for discussion.
>
> We currently have three latency pathways for async ioengines:
>
> (1) async w/o commit: record issue_time in td_io_queue after queue
> (2) async w/commit that *does not* record issue_time: records issue_time in
> td_io_queue after queue (slat not reported)
> (3) async w/commit that *does* record issue_time: issue_time recorded in
> both td_io_queue and commit
>
> I think the right long-term solution is to make the ioengines in (2) conform to
> (1) or (3). Then we can have fair comparisons. But what 'queue' and 'commit'
> mean for them doesn't completely match what the two mean for libaio and
> io_uring.
>
> For libaio and io_uring 'queue' prepares an IO and 'commit' actually sends it
> to the OS. So it's appropriate to record issue_time after commit. But, for
> example, take a look at the nfs ioengine's commit:
>
> https://github.com/axboe/fio/blob/master/engines/nfs.c#L285
>
> nfs isn't doing what libaio and io_uring are doing with commit.
>
> There are four ioengines that fall into category (2): ime_aio, nfs, null, and
> cpp_null. It would take some time to test the changes but probably it would be
> straightforward to make nfs conform to (1) and {null, cpp_null} conform to (3)
> but I have no idea about ime_aio.
>
> I think the current patch is an improvement over what we currently have.  Plus
> it really bothers me that mean(slat) + mean(clat) != mean(lat) and this
> resolves that issue.  If someone comes along to fix everything up then we can
> get rid of FIO_ASYNCIO_SETS_ISSUE_TIME and instead of td_io_queue checking for
> that flag before setting issue_time it can instead check for the presence of a
> commit hook.

Thanks for the detailed explanation. I actually read your blog post
about how fio measures latency too, and that was quite helpful as
well.

You've definitely convinced me that your change is an improvement, and
that there isn't necessarily a one-size-fits-all solution because the
different engines do different kinds of things in commit.

The fact that, before your change, mean(slat)+mean(clat) != mean(lat)
is very surprising. I (and I imagine many others) just assumed it was,
and it definitely should be, especially with libaio and io_uring,
which I imagine are the main async engines used to test raw ssd
performance. I'm glad you caught that the sum was not equal to the
total, and grateful you figured out why and fixed it. Thanks again.



[Index of Archives]     [Linux Kernel]     [Linux SCSI]     [Linux IDE]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux SCSI]

  Powered by Linux