Re: [PATCH RFC 1/3] block/mq-deadline: Revert "block/mq-deadline: Fix the tag reservation code"

Zhiguo Niu <niuzhiguo84@xxxxxxxxx> · Wed, 11 Dec 2024 11:03:45 +0800



Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote on Wed, Dec 11, 2024 at 10:58:
>
> Hi,
>
> On 2024/12/11 10:38, Zhiguo Niu wrote:
> > Bart Van Assche <bvanassche@xxxxxxx> wrote on Wed, Dec 11, 2024 at 04:33:
> >>
> >> On 12/9/24 10:22 PM, Yu Kuai wrote:
> >>> First of all, are we in agreement that it's not acceptable to
> >>> sacrifice performance in the default scenario just to ensure
> >>> functional correctness when async_depth is set to 1?
> >>
> >> How much does this affect performance? If this affects performance
> >> significantly I agree that this needs to be fixed.
> >>
> >>> If so, following are the options that I can think of to fix this:
> >>>
> >>> 1) make async_depth read-only; if limiting async requests to 75% of
> >>> the tags hurts performance in some cases, the user can increase
> >>> nr_requests to prevent it.
> >>> 2) refactor elevator sysfs api, remove eq->sysfs_lock and replace it
> >>> with q->sysfs_lock, so deadline_async_depth_store() will be protected
> >>> against changing hctxs, and min_shallow_depth can be updated here.
> >>> 3) other options?
> >>
> >> Another option is to remove the ability to configure async_depth. If it
> >> is too much trouble to get the implementation right without causing
> >> regressions for existing workloads, one possibility is to remove support
> >> for restricting the number of asynchronous requests in flight.
> > Hi Bart,
> > I think restricting asynchronous requests via async_depth is very
> > useful when the I/O load is very heavy.
> > The following is my AndroidBench experiment on an Android device (sched_tag=128):
> > 1. Generate a heavy I/O load:
> > while true; do fio -directory=/data -direct=0 -rw=write -bs=64M
> > -size=1G -numjobs=5 -name=fiotest; done
> > 2. Run AndroidBench; results:
> >                original async_depth   async_depth=nr_requests*3/4   delta
> > seq read       33.176                 216.49                        183.314
> > seq write      28.57                  62.152                        33.582
> > random read    1.518                  1.648                         0.13
> > random write   3.546                  4.27                          0.724
> > Our customer also reported improvements in app cold start and
> > benchmark tests after tuning async_depth.
>
> So are you guys writing async_depth? It looks like you're using
> nr_requests*3/4. If that is the case, option 1) above would still
> work for you. However, in this test, I think the lower
> async_depth is, the better the results you'll get.
Hi Kuai,
Yes, we set async_depth to nr_requests*3/4 via sysfs.
Thanks!
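
For illustration, a minimal sketch of that sysfs change. The function names
and the queue-directory argument are hypothetical (they are not from our
scripts), and the paths assume the usual layout: nr_requests under
/sys/block/<dev>/queue and the mq-deadline async_depth attribute under
/sys/block/<dev>/queue/iosched.

```shell
#!/bin/sh
# Sketch only: set mq-deadline async_depth to 3/4 of nr_requests.
# three_quarters and set_async_depth are hypothetical helper names.

# Print 3/4 of the given value (integer division, rounds down),
# clamped so the result is never below 1.
three_quarters() {
    depth=$(( $1 * 3 / 4 ))
    if [ "$depth" -lt 1 ]; then
        depth=1
    fi
    echo "$depth"
}

# $1 is a queue directory, assumed to look like /sys/block/<dev>/queue.
# Reads nr_requests from it and writes the derived value to
# iosched/async_depth. Needs root on a real sysfs tree.
set_async_depth() {
    queue_dir=$1
    nr_requests=$(cat "$queue_dir/nr_requests") || return 1
    three_quarters "$nr_requests" > "$queue_dir/iosched/async_depth"
}
```

With sched_tag=128 as in the experiment above, `set_async_depth
/sys/block/<dev>/queue` would write 96.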
>
> Thanks,
> Kuai
>
>
> > Thanks!
> >>
> >> Thanks,
> >>
> >> Bart.
> >>
> > .
> >
>
>