Re: [axboe-block:xfs-async-dio] [fs] f9f8b03900: stress-ng.msg.ops_per_sec 29.3% improvement

Yin Fengwei <fengwei.yin@xxxxxxxxx> · Thu, 3 Aug 2023 09:47:57 +0800



On 8/3/23 01:31, Jens Axboe wrote:
> On 8/2/23 11:01 AM, Jens Axboe wrote:
>> On 8/2/23 10:38 AM, Jens Axboe wrote:
>>> On 8/2/23 7:52 AM, kernel test robot wrote:
>>>>
>>>> hi, Jens Axboe,
>>>>
>>>> Although all results in the formal report below are improvements, Fengwei (CCed)
>>>> checked on another Intel(R) Xeon(R) Gold 6336Y CPU @ 2.40GHz (Ice Lake)
>>>> machine (sorry, since this machine doesn't belong to our team, we cannot integrate
>>>> the results into our report and can only give you a heads-up here) and found a ~30%
>>>> stress-ng.msg.ops_per_sec regression.
>>>>
>>>> However, with the TRACEPOINT disabled, the regression disappears.
>>>>
>>>> Fengwei also tried removing the following section from the patch:
>>>> @@ -351,7 +361,8 @@ enum rw_hint {
>>>>  	{ IOCB_WRITE,		"WRITE" }, \
>>>>  	{ IOCB_WAITQ,		"WAITQ" }, \
>>>>  	{ IOCB_NOIO,		"NOIO" }, \
>>>> -	{ IOCB_ALLOC_CACHE,	"ALLOC_CACHE" }
>>>> +	{ IOCB_ALLOC_CACHE,	"ALLOC_CACHE" }, \
>>>> +	{ IOCB_DIO_DEFER,	"DIO_DEFER" }
>>>>
>>>> With that section removed, the regression is also gone.
>>>>
>>>> Fengwei also mentioned to us that, in his understanding, this code update changed
>>>> the data section layout of the kernel. Otherwise, it's hard to explain the
>>>> regression/improvement this commit could bring.
>>>>
>>>> This information and the formal report below are FYI.
>>>
>>> Very funky. I ran this on my 256 thread box, and removing the
>>> IOCB_DIO_DEFER (which is now IOCB_CALLER_COMP) trace point definition, I
>>> get:
>>>
>>> stress-ng: metrc: [4148] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
>>> stress-ng: metrc: [4148]                           (secs)    (secs)    (secs)   (real time) (usr+sys time)
>>> stress-ng: metrc: [4148] msg           1626997107     60.61    171.63   4003.65  26845470.19      389673.05
>>>
>>> and with it being the way it is in the branch:
>>>
>>> stress-ng: metrc: [3678] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s
>>> stress-ng: metrc: [3678]                           (secs)    (secs)    (secs)   (real time) (usr+sys time)
>>> stress-ng: metrc: [3678] msg           1287795248     61.25    140.26   3755.50  21025449.92      330563.24
>>>
>>> which is about a -21% bogo ops drop. Then I got a bit suspicious since
>>> the previous strings fit in 64 bytes, and now they don't, and I simply
>>> shortened the names so they still fit, as per below patch. With that,
>>> the regression there is reclaimed.
>>>
>>> That's as far as I've gotten yet, but I'm guessing we end up placing it
>>> differently, maybe now overlapping with data that is dirtied? I didn't
>>> profile it very much, just for an overview, and there's really nothing
>>> to observe there. The task and system are clearly more idle when the
>>> regression hits.
>>
>> Better variant here. I did confirm via System.map that layout
>> drastically changes when we use more than 64 bytes of string data. I'm
>> suspecting your test is sensitive to this and it may not mean more than
>> the fact that this test is a bit fragile like that, but let me know how
>> it works for you with the below.
> 
> Thinking about this just a bit more - it's clear that the bigger strings
> change your layout as well. For some cases, that ends up being a big
> win, for some it ends up being a loss. This is just the very nature of
> how the kernel is linked, and things like LTO deal with that
> specifically.
> 
> I don't think there's anything to do here, your test case is just
> sensitive to the layout changes caused. That doesn't mean they are
> either good or bad, it just means that changes happened and they
> happened to impact your test case in either direction.
Totally agreed. The layout changes can trigger different results in different
environments (hardware, toolchain...). I got a regression in my environment and
Oliver got an improvement in the LKP environment.
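
For anyone who wants to check such a layout shift on their own builds, here is
a rough sketch of the System.map comparison mentioned above. It was not run as
part of this thread. The function names (parse_system_map, cacheline_shifts)
and the 64-byte cache line size are assumptions for illustration. It lists
data symbols whose offset within a cache line differs between two builds.

```python
import re

# One System.map line looks like: "<hex address> <type letter> <symbol name>"
LINE_RE = re.compile(r"^([0-9a-fA-F]+)\s+(\S)\s+(\S+)$")


def parse_system_map(text):
    """Return {symbol: address} for data-like symbols in System.map text.

    Hypothetical helper. Symbols with duplicate names (e.g. statics) are
    collapsed, with the last occurrence winning.
    """
    symbols = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        addr, kind, name = m.groups()
        # Keep data, bss and read-only data symbols; skip text and others.
        if kind.lower() in ("d", "b", "r"):
            symbols[name] = int(addr, 16)
    return symbols


def cacheline_shifts(old, new, line_size=64):
    """List (name, old_offset, new_offset) for symbols present in both
    builds whose offset within a cache line changed.

    line_size=64 is an assumption; adjust it for the target CPU.
    """
    shifted = []
    for name in sorted(old.keys() & new.keys()):
        old_off = old[name] % line_size
        new_off = new[name] % line_size
        if old_off != new_off:
            shifted.append((name, old_off, new_off))
    return shifted
```

Feed it the contents of the System.map files from the two builds (with and
without the extra trace point string) and look at which hot data symbols moved
relative to cache line boundaries. This only shows that the layout moved, not
which move is responsible for the change in the benchmark result.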


Regards
Yin, Fengwei