Re: + fs-break-generic_file_buffered_read-up-into-multiple-functions.patch added to -mm tree

On 10/29/20 9:05 AM, Jens Axboe wrote:
> On 10/29/20 9:03 AM, Matthew Wilcox wrote:
>> On Thu, Oct 29, 2020 at 09:02:31AM -0600, Jens Axboe wrote:
>>> On 10/29/20 8:57 AM, Matthew Wilcox wrote:
>>>> On Thu, Oct 29, 2020 at 07:57:34AM -0600, Jens Axboe wrote:
>>>>> On 10/28/20 4:26 PM, Jens Axboe wrote:
>>>>>> I did see some wins when I tested this. I'll try and run some testing
>>>>>> tomorrow and report back. If there's something specifically you want to
>>>>>> see tested, let me know.
>>>>>
>>>>> I did some testing, unfortunately it's _very_ hard to produce somewhat
>>>>> consistent and good numbers as it quickly becomes a game of kswapd.
>>>>> Here's a basic case of 4 threads doing 32k random reads:
>>>>>
>>>>>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>>>>>     462 root      20   0       0      0      0 R  65.5   0.0   0:08.02 kswapd0
>>>>>    2287 axboe     20   0 1303448   2176   1072 R  46.6   0.0   0:05.35 fio
>>>>>    2289 axboe     20   0 1303456   2196   1092 D  46.6   0.0   0:05.34 fio
>>>>>    2290 axboe     20   0 1303460   2216   1112 D  46.6   0.0   0:05.37 fio
>>>>>    2288 axboe     20   0 1303452   2224   1120 R  45.9   0.0   0:05.33 fio
>>>>>
>>>>> Sad face... Unfortunately once kswapd kicks in, performance also
>>>>> plummets. This box only has 32G of RAM, and you can fill that in less
>>>>> than 10 seconds doing buffered reads like that.
>>>>>
>>>>> I ran 4k and 32k testing, and using 1 and 4 threads. But given the above
>>>>> sadness, it quickly ends up looking the same for me.
>>>>
>>>> What if your workload actually fits in memory?  That would seem to be
>>>> the situation where Kent's patches would make a difference.
>>>
>>> That was my point, if I do multi-page reads then memory is filled in
>>> seconds, which makes it pretty hard to provide any accurate numbers. I
>>> don't have anything slow in this test box, I'll see if I can find
>>> something to stick in it.
>>
>> I meant re-reading files which fit in memory, so you take ten seconds
>> to fill the page cache, then read from the same files over and over.
> 
> That I can certainly try.

Reading a 16G file randomly for 10 seconds, using 1 or 4 threads and
either 4k or 32k reads:

test              5.10-rc1      5.10-rc1+kent
-------------------------------------------------------------
1 thread, 4k      976K          1030K   (+5.5%)
4 threads, 4k     3462K         3453K   (-0.3%)
1 thread, 32k     299K          322K    (+7.7%)
4 threads, 32k    769K          785K    (+2.0%)
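
For anyone who wants to re-derive the percentage column, here is a minimal sketch that recomputes the relative change from the rounded figures in the table. The `results` dictionary and the `pct_change` helper are illustrative names, not part of the original post, and the unit of the figures is not stated there.

```python
# Figures from the table above, in thousands (the "K" suffix), given as
# (5.10-rc1 baseline, 5.10-rc1+kent). The unit is not stated in the post.
results = {
    "1 thread, 4k": (976, 1030),
    "4 threads, 4k": (3462, 3453),
    "1 thread, 32k": (299, 322),
    "4 threads, 32k": (769, 785),
}


def pct_change(base, patched):
    """Return the relative change of patched over base, in percent."""
    return (patched - base) / base * 100.0


for name, (base, patched) in results.items():
    # Print a signed percentage with one decimal place.
    print(f"{name:<16}{pct_change(base, patched):+.1f}%")
```

Recomputed this way, the last row comes out at +2.1% rather than +2.0%. The posted value was presumably derived from the unrounded numbers, so the small difference is expected.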

-- 
Jens Axboe




