Re: [linux-next:master] [mm] 5df397dec7: will-it-scale.per_thread_ops -53.3% regression

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:

> On Tue, Dec 6, 2022 at 10:41 AM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> Let me think about this a while, but I think I'll have a patch for you
>> to test once I've dealt with a couple more pull requests.
>
> So here's a trial balloon for you to try if you can see if this mostly
> fixes the regression..
>
> It still limits batching (because unlike the full "gather pages until
> you have to flush", this is all batched under the page table lock). But
> it limits it a bit less, in that it will use a second active batch if
> it only used the initial on-stack one (which is called "local", which
> is not a great name in this context, but whatever).
>
> This _should_ mean that that benchmark will now batch ~512 pages
> instead of just 8.
>
> Which should be pretty much what it effectively used to do before too,
> because the dirty shared page case has always caused that
> "force_flush" thing, so it will have always stopped to flush every
> page directory.
>
> (But we still have that extra rmap flushing limit because there could
> have been _previous_ buffered page pointers that weren't dirty shared
> pages, and we don't want to have to deal with that pain, and might
> have to exit early in order to avoid it)
>
> I can imagine cleaner ways to do this, but they would involve having
> to remember which batch we started having dirty pages in, which is
> more bookkeeping pain than I really think it's worth.
>
> Does this fix the regression?

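For reference, the "8" versus "~512" figures quoted above can be
reproduced with a little arithmetic. This is only a sketch: the 4 KiB page
size, the 8-byte pointer size, and the 16-byte batch header are assumptions
(typical for x86-64), not values stated in this thread.

```python
# Assumed values (not stated in this thread): 4 KiB pages, 8-byte pointers,
# and a 16-byte batch header (one pointer plus two 32-bit counters).
PAGE_SIZE = 4096
POINTER_SIZE = 8
BATCH_HEADER_SIZE = 16
ON_STACK_BATCH_PAGES = 8  # the "just 8" figure quoted above


def page_sized_batch_capacity(page_size=PAGE_SIZE,
                              header=BATCH_HEADER_SIZE,
                              ptr=POINTER_SIZE):
    """Page pointers that fit in one page-sized batch after its header."""
    return (page_size - header) // ptr


second_batch = page_sized_batch_capacity()
total = ON_STACK_BATCH_PAGES + second_batch
print(second_batch)  # 510
print(total)         # 518
```

Under those assumptions, the on-stack batch plus one page-sized batch
holds a little over 512 page pointers, which is roughly one page table's
worth of 4 KiB pages.
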
I have tested the patch, and it does fix the regression.  The test
results are as follows:

5df397dec7c4c08c 7cc8f9c7146a5c2dad6e71653c4 7763ba2bb16804313aa52bc78ae 
---------------- --------------------------- --------------------------- 
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \  
   2256919 ±  5%    +114.2%    4833919 ±  2%    +116.6%    4889199        will-it-scale.16.threads
      8.17 ±  6%      -8.2        0.00            -8.2        0.00        perf-profile.calltrace.cycles-pp.native_flush_tlb_one_user.flush_tlb_func.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function

Here 5df397dec7c4c08c is the first bad commit, 7cc8f9c7146a5c2dad6e71653c4
is its parent commit, and 7763ba2bb16804313aa52bc78ae is the fix
commit.  The benchmark score recovered, and the CPU cycles spent on TLB
flushing recovered too.
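
As a sanity check, the %change columns can be recomputed from the raw
will-it-scale.16.threads scores in the table.  The helper name below is
only illustrative; the three scores are copied from the table, and every
percentage is relative to the first bad commit unless noted.

```python
def percent_change(base, value):
    """Relative change of value with respect to base, in percent."""
    return (value - base) / base * 100.0


first_bad = 2256919  # 5df397dec7c4c08c
parent = 4833919     # 7cc8f9c7146a5c2dad6e71653c4
fix = 4889199        # 7763ba2bb16804313aa52bc78ae

# Parent and fix commits, measured against the first bad commit.
print(round(percent_change(first_bad, parent), 1))  # 114.2
print(round(percent_change(first_bad, fix), 1))     # 116.6

# The same gap seen from the parent commit's side.
print(round(percent_change(parent, first_bad), 1))  # -53.3
```

The last figure appears consistent with the -53.3% in the subject line,
although that report names a different metric (per_thread_ops).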

Best Regards,
Huang, Ying
