Re: [PATCH 3/3] mm/page_alloc: Introduce a new sysctl knob vm.pcp_batch_scale_max

Yafang Shao <laoar.shao@xxxxxxxxx> writes:

> On Thu, Jul 11, 2024 at 4:20 PM Huang, Ying <ying.huang@xxxxxxxxx> wrote:
>>
>> Yafang Shao <laoar.shao@xxxxxxxxx> writes:
>>
>> > On Thu, Jul 11, 2024 at 2:44 PM Huang, Ying <ying.huang@xxxxxxxxx> wrote:
>> >>
>> >> Yafang Shao <laoar.shao@xxxxxxxxx> writes:
>> >>
>> >> > On Wed, Jul 10, 2024 at 10:51 AM Huang, Ying <ying.huang@xxxxxxxxx> wrote:
>> >> >>
>> >> >> Yafang Shao <laoar.shao@xxxxxxxxx> writes:
>> >> >>
>> >> >> > The configuration parameter PCP_BATCH_SCALE_MAX poses challenges for
>> >> >> > quickly experimenting with specific workloads in a production environment,
>> >> >> > particularly when monitoring latency spikes caused by contention on the
>> >> >> > zone->lock. To address this, a new sysctl parameter vm.pcp_batch_scale_max
>> >> >> > is introduced as a more practical alternative.
>> >> >>
>> >> >> In general, I'm neutral to the change.  I can understand that kernel
>> >> >> configuration isn't as flexible as sysctl knob.  But, sysctl knob is ABI
>> >> >> too.
>> >> >>
>> >> >> > To ultimately mitigate the zone->lock contention issue, several suggestions
>> >> >> > have been proposed. One approach involves dividing large zones into multiple
>> >> >> > smaller zones, as suggested by Matthew[0], while another entails splitting
>> >> >> > the zone->lock using a mechanism similar to memory arenas and shifting away
>> >> >> > from relying solely on zone_id to identify the range of free lists a
>> >> >> > particular page belongs to[1]. However, implementing these solutions is
>> >> >> > likely to necessitate a more extended development effort.
>> >> >>
>> >> >> Per my understanding, the change will hurt instead of improve zone->lock
>> >> >> contention.  Instead, it will reduce page allocation/freeing latency.
>> >> >
>> >> > I'm quite perplexed by your recent comment. You introduced a
>> >> > configuration that has proven to be difficult to use, and you have
>> >> > been resistant to suggestions for modifying it to a more user-friendly
>> >> > and practical tuning approach. May I inquire about the rationale
>> >> > behind introducing this configuration in the beginning?
>> >>
>> >> Sorry, I don't understand your words.  Do you need me to explain what is
>> >> "neutral"?
>> >
>> > No, thanks.
>> > After consulting with ChatGPT, I received a clear and comprehensive
>> > explanation of what "neutral" means, providing me with a better
>> > understanding of the concept.
>> >
>> > So, can you explain why you introduced it as a config in the beginning?
>>
>> I think that I have explained it in the commit log of commit
>> 52166607ecc9 ("mm: restrict the pcp batch scale factor to avoid too long
>> latency"), which introduced the config.
>
> What specifically are your expectations for how users should utilize
> this config in real production workload?
>
>>
>> A sysctl knob is ABI, which needs to be maintained forever.  Can you
>> explain why you need it?  Why can't you use a fixed value after initial
>> experiments?
>
> Given the extensive scale of our production environment, with hundreds
> of thousands of servers, it begs the question: how do you propose we
> efficiently manage the various workloads that remain unaffected by the
> sysctl change implemented on just a few thousand servers? Is it
> feasible to expect us to recompile and release a new kernel for every
> instance where the default value falls short? Surely, there must be
> more practical and efficient approaches we can explore together to
> ensure optimal performance across all workloads.
>
> When making improvements or modifications, kindly ensure that they are
> not solely confined to a test or lab environment. It's vital to also
> consider the needs and requirements of our actual users, along with
> the diverse workloads they encounter in their daily operations.

Have you already found that your different systems require different
CONFIG_PCP_BATCH_SCALE_MAX values?  If not, I think that it's
better for you to keep this patch in your downstream kernel for now.
When you find that it is a common requirement, we can evaluate whether
to make it a sysctl knob.

--
Best Regards,
Huang, Ying




