Yafang Shao <laoar.shao@xxxxxxxxx> writes:

> On Thu, Jul 11, 2024 at 4:20 PM Huang, Ying <ying.huang@xxxxxxxxx> wrote:
>>
>> Yafang Shao <laoar.shao@xxxxxxxxx> writes:
>>
>> > On Thu, Jul 11, 2024 at 2:44 PM Huang, Ying <ying.huang@xxxxxxxxx> wrote:
>> >>
>> >> Yafang Shao <laoar.shao@xxxxxxxxx> writes:
>> >>
>> >> > On Wed, Jul 10, 2024 at 10:51 AM Huang, Ying <ying.huang@xxxxxxxxx> wrote:
>> >> >>
>> >> >> Yafang Shao <laoar.shao@xxxxxxxxx> writes:
>> >> >>
>> >> >> > The configuration parameter PCP_BATCH_SCALE_MAX poses challenges for
>> >> >> > quickly experimenting with specific workloads in a production environment,
>> >> >> > particularly when monitoring latency spikes caused by contention on the
>> >> >> > zone->lock. To address this, a new sysctl parameter vm.pcp_batch_scale_max
>> >> >> > is introduced as a more practical alternative.
>> >> >>
>> >> >> In general, I'm neutral to the change. I can understand that kernel
>> >> >> configuration isn't as flexible as sysctl knob. But, sysctl knob is ABI
>> >> >> too.
>> >> >>
>> >> >> > To ultimately mitigate the zone->lock contention issue, several suggestions
>> >> >> > have been proposed. One approach involves dividing large zones into multi
>> >> >> > smaller zones, as suggested by Matthew[0], while another entails splitting
>> >> >> > the zone->lock using a mechanism similar to memory arenas and shifting away
>> >> >> > from relying solely on zone_id to identify the range of free lists a
>> >> >> > particular page belongs to[1]. However, implementing these solutions is
>> >> >> > likely to necessitate a more extended development effort.
>> >> >>
>> >> >> Per my understanding, the change will hurt instead of improve zone->lock
>> >> >> contention. Instead, it will reduce page allocation/freeing latency.
>> >> >
>> >> > I'm quite perplexed by your recent comment. You introduced a
>> >> > configuration that has proven to be difficult to use, and you have
>> >> > been resistant to suggestions for modifying it to a more user-friendly
>> >> > and practical tuning approach. May I inquire about the rationale
>> >> > behind introducing this configuration in the beginning?
>> >>
>> >> Sorry, I don't understand your words. Do you need me to explain what is
>> >> "neutral"?
>> >
>> > No, thanks.
>> > After consulting with ChatGPT, I received a clear and comprehensive
>> > explanation of what "neutral" means, providing me with a better
>> > understanding of the concept.
>> >
>> > So, can you explain why you introduced it as a config in the beginning ?
>>
>> I think that I have explained it in the commit log of commit
>> 52166607ecc9 ("mm: restrict the pcp batch scale factor to avoid too long
>> latency"). Which introduces the config.
>
> What specifically are your expectations for how users should utilize
> this config in real production workload?
>
>>
>> Sysctl knob is ABI, which needs to be maintained forever. Can you
>> explain why you need it? Why cannot you use a fixed value after initial
>> experiments.
>
> Given the extensive scale of our production environment, with hundreds
> of thousands of servers, it begs the question: how do you propose we
> efficiently manage the various workloads that remain unaffected by the
> sysctl change implemented on just a few thousand servers? Is it
> feasible to expect us to recompile and release a new kernel for every
> instance where the default value falls short? Surely, there must be
> more practical and efficient approaches we can explore together to
> ensure optimal performance across all workloads.
>
> When making improvements or modifications, kindly ensure that they are
> not solely confined to a test or lab environment. It's vital to also
> consider the needs and requirements of our actual users, along with
> the diverse workloads they encounter in their daily operations.

Have you found that your different systems require different
CONFIG_PCP_BATCH_SCALE_MAX values already? If not, I think that it's
better for you to keep this patch in your downstream kernel for now.
When you find that it is a common requirement, we can evaluate whether
to make it a sysctl knob.

--
Best Regards,
Huang, Ying