On Wed, Jun 14, 2023 at 1:50 PM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
>
> On Wed, Jun 14, 2023 at 7:59 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> >
> > On Tue, Jun 13, 2023 at 01:13:59PM -0700, Yosry Ahmed wrote:
> > > On Mon, Jun 5, 2023 at 6:56 PM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
> > > >
> > > > On Fri, Jun 2, 2023 at 1:24 PM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> > > > > Sorry, I should have been more precise.
> > > > >
> > > > > I'm saying that using NR_CPUS pools, and replacing the hash with
> > > > > smp_processor_id(), would accomplish your goal of pool concurrency.
> > > > > But it would do so with a broadly-used, well-understood scaling
> > > > > factor. We might not need a config option at all.
> > > > >
> > > > > The lock would still be there, but contention would be reduced fairly
> > > > > optimally (barring preemption) for store concurrency at least. Not
> > > > > fully eliminated due to frees and compaction, though, yes.
> > >
> > > I thought about this again and had some internal discussions, and I am
> > > more unsure about it. Beyond the memory overhead, having too many
> > > zpools might result in higher fragmentation within the zpools. For
> > > zsmalloc, we do not compact across multiple zpools for example.
> > >
> > > We have been using a specific number of zpools in our production for
> > > years now, we know it can be tuned to achieve performance gains. OTOH,
> > > percpu zpools (or NR_CPUS pools) seems like too big of a hammer,
> > > probably too many zpools in a lot of cases, and we wouldn't know how
> > > many zpools actually fits our workloads.
> >
> > Is it the same number across your entire fleet and all workloads?
>
> Yes.
> >
> > How large *is* the number in relation to CPUs?
>
> It differs based on the number of cpus on the machine. We use 32
> zpools on all machines.
> >
> > > I see value in allowing the number of zpools to be directly
> > > configurable (it can always be left as 1), and am worried that with
> > > percpu we would be throwing away years of production testing for an
> > > unknown.
> > >
> > > I am obviously biased, but I don't think this adds significant
> > > complexity to the zswap code as-is (or as v2 is to be precise).
> >
> > I had typed out this long list of reasons why I hate this change, and
> > then deleted it to suggest the per-cpu scaling factor.
> >
> > But to summarize my POV, I think a user-facing config option for this
> > is completely inappropriate. There are no limits, no guidance, no sane
> > default. And it's very selective about micro-optimizing this one lock
> > when there are several locks and datastructures of the same scope in
> > the swap path. This isn't a reasonable question to ask people building
> > kernels. It's writing code through the Kconfig file.
>
> It's not just swap path, it's any contention that happens within the
> zpool between its different operations (map, alloc, compaction, etc).
> My thought was that if a user observes high contention in any of the
> zpool operations, they can increase the number of zpools -- basically
> this should be empirically decided. If unsure, the user can just leave
> it as a single zpool.
> >
> > Data structure scalability should be solved in code, not with config
> > options.
>
> I agree, but until we have a more fundamental architectural solution,
> having multiple zpools to address scalability is a win. We can remove
> the config option later if needed.
> >
> > My vote on the patch as proposed is NAK.
>
> I hear the argument about the config option not being ideal here, but
> NR_CPUS is also not ideal.
>
> How about if we introduce it as a constant in the kernel? We have a
> lot of other constants around the kernel that do not scale with the
> machine size (e.g. SWAP_CLUSTER_MAX). We can start with 32, which is a
> value that we have tested in our data centers for many years now and
> know to work well. We can revisit later if needed.
>
> WDYT?

I sent v3 [1] with the proposed constant instead of a config option;
hopefully this is more acceptable.

[1] https://lore.kernel.org/lkml/20230620194644.3142384-1-yosryahmed@xxxxxxxxxx/
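
[Editor's sketch, for readers skimming the thread: a minimal C illustration of
the two pool-selection schemes debated above. This is not the literal v3 patch;
ZSWAP_NR_ZPOOLS, struct zswap_pool_sketch, zpool_by_hash() and zpool_by_cpu()
are hypothetical names. Option A spreads entries across a small, fixed,
power-of-two number of zpools by hashing the entry pointer, which is roughly
what the constant-based proposal amounts to; Option B is the NR_CPUS-pools idea
of indexing by smp_processor_id().]

/* Illustrative sketch only -- names and layout are not taken from the patch. */
#include <linux/hash.h>
#include <linux/log2.h>
#include <linux/smp.h>
#include <linux/threads.h>
#include <linux/zpool.h>

/* Fixed pool count; kept a power of 2 so the hash below covers every slot. */
#define ZSWAP_NR_ZPOOLS 32

struct zswap_pool_sketch {
	struct zpool *zpools[ZSWAP_NR_ZPOOLS];	/* Option A: fixed pool count */
	struct zpool *cpu_zpools[NR_CPUS];	/* Option B: one slot per possible CPU */
};

/* Option A: spread entries over ZSWAP_NR_ZPOOLS zpools by hashing the entry pointer. */
static struct zpool *zpool_by_hash(struct zswap_pool_sketch *pool, void *entry)
{
	return pool->zpools[hash_ptr(entry, ilog2(ZSWAP_NR_ZPOOLS))];
}

/* Option B: use the current CPU's zpool (NR_CPUS pools, no hashing). */
static struct zpool *zpool_by_cpu(struct zswap_pool_sketch *pool)
{
	/* Assumes the caller has preemption disabled, e.g. via get_cpu(). */
	return pool->cpu_zpools[smp_processor_id()];
}

[The trade-off discussed in the thread: Option B scales the pool count with the
machine, which raises the memory-overhead and fragmentation concerns (zsmalloc
does not compact across zpools), while Option A keeps a small constant that has
been validated in production but has to be chosen somehow.]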