On Wed, Dec 20, 2023 at 4:20 AM Chengming Zhou <zhouchengming@xxxxxxxxxxxxx> wrote:
>
> On 2023/12/20 05:39, Yosry Ahmed wrote:
> > On Tue, Dec 19, 2023 at 10:43 AM Nhat Pham <nphamcs@xxxxxxxxx> wrote:
> >>
> >> On Tue, Dec 19, 2023 at 5:29 AM Chris Li <chrisl@xxxxxxxxxx> wrote:
> >>>
> >>> Hi Chengming and Yosry,
> >>>
> >>> On Mon, Dec 18, 2023 at 3:50 AM Chengming Zhou
> >>> <zhouchengming@xxxxxxxxxxxxx> wrote:
> >>>>
> >>>> Since the introduction of reusing the dstmem in the load path, it seems
> >>>> confusing that we are now using acomp_ctx->dstmem and acomp_ctx->mutex
> >>>> for purposes other than what the naming suggests.
> >>>>
> >>>> Yosry suggested removing these two fields from acomp_ctx, directly
> >>>> using zswap_dstmem and zswap_mutex in both the load and store paths,
> >>>> renaming them, and adding proper comments above their definitions that
> >>>> they are for generic percpu buffering on the load and store paths.
> >>>>
> >>>> So this patch removes dstmem and mutex from acomp_ctx, and renames
> >>>> zswap_dstmem to zswap_buffer, using the percpu mutex and buffer on
> >>>> the load and store paths.
> >>>
> >>> Sorry for joining this discussion late.
> >>>
> >>> I get the rename of "dstmem" to "buffer", because the buffer is used
> >>> for both load and store as well. What I don't get is why we move it
> >>> out of the acomp_ctx struct. Now we have 3 per-cpu entries:
> >>> buffer, mutex and acomp_ctx. I think we should do the reverse: fold
> >>> these three per-cpu entries into one struct, the acomp_ctx. Each
> >>> per_cpu load() does a dance around the cpu id, disabling preemption,
> >>> etc., while each struct member load is just a plain memory load.
> >>> It seems to me it would be more optimal to combine these three per-cpu
> >>> entries into acomp_ctx. Just do the per-cpu lookup for the acomp_ctx once.
> >>
> >> I agree with Chris. From a practicality POV, what Chris says here
> >> makes sense. From a semantic POV, this buffer is only used in
> >> (de)compression contexts - be it in store, load, or writeback - so it
> >> belonging to the original struct still makes sense to me. Why separate
> >> it out, without any benefit? Just rename the old field to buffer or
> >> zswap_buffer and call it a day? It will be a smaller patch too!
> >>
> >
> > My main concern is that the struct name is specific to the crypto
> > acomp stuff, but the buffer and mutex are not.
> > How about we keep it in the struct, but refactor the struct as follows:
> >
> > struct zswap_ctx {
> >         struct {
> >                 struct crypto_acomp *acomp;
> >                 struct acomp_req *req;
> >                 struct crypto_wait wait;
> >         } acomp_ctx;
> >         u8 *dstmem;
> >         struct mutex *mutex;
> > };
> >
> > , and then rename zswap_pool.acomp_ctx to zswap_pool.ctx?
>
> I think there are two viewpoints here; both work OK to me.
>
> The first is from ownership or lifetime: the percpu mutex and dstmem
> are shared between all pools, so no single pool owns the mutex and
> dstmem, nor does the percpu acomp_ctx in each pool.
>
> The second is from usage: the percpu mutex and dstmem are used by
> the percpu acomp_ctx in each pool, so it's convenient to put pointers
> to them in the percpu acomp_ctx.
>
> Actually I think it's simpler to let the percpu acomp_ctx have its own
> mutex and dstmem, which in fact are the necessary parts when it uses
> the acomp interfaces.
>
> This way, we could delete the percpu mutex and dstmem and their hotplug
> callbacks, so they are no longer shared between all pools.
> Maybe we will have many pools at the same time in the future, like
> different compression algorithms or different zpools for different
> memcg workloads. Who knows? :-)
>
> So how about this patch below? Just RFC.

The general approach looks fine to me, although I still prefer we
reorganize the struct as Chris and I discussed: rename crypto_acomp_ctx
to a generic name, add an (anonymous) struct for the crypto_acomp stuff,
and rename dstmem to buffer or so. I think we can also make the mutex a
static part of the struct; is there any advantage to dynamically
allocating it?

>
> Subject: [PATCH] mm/zswap: make each acomp_ctx has its own mutex and dstmem
>
> Signed-off-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
> ---
>  include/linux/cpuhotplug.h |  1 -
>  mm/zswap.c                 | 86 ++++++++++++--------------------------
>  2 files changed, 26 insertions(+), 61 deletions(-)
>
> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> index efc0c0b07efb..c3e06e21766a 100644
> --- a/include/linux/cpuhotplug.h
> +++ b/include/linux/cpuhotplug.h
> @@ -124,7 +124,6 @@ enum cpuhp_state {
>          CPUHP_ARM_BL_PREPARE,
>          CPUHP_TRACE_RB_PREPARE,
>          CPUHP_MM_ZS_PREPARE,
> -        CPUHP_MM_ZSWP_MEM_PREPARE,
>          CPUHP_MM_ZSWP_POOL_PREPARE,
>          CPUHP_KVM_PPC_BOOK3S_PREPARE,
>          CPUHP_ZCOMP_PREPARE,
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 2c349fd88904..37301f1a80a9 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -694,63 +694,31 @@ static void zswap_alloc_shrinker(struct zswap_pool *pool)
>  /*********************************
>  * per-cpu code
>  **********************************/
> -static DEFINE_PER_CPU(u8 *, zswap_dstmem);
> -/*
> - * If users dynamically change the zpool type and compressor at runtime, i.e.
> - * zswap is running, zswap can have more than one zpool on one cpu, but they
> - * are sharing dtsmem. So we need this mutex to be per-cpu.
> - */
> -static DEFINE_PER_CPU(struct mutex *, zswap_mutex);
> -
> -static int zswap_dstmem_prepare(unsigned int cpu)
> -{
> -        struct mutex *mutex;
> -        u8 *dst;
> -
> -        dst = kmalloc_node(PAGE_SIZE, GFP_KERNEL, cpu_to_node(cpu));
> -        if (!dst)
> -                return -ENOMEM;
> -
> -        mutex = kmalloc_node(sizeof(*mutex), GFP_KERNEL, cpu_to_node(cpu));
> -        if (!mutex) {
> -                kfree(dst);
> -                return -ENOMEM;
> -        }
> -
> -        mutex_init(mutex);
> -        per_cpu(zswap_dstmem, cpu) = dst;
> -        per_cpu(zswap_mutex, cpu) = mutex;
> -        return 0;
> -}
> -
> -static int zswap_dstmem_dead(unsigned int cpu)
> -{
> -        struct mutex *mutex;
> -        u8 *dst;
> -
> -        mutex = per_cpu(zswap_mutex, cpu);
> -        kfree(mutex);
> -        per_cpu(zswap_mutex, cpu) = NULL;
> -
> -        dst = per_cpu(zswap_dstmem, cpu);
> -        kfree(dst);
> -        per_cpu(zswap_dstmem, cpu) = NULL;
> -
> -        return 0;
> -}
> -
>  static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
>  {
>          struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
>          struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
>          struct crypto_acomp *acomp;
>          struct acomp_req *req;
> +        int ret = 0;
> +
> +        acomp_ctx->dstmem = kmalloc_node(PAGE_SIZE, GFP_KERNEL, cpu_to_node(cpu));
> +        if (!acomp_ctx->dstmem)
> +                return -ENOMEM;
> +
> +        acomp_ctx->mutex = kmalloc_node(sizeof(struct mutex), GFP_KERNEL, cpu_to_node(cpu));
> +        if (!acomp_ctx->mutex) {
> +                ret = -ENOMEM;
> +                goto mutex_fail;
> +        }
> +        mutex_init(acomp_ctx->mutex);
>
>          acomp = crypto_alloc_acomp_node(pool->tfm_name, 0, 0, cpu_to_node(cpu));
>          if (IS_ERR(acomp)) {
>                  pr_err("could not alloc crypto acomp %s : %ld\n",
>                         pool->tfm_name, PTR_ERR(acomp));
> -                return PTR_ERR(acomp);
> +                ret = PTR_ERR(acomp);
> +                goto acomp_fail;
>          }
>          acomp_ctx->acomp = acomp;
>
> @@ -758,8 +726,8 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
>          if (!req) {
>                  pr_err("could not alloc crypto acomp_request %s\n",
>                         pool->tfm_name);
> -                crypto_free_acomp(acomp_ctx->acomp);
> -                return -ENOMEM;
> +                ret = -ENOMEM;
> +                goto req_fail;
>          }
>          acomp_ctx->req = req;
>
> @@ -772,10 +740,15 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
>          acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
>                                     crypto_req_done, &acomp_ctx->wait);
>
> -        acomp_ctx->mutex = per_cpu(zswap_mutex, cpu);
> -        acomp_ctx->dstmem = per_cpu(zswap_dstmem, cpu);
> -
>          return 0;
> +req_fail:
> +        crypto_free_acomp(acomp_ctx->acomp);
> +acomp_fail:
> +        kfree(acomp_ctx->mutex);
> +mutex_fail:
> +        kfree(acomp_ctx->dstmem);
> +
> +        return ret;
>  }
>
>  static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
> @@ -788,6 +761,8 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
>                          acomp_request_free(acomp_ctx->req);
>                  if (!IS_ERR_OR_NULL(acomp_ctx->acomp))
>                          crypto_free_acomp(acomp_ctx->acomp);
> +                kfree(acomp_ctx->mutex);
> +                kfree(acomp_ctx->dstmem);
>          }
>
>          return 0;
> @@ -1901,13 +1876,6 @@ static int zswap_setup(void)
>                  goto cache_fail;
>          }
>
> -        ret = cpuhp_setup_state(CPUHP_MM_ZSWP_MEM_PREPARE, "mm/zswap:prepare",
> -                                zswap_dstmem_prepare, zswap_dstmem_dead);
> -        if (ret) {
> -                pr_err("dstmem alloc failed\n");
> -                goto dstmem_fail;
> -        }
> -
>          ret = cpuhp_setup_state_multi(CPUHP_MM_ZSWP_POOL_PREPARE,
>                                        "mm/zswap_pool:prepare",
>                                        zswap_cpu_comp_prepare,
> @@ -1939,8 +1907,6 @@ static int zswap_setup(void)
>          if (pool)
>                  zswap_pool_destroy(pool);
>  hp_fail:
> -        cpuhp_remove_state(CPUHP_MM_ZSWP_MEM_PREPARE);
> -dstmem_fail:
>          kmem_cache_destroy(zswap_entry_cache);
>  cache_fail:
>          /* if built-in, we aren't unloaded on failure; don't allow use */
> --
> 2.20.1
>
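For illustration, a minimal sketch of the layout Yosry suggests above - a
generically named per-CPU struct with the crypto acomp fields grouped in an
anonymous struct, dstmem renamed to buffer, and the mutex embedded statically
rather than allocated with kmalloc_node() - might look roughly like this. The
struct name zswap_comp_ctx and the pool->ctx member are hypothetical,
following the suggested rename of zswap_pool.acomp_ctx to zswap_pool.ctx;
this is a sketch of the idea being discussed, not the merged code:

        struct zswap_comp_ctx {
                struct {                /* crypto acomp state */
                        struct crypto_acomp *acomp;
                        struct acomp_req *req;
                        struct crypto_wait wait;
                };
                u8 *buffer;             /* PAGE_SIZE scratch for load/store/writeback */
                struct mutex mutex;     /* serializes use of this CPU's context */
        };

With everything owned by one per-CPU struct, the point Chris makes also falls
out naturally: a single per_cpu_ptr() lookup yields the buffer, the mutex and
the acomp state together, for example:

        struct zswap_comp_ctx *ctx = per_cpu_ptr(pool->ctx, cpu);

        mutex_lock(&ctx->mutex);
        /* ... use ctx->req and ctx->buffer for (de)compression ... */
        mutex_unlock(&ctx->mutex);

Embedding the mutex also avoids the separate kmalloc_node()/kfree() pair and
the extra error-handling path seen in the RFC's CPU hotplug callbacks.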