On Sun, 2020-12-20 at 21:20 +0000, Song Bao Hua (Barry Song) wrote: > > > -----Original Message----- > > From: Mike Galbraith [mailto:efault@xxxxxx] > > Sent: Sunday, December 20, 2020 8:48 PM > > To: Vitaly Wool <vitaly.wool@xxxxxxxxxxxx>; LKML > > <linux-kernel@xxxxxxxxxxxxxxx>; linux-mm <linux-mm@xxxxxxxxx> > > Cc: Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>; Sebastian Andrzej > > Siewior <bigeasy@xxxxxxxxxxxxx>; Minchan Kim <minchan@xxxxxxxxxx>; NitinGupta > > <ngupta@xxxxxxxxxx> > > Subject: Re: [PATCH] zsmalloc: do not use bit_spin_lock > > > > On Sun, 2020-12-20 at 02:23 +0100, Mike Galbraith wrote: > > > On Sun, 2020-12-20 at 02:22 +0200, Vitaly Wool wrote: > > > > zsmalloc takes bit spinlock in its _map() callback and releases it > > > > only in unmap() which is unsafe and leads to zswap complaining > > > > about scheduling in atomic context. > > > > > > > > To fix that and to improve RT properties of zsmalloc, remove that > > > > bit spinlock completely and use a bit flag instead. > > > > > > It also does get_cpu_var() in map(), put_cpu_var() in unmap(). > > > > That aside, the bit spinlock removal seems to hold up to beating in RT. > > I stripped out the RT changes to replace the bit spinlocks, applied the > > still needed atm might_sleep() fix, and ltp zram and zswap test are > > running in a loop with no signs that it's a bad idea, so I hope that > > makes it in (minus the preempt disabled spin which I whacked), as it > > makes zsmalloc markedly more RT friendly. > > > > RT changes go from: > > 1 file changed, 79 insertions(+), 6 deletions(-) > > to: > > 1 file changed, 8 insertions(+), 3 deletions(-) > > > > Sorry, would you like to show the change for > "8 insertions(+), 3 deletions(-)"? Sure. --- mm/zsmalloc.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -57,6 +57,7 @@ #include <linux/wait.h> #include <linux/pagemap.h> #include <linux/fs.h> +#include <linux/local_lock.h> #define ZSPAGE_MAGIC 0x58 @@ -293,6 +294,7 @@ struct zspage { }; struct mapping_area { + local_lock_t lock; char *vm_buf; /* copy buffer for objects that span pages */ char *vm_addr; /* address of kmap_atomic()'ed pages */ enum zs_mapmode vm_mm; /* mapping mode */ @@ -455,7 +457,9 @@ MODULE_ALIAS("zpool-zsmalloc"); #endif /* CONFIG_ZPOOL */ /* per-cpu VM mapping areas for zspage accesses that cross page boundaries */ -static DEFINE_PER_CPU(struct mapping_area, zs_map_area); +static DEFINE_PER_CPU(struct mapping_area, zs_map_area) = { + .lock = INIT_LOCAL_LOCK(lock), +}; static bool is_zspage_isolated(struct zspage *zspage) { @@ -1276,7 +1280,8 @@ void *zs_map_object(struct zs_pool *pool class = pool->size_class[class_idx]; off = (class->size * obj_idx) & ~PAGE_MASK; - area = &get_cpu_var(zs_map_area); + local_lock(&zs_map_area.lock); + area = this_cpu_ptr(&zs_map_area); area->vm_mm = mm; if (off + class->size <= PAGE_SIZE) { /* this object is contained entirely within a page */ @@ -1330,7 +1335,7 @@ void zs_unmap_object(struct zs_pool *poo __zs_unmap_object(area, pages, off, class->size); } - put_cpu_var(zs_map_area); + local_unlock(&zs_map_area.lock); migrate_read_unlock(zspage); unpin_tag(handle); > BTW, your original patch looks not right as > crypto_wait_req(crypto_acomp_decompress()...) > can sleep too. > > [copy from your original patch with comment] > --- a/mm/zswap.c > +++ b/mm/zswap.c > @@ -1258,20 +1258,20 @@ static int zswap_frontswap_load(unsigned > > /* decompress */ > dlen = PAGE_SIZE; > + acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx); > + mutex_lock(acomp_ctx->mutex); > src = zpool_map_handle(entry->pool->zpool, entry->handle, ZPOOL_MM_RO); > if (zpool_evictable(entry->pool->zpool)) > src += sizeof(struct zswap_header); > > - acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx); > - mutex_lock(acomp_ctx->mutex); > sg_init_one(&input, src, entry->length); > sg_init_table(&output, 1); > sg_set_page(&output, page, PAGE_SIZE, 0); > acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, dlen); > > /*!!!!!!!!!!!!!!!! > * here crypto could sleep > !!!!!!!!!!!!!!*/ Hohum, another one for my Bitmaster-9000 patch shredder. -Mike