Hello,

On (09/16/13 09:02), Minchan Kim wrote:
> Hello Sergey,
>
> Sorry for the really slow response. I was really busy with internal work.
> Thanks for pointing out the BUG, Dan, Jerome and Sergey.
> I read your threads only roughly, so I may have missed something. If so,
> sorry for that. Anyway, I will put in my opinion.
>
> On Wed, Sep 11, 2013 at 02:12:50AM +0300, Sergey Senozhatsky wrote:
> > Dan Carpenter noted that handle_pending_slot_free() is racy with
> > zram_reset_device(). Take write init_lock in zram_slot_free(), thus
>
> Right, but "init_lock" is what I really want to remove.
> Yes, it's just a read-side lock, so most of the time it doesn't hurt us,
> but it makes the code very complicated and deadlock prone, so I'd like to
> replace it with RCU. It's off topic, but let me put in my opinion on the
> future direction.
>
> About the bug: how about moving flush_work below down_write(init_lock)?
> zram_make_request is already protected by init_lock, and we have the
> following rule about lock ordering, so I don't see any problem.
>
> init_lock
>   zram->lock
>
> > preventing any concurrent zram_slot_free(), zram_bvec_rw() or
> > zram_reset_device(). This also allows to safely check zram->init_done
> > in handle_pending_slot_free().
> >
> > The initial intention was to minimize the number of
> > handle_pending_slot_free() calls from zram_bvec_rw(), which were
> > slowing down READ requests due to the slot_free_lock spin lock.
> > Jerome Marchand suggested removing handle_pending_slot_free() from
> > zram_bvec_rw().
> >
> > Link: https://lkml.org/lkml/2013/9/9/172
> > Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@xxxxxxxxx>
> >
> > ---
> >
> >  drivers/staging/zram/zram_drv.c | 13 +++++--------
> >  1 file changed, 5 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
> > index 91d94b5..7a2d4de 100644
> > --- a/drivers/staging/zram/zram_drv.c
> > +++ b/drivers/staging/zram/zram_drv.c
> > @@ -521,7 +521,8 @@ static void handle_pending_slot_free(struct zram *zram)
> >  	while (zram->slot_free_rq) {
> >  		free_rq = zram->slot_free_rq;
> >  		zram->slot_free_rq = free_rq->next;
> > -		zram_free_page(zram, free_rq->index);
> > +		if (zram->init_done)
> > +			zram_free_page(zram, free_rq->index);
> >  		kfree(free_rq);
> >  	}
> >  	spin_unlock(&zram->slot_free_lock);
> > @@ -534,16 +535,13 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
> >
> >  	if (rw == READ) {
> >  		down_read(&zram->lock);
> > -		handle_pending_slot_free(zram);
>
> The read side is okay, but actually I have a nitpick.
> If someone polls a block on a zram-swap device, he would suddenly see the
> block has a zero value although there was no I/O (I don't want to argue
> whether that's a sane user or not). It never happens on a real block
> device and it never happens on a zram block device; it can only happen on
> a zram-swap device. Such behavior has been there since we introduced
> swap_slot_free_notify.
> (off-topic: I'd like to remove it because it creates tight coupling
> between zram and swap and, obviously, it is a layering violation)
> So for now, I don't have a strong objection.
>
> The idea behind removing swap_slot_free_notify is to use frontswap when
> the user wants to use zram as swap, so zram can be notified when a block
> loses its owner; but we would still have to solve the mutex problem in
> the notify handler.
>
> >  		ret = zram_bvec_read(zram, bvec, index, offset, bio);
> >  		up_read(&zram->lock);
> >  	} else {
> >  		down_write(&zram->lock);
> > -		handle_pending_slot_free(zram);
>
> Why did you remove this on the write side?
> We can't predict when the work will trigger. It means the work could
> remove a valid block out from under us.
>

Not sure I understand how. zram_slot_free() takes down_write(&zram->init_lock)
and zram_make_request() takes down_read(&zram->init_lock), thus
zram_slot_free() cannot run concurrently with any RW requests. RW requests
are under the read lock and zram_slot_free() is under the write lock
(a small userspace sketch of this exclusion is appended at the end of
this mail).

> >  		ret = zram_bvec_write(zram, bvec, index, offset);
> >  		up_write(&zram->lock);
> >  	}
> > -
> >  	return ret;
> >  }
> >
> > @@ -750,12 +748,11 @@ error:
> >
> >  static void zram_slot_free(struct work_struct *work)
> >  {
> > -	struct zram *zram;
> > +	struct zram *zram = container_of(work, struct zram, free_work);
> >
> > -	zram = container_of(work, struct zram, free_work);
> > -	down_write(&zram->lock);
> > +	down_write(&zram->init_lock);
>
> I don't like this.
> The primary problem is that we should handle it atomically, so we should
> use a spinlock instead of a mutex. Yeah, /me kicks his ass. I should have
> solved the problem that way from the beginning.
>
> The simple solution that popped into my mind is this:
>
> diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
> index 91d94b5..b23bf0e 100644
> --- a/drivers/staging/zram/zram_drv.c
> +++ b/drivers/staging/zram/zram_drv.c
> @@ -534,11 +534,14 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index,
>
>  	if (rw == READ) {
>  		down_read(&zram->lock);
> -		handle_pending_slot_free(zram);
>  		ret = zram_bvec_read(zram, bvec, index, offset, bio);
>  		up_read(&zram->lock);
>  	} else {
>  		down_write(&zram->lock);
> +		/*
> +		 * We must free pending slots here; otherwise the deferred
> +		 * free could remove valid blocks out from under us.
> +		 */
>  		handle_pending_slot_free(zram);
>  		ret = zram_bvec_write(zram, bvec, index, offset);
>  		up_write(&zram->lock);
> @@ -552,7 +555,6 @@ static void zram_reset_device(struct zram *zram, bool reset_capacity)
>  	size_t index;
>  	struct zram_meta *meta;
>
> -	flush_work(&zram->free_work);
>
>  	down_write(&zram->init_lock);
>  	if (!zram->init_done) {
> @@ -560,6 +562,7 @@ static void zram_reset_device(struct zram *zram, bool reset_capacity)
>  		return;
>  	}
>
> +	flush_work(&zram->free_work);
>  	meta = zram->meta;
>  	zram->init_done = 0;

This one looks OK to me.

	-ss

> But the more ideal way I am thinking of now is:
>
> 1) replace init_lock with RCU
> 2) introduce a new atomic meta lock instead of zram->mutex, which is very
>    coarse-grained
> 3) use the atomic lock in the notify handler
>
> --
> Kind regards,
> Minchan Kim
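Just to illustrate the "read lock vs. write lock" point above, here is a
minimal userspace sketch of that exclusion. It uses a pthread rwlock as a
stand-in for the init_lock rwsem, and fake_make_request()/fake_slot_free()
are invented names for the sketch -- it is not the zram code, only the
serialization argument in isolation.

/*
 * Userspace analogy, NOT the actual zram code: a writer holding the
 * rwlock excludes every reader, so the "slot free" side can never
 * overlap an in-flight "RW request".
 *
 * Build/run: gcc -pthread rwsem_sketch.c && ./a.out
 */
#include <pthread.h>
#include <stdio.h>

static pthread_rwlock_t init_lock = PTHREAD_RWLOCK_INITIALIZER;
static int request_in_flight;	/* only touched under init_lock */

/* models zram_make_request(): the I/O path, read side of init_lock */
static void *fake_make_request(void *arg)
{
	for (int i = 0; i < 1000000; i++) {
		pthread_rwlock_rdlock(&init_lock);
		request_in_flight = 1;
		/* ... zram_bvec_rw() would run here ... */
		request_in_flight = 0;
		pthread_rwlock_unlock(&init_lock);
	}
	return NULL;
}

/* models the zram_slot_free() work item: write side of init_lock */
static void *fake_slot_free(void *arg)
{
	for (int i = 0; i < 1000000; i++) {
		pthread_rwlock_wrlock(&init_lock);
		if (request_in_flight)
			fprintf(stderr, "BUG: overlapped an RW request\n");
		/* ... handle_pending_slot_free() would run here ... */
		pthread_rwlock_unlock(&init_lock);
	}
	return NULL;
}

int main(void)
{
	pthread_t rw, free_work;

	pthread_create(&rw, NULL, fake_make_request, NULL);
	pthread_create(&free_work, NULL, fake_slot_free, NULL);
	pthread_join(rw, NULL);
	pthread_join(free_work, NULL);
	puts("write side never overlapped the read side");
	return 0;
}

Of course a kernel rwsem can also sleep, which is exactly the objection
about the notify handler needing to be atomic; the sketch only shows the
reader/writer exclusion, not that taking the lock is safe in that context.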