On 03/01/2018 9:15 PM, tang.junhui@xxxxxxxxxx wrote:
> From: Tang Junhui <tang.junhui@xxxxxxxxxx>
>
> Hello Coly,
>
> Thanks for this series.
>
>> struct delayed_work writeback_rate_update in struct cached_dev is a delayed
>> work that calls update_writeback_rate() periodically (the interval is
>> defined by dc->writeback_rate_update_seconds).
>>
>> When a metadata I/O error happens on the cache device, the bcache error
>> handling routine bch_cache_set_error() will call bch_cache_set_unregister()
>> to retire the whole cache set. On the unregister code path,
>> cached_dev_free() calls cancel_delayed_work_sync(&dc->writeback_rate_update)
>> to stop this delayed work.
>>
>> dc->writeback_rate_update differs from the other delayed works in bcache.
>> In its routine update_writeback_rate(), this delayed work re-arms itself
>> to run again after a period of time. That means when
>> cancel_delayed_work_sync() returns, this delayed work can still be executed
>> after the several seconds defined by dc->writeback_rate_update_seconds.
>>
>> The problem is, after cancel_delayed_work_sync() returns, the cache set
>> unregister code path will eventually release the memory of struct cache_set.
>> Then the delayed work is scheduled to run, and inside its routine
>> update_writeback_rate() the already released cache set is accessed through
>> a NULL pointer. Now a NULL pointer dereference panic is triggered.
>>
> As far as I know, after cancel_delayed_work_sync() is called, if the work is
> running, cancel_delayed_work_sync() returns only AFTER the work has finished;
> otherwise, if the work has not run yet, cancel_delayed_work_sync() cancels it
> immediately. So I think it is safe for update_writeback_rate() to access the
> cache set resources. Please correct me if I am wrong.
>

Hi Junhui,

dc->writeback_rate_update is a special delayed work: it re-arms itself to
run again after several seconds by calling,

>> schedule_delayed_work(&dc->writeback_rate_update,
>>                       dc->writeback_rate_update_seconds * HZ);

I checked the workqueue code, and it seems cancel_delayed_work_sync() does
not prevent a delayed work from re-arming itself inside its worker routine.
In my test, around 5 seconds after cancel_delayed_work_sync() is called, a
NULL pointer dereference oops happens. Cache set memory is freed within 2
seconds after __cache_set_unregister() is called, so struct cache_set is
referenced inside __update_writeback_rate() after it has been freed, which
causes the NULL pointer dereference bug.

There are several delayed works in the bcache code, but
dc->writeback_rate_update is the only one that re-arms itself.

Coly Li

>> In order to avoid the above problem, this patch checks the cache set flags
>> in the delayed work routine update_writeback_rate(). If flag
>> CACHE_SET_STOPPING is set, this routine will quit without re-arming the
>> delayed work. Then the NULL pointer dereference panic won't happen after
>> the cache set is released.
>>
>> Signed-off-by: Coly Li <colyli@xxxxxxx>
>> ---
>>  drivers/md/bcache/writeback.c | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
>> index 0789a9e18337..745d9b2a326f 100644
>> --- a/drivers/md/bcache/writeback.c
>> +++ b/drivers/md/bcache/writeback.c
>> @@ -91,6 +91,11 @@ static void update_writeback_rate(struct work_struct *work)
>>      struct cached_dev *dc = container_of(to_delayed_work(work),
>>                                           struct cached_dev,
>>                                           writeback_rate_update);
>> +    struct cache_set *c = dc->disk.c;
>> +
>> +    /* quit directly if cache set is stopping */
>> +    if (test_bit(CACHE_SET_STOPPING, &c->flags))
>> +        return;
>>
>>      down_read(&dc->writeback_lock);
>>
>> @@ -100,6 +105,10 @@ static void update_writeback_rate(struct work_struct *work)
>>
>>      up_read(&dc->writeback_lock);
>>
>> +    /* do not schedule delayed work if cache set is stopping */
>> +    if (test_bit(CACHE_SET_STOPPING, &c->flags))
>> +        return;
>> +
>>      schedule_delayed_work(&dc->writeback_rate_update,
>>                            dc->writeback_rate_update_seconds * HZ);
>>  }
>> --
>> 2.15.1
>
> Thanks,
> Tang Junhui
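
To make the control flow of the patch easier to follow, here is a minimal
userspace sketch of a self-re-arming work routine guarded by a stopping flag.
It is not kernel code: every fake_* name is hypothetical, the "queue" is a
single pending bit, and the model is single-threaded, so it only illustrates
the two flag checks, not workqueue internals or any race with
cancel_delayed_work_sync().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cache_set; only the flag is modeled. */
struct fake_cache_set {
    bool stopping;        /* models the CACHE_SET_STOPPING flag bit */
};

/* Hypothetical stand-in for struct cached_dev. */
struct fake_cached_dev {
    struct fake_cache_set *c;
    bool work_pending;    /* models a queued delayed work item */
    int runs;             /* how many times the rate update really ran */
};

/* Models schedule_delayed_work(): mark the work as queued again. */
static void fake_schedule(struct fake_cached_dev *dc)
{
    dc->work_pending = true;
}

/* Models update_writeback_rate() with the two checks from the patch. */
static void fake_update_writeback_rate(struct fake_cached_dev *dc)
{
    assert(dc != NULL && dc->c != NULL);

    /* quit directly if cache set is stopping */
    if (dc->c->stopping)
        return;

    dc->runs++;           /* placeholder for the real rate calculation */

    /* do not re-arm if cache set started stopping meanwhile */
    if (dc->c->stopping)
        return;

    fake_schedule(dc);
}

/* Models timer expiry: run the routine once if it is queued.
 * Returns true if the routine was invoked. */
static bool fake_timer_tick(struct fake_cached_dev *dc)
{
    if (!dc->work_pending)
        return false;
    dc->work_pending = false;
    fake_update_writeback_rate(dc);
    return true;
}
```

In this model, once stopping is set, the next fake_timer_tick() still invokes
the routine but it returns early and leaves work_pending cleared, so no
further invocation can follow.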