> On Aug 24, 2023, at 20:49, 邹明哲 <mingzhe.zou@xxxxxxxxxxxx> wrote:
>
> From: Coly Li <colyli@xxxxxxx>
> Date: 2023-08-23 01:49:32
> To: Mingzhe Zou <mingzhe.zou@xxxxxxxxxxxx>
> Cc: bcache@xxxxxxxxxxxxxxxxxx, linux-bcache@xxxxxxxxxxxxxxx, zoumingzhe@xxxxxx
> Subject: Re: [PATCH] bcache: fixup init dirty data errors
>
>> Hi Mingzhe,
>>
>> On Tue, Aug 22, 2023 at 06:19:58PM +0800, Mingzhe Zou wrote:
>>> We found that after a long run, the dirty_data of the bcache device
>>> will have errors. This error cannot be eliminated unless the device
>>> is re-registered.
>>
>> Could you explain what the error was?
>>
> Hi, Coly
>
> We discovered that dirty_data was inaccurate a long time ago.
> After the writeback thread flushes all dirty data, dirty_data in sysfs
> still shows several KB to tens of MB of dirty data.
>
> At that time, I thought it might be a calculation error at runtime, but
> after reviewing the relevant code, no error was found.
>
> Last month, our online environment found that a certain device had more
> than 200G of dirty_data. This brought us back to the question.
>
>>>
>>> We also found that when reattaching after a detach, this error can
>>> accumulate.
>>>
>>
>> Could you elaborate how the error can accumulate?
>>
> I found that when dirty_data has an error, detaching and then re-attaching
> cannot eliminate it, so the error persists and accumulates.
>
> In bch_cached_dev_attach(), the attach may still fail after
> bch_sectors_dirty_init(), but dirty_data is not cleared on that failure
> path:
> ```
> 	bch_sectors_dirty_init(&dc->disk);
>
> 	ret = bch_cached_dev_run(dc);
> 	if (ret && (ret != -EBUSY)) {
> 		up_write(&dc->writeback_lock);
> 		/*
> 		 * bch_register_lock is held, bcache_device_stop() is not
> 		 * able to be directly called. The kthread and kworker
> 		 * created previously in bch_cached_dev_writeback_start()
> 		 * have to be stopped manually here.
> 		 */
> 		kthread_stop(dc->writeback_thread);
> 		dc->writeback_thread = NULL;
> 		cancel_writeback_rate_update_dwork(dc);
> 		pr_err("Couldn't run cached device %s",
> 		       dc->backing_dev_name);
> 		return ret;
> 	}
> ```
>
>>
>>> In bch_sectors_dirty_init(), all keys with inode <= d->id are counted
>>> again. This is wrong; we only need to count the keys of the current
>>> device.
>>>
>>> Fixes: b144e45fc576 ("bcache: make bch_sectors_dirty_init() to be multithreaded")
>>> Signed-off-by: Mingzhe Zou <mingzhe.zou@xxxxxxxxxxxx>
>>> ---
>>>  drivers/md/bcache/writeback.c | 7 ++++++-
>>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
>>> index 24c049067f61..71d0dabcbf9d 100644
>>> --- a/drivers/md/bcache/writeback.c
>>> +++ b/drivers/md/bcache/writeback.c
>>> @@ -983,6 +983,8 @@ void bch_sectors_dirty_init(struct bcache_device *d)
>>>  	struct cache_set *c = d->c;
>>>  	struct bch_dirty_init_state state;
>>>
>>> +	atomic_long_set(&d->dirty_sectors, 0);
>>> +
>>
>> The above change is not upstreamed yet. If you are fixing upstream code,
>> please avoid using d->dirty_sectors here.
>>
>
> Yes, dirty_sectors comes from the set of resize patches I submitted to the
> community earlier; those patches have not been merged upstream, so I will
> remove this line in v2.
>
> In fact, the dirty_sectors change is only a prerequisite for resize, and
> it could be promoted on its own first. It would greatly reduce the memory
> requirements of high-capacity devices.
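>
> The idea, very roughly, is to maintain one device-wide counter instead of
> summing per-stripe counts whenever the total is needed. A simplified
> sketch only, not the actual patch series; apart from the dirty_sectors
> field itself, the names and details here are illustrative and may differ
> in review:
> ```
> /*
>  * Simplified sketch, not the actual resize patches: one device-wide
>  * atomic counter, updated wherever dirty sectors are added or removed,
>  * so reading the total is a single atomic_long_read() rather than a
>  * walk over per-stripe counters.
>  */
> struct bcache_device {
> 	/* ... existing fields ... */
> 	atomic_long_t	dirty_sectors;	/* total dirty sectors on this device */
> };
>
> static inline uint64_t bcache_dev_sectors_dirty(struct bcache_device *d)
> {
> 	return atomic_long_read(&d->dirty_sectors);
> }
> ```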
>
>>
>>>  	/* Just count root keys if no leaf node */
>>>  	rw_lock(0, c->root, c->root->level);
>>>  	if (c->root->level == 0) {
>>> @@ -991,8 +993,11 @@ void bch_sectors_dirty_init(struct bcache_device *d)
>>>  		op.count = 0;
>>>
>>>  		for_each_key_filter(&c->root->keys,
>>> -				    k, &iter, bch_ptr_invalid)
>>> +				    k, &iter, bch_ptr_invalid) {
>>> +			if (KEY_INODE(k) != op.inode)
>>> +				continue;
>>>  			sectors_dirty_init_fn(&op.op, c->root, k);
>>> +		}
>>>
>>
>> Nice catch! IMHO, if I take the above change, setting d->dirty_sectors to
>> 0 might be unnecessary in the ideal case, am I right?
>>
>
> bch_cached_dev_attach() may still fail and exit after
> bch_sectors_dirty_init(), so I think clearing it to 0 is necessary.

Copied. Thanks for the information, I will take the v2 patch.

Coly Li