On 2017/7/13 下午12:13, Eric Wheeler wrote: > On Tue, 11 Jul 2017, Coly Li wrote: > >> On 2017/7/11 下午1:39, tang.junhui@xxxxxxxxxx wrote: >>> Compared to bucket depletion, resulting in hanging dead, >>> It is worthy to consumes a little time to update the bucket_in_use. >>> If you have any better solution, please show to us, >>> We should solve it as soon as possible, not wait for it forever. >>> >> >> I also test this patch on a cache device with 4x3.8TB size, all buckets >> iteration takes around 40-50ms. If the iteration needs to hold >> bucket_lock of cache set, it is very probably to introduce a huge I/O >> latency in period of every 30 seconds. >> >> For database people, this is not good news. > > > Hi Tang, > > I'm waiting to queue this patch pending your response to Coly. > > Please send a v2 when you're ready. Eric, I guess Tang is working on the I/O hang issue during back ground garbage collection running. From discussion from other email thread, it seems a regular I/O request gets hung for 10+ second in some cases. Maybe that issue is more urgent than this one. >From my personal opinion, updating bucket_in_use is for acting garbage collection. If number of bucket in use is not updated in time, garbage collection won't start due to old bucket_in_use still beyond CUTOFF_WRITEBACK_SYNC. We may maintain an atomic counter per-cache set for dirty buckets, and update it at some locations when allocating or reclaiming bucket. This counter is unnecessary to be very accurate, just accurate enough for should_writeback() working correctly. I am also looking at it for a better solution as well. Coly >> >> Coly >> >> >>> >>> >>> >>> 发件人: Coly Li <i@xxxxxxx> >>> 收件人: linux-block@xxxxxxxxxxxxxxx, Tang Junhui >>> <tang.junhui@xxxxxxxxxx>, >>> 抄送: bcache@xxxxxxxxxxxxxxxxxx, linux-bcache@xxxxxxxxxxxxxxx, >>> hch@xxxxxxxxxxxxx, axboe@xxxxxxxxx >>> 日期: 2017/07/11 13:06 >>> 主题: Re: [PATCH 12/19] bcache: update bucket_in_use periodically >>> 发件人: linux-bcache-owner@xxxxxxxxxxxxxxx >>> ------------------------------------------------------------------------ >>> >>> >>> >>> On 2017/7/1 上午4:43, bcache@xxxxxxxxxxxxxxxxxx wrote: >>>> From: Tang Junhui <tang.junhui@xxxxxxxxxx> >>>> >>>> bucket_in_use is updated in gc thread which triggered by invalidating or >>>> writing sectors_to_gc dirty data, It's been too long, Therefore, when we >>>> use it to compare with the threshold, it is often not timely, which leads >>>> to inaccurate judgment and often results in bucket depletion. >>>> >>>> Signed-off-by: Tang Junhui <tang.junhui@xxxxxxxxxx> >>>> --- >>>> drivers/md/bcache/btree.c | 29 +++++++++++++++++++++++++++-- >>>> 1 file changed, 27 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c >>>> index 866dcf7..77aa20b 100644 >>>> --- a/drivers/md/bcache/btree.c >>>> +++ b/drivers/md/bcache/btree.c >>>> @@ -90,6 +90,8 @@ >>>> #define MAX_NEED_GC 64 >>>> #define MAX_SAVE_PRIO 72 >>>> >>>> +#define GC_THREAD_TIMEOUT_MS (30 * 1000) >>>> + >>>> #define PTR_DIRTY_BIT (((uint64_t) 1 >>> << 36)) >>>> >>>> #define PTR_HASH(c, k) >>> \ >>>> @@ -1760,6 +1762,23 @@ static void bch_btree_gc(struct cache_set *c) >>>> bch_moving_gc(c); >>>> } >>>> >>>> +void bch_update_bucket_in_use(struct cache_set *c) >>>> +{ >>>> + struct cache *ca; >>>> + struct bucket *b; >>>> + unsigned i; >>>> + size_t available = 0; >>>> + >>>> + for_each_cache(ca, c, i) { >>>> + for_each_bucket(b, ca) { >>>> + if (!GC_MARK(b) || >>> GC_MARK(b) == GC_MARK_RECLAIMABLE) >>>> + >>> available++; >>>> + } >>>> + } >>>> + >>> >>> bucket_lock of cache set should be held before accessing buckets. >>> >>> >>>> + c->gc_stats.in_use = (c->nbuckets - available) * 100 >>> / c->nbuckets; >>>> +} >>>> + >>>> static bool gc_should_run(struct cache_set *c) >>>> { >>>> struct cache *ca; >>>> @@ -1778,10 +1797,16 @@ static bool gc_should_run(struct cache_set *c) >>>> static int bch_gc_thread(void *arg) >>>> { >>>> struct cache_set *c = arg; >>>> + long ret; >>>> + unsigned long timeout = >>> msecs_to_jiffies(GC_THREAD_TIMEOUT_MS); >>>> >>>> while (1) { >>>> - wait_event_interruptible(c->gc_wait, >>>> - >>> kthread_should_stop() || gc_should_run(c)); >>>> + ret = >>> wait_event_interruptible_timeout(c->gc_wait, >>>> + >>> kthread_should_stop() || gc_should_run(c), timeout); >>>> + if (!ret) { >>>> + >>> bch_update_bucket_in_use(c); >>>> + continue; >>> >>> A continue here will ignore status returned from kthread_should_stop(), >>> which might not be expected behavior. >>> >>> >>>> + } >>>> >>>> if (kthread_should_stop()) >>>> break; >>>> >>> >>> Iterating all buckets from the cache set requires bucket_lock to be >>> held. Waiting for bucket_lock may take quite a long time for either >>> bucket allocating code or bch_gc_thread(). What I concern is, this patch >>> may introduce bucket allocation delay in period of GC_THREAD_TIMEOUT_MS. >>> >>> We need to find out a way to avoid such a performance regression. >>> >>> -- >>> Coly Li >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in >>> the body of a message to majordomo@xxxxxxxxxxxxxxx >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> >>> >> >> >> -- >> Coly Li -- Coly Li -- To unsubscribe from this list: send the line "unsubscribe linux-bcache" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html