On Thu, May 11, 2023 at 04:10:51PM -0700, Eric Wheeler wrote:
> On Tue, 9 May 2023, Adriano Silva wrote:
> > I got the parameters with this script, although I also checked /sys;
> > doing the math, everything is right.
> > 
> > https://gist.github.com/damoxc/6267899
> 
> Thanks. prio_stats gives me what I'm looking for. More below.
> 
> > On Monday, May 8, 2023 at 21:42:26 BRT, Eric Wheeler <bcache@xxxxxxxxxxxxxxxxxx> wrote:
> > On Thu, 4 May 2023, Coly Li wrote:
> > > > On May 3, 2023, at 04:34, Eric Wheeler <bcache@xxxxxxxxxxxxxxxxxx> wrote:
> > > > 
> > > > On Thu, 20 Apr 2023, Adriano Silva wrote:
> > > >> I continue to investigate the situation. There is actually a performance
> > > >> gain when the bcache device is only half filled versus full. There is a
> > > >> reduction and greater stability in the latency of direct writes and this
> > > >> improves my scenario.
> > > > 
> > > > Hi Coly, have you been able to look at this?
> > > > 
> > > > This sounds like a great optimization and Adriano is in a place to test
> > > > this now and report his findings.
> > > > 
> > > > I think you said this should be a simple hack to add early reclaim, so
> > > > maybe you can throw a quick patch together (even a rough first pass with
> > > > hard-coded reclaim values).
> > > > 
> > > > If we can get back to Adriano quickly then he can test while he has an
> > > > easy-to-reproduce environment. Indeed, this could benefit all bcache
> > > > users.
> > > 
> > > My current to-do list on hand is a little bit long. Yes, I'd like to and
> > > plan to do it, but the response time cannot be estimated.
> > 
> > I understand. Maybe I can put something together if you can provide some
> > pointers, since you are _the_ expert on bcache these days. Here are a few
> > questions:
> > 
> > Q's for Coly:
> 
> It _looks_ like bcache frees buckets until the `ca->free_inc` list is
> full, but it could go further. Consider this hypothetical:
> 

Hi Eric,

Bcache starts to invalidate buckets when ca->free_inc is not full, and
selects the buckets to invalidate according to the replacement policy. It
then keeps calling bch_invalidate_one_bucket() and pushing the invalidated
buckets into ca->free_inc until that list is full or there are no more
candidate buckets to invalidate.

> https://elixir.bootlin.com/linux/v6.4-rc1/source/drivers/md/bcache/alloc.c#L179
> 
> static void invalidate_buckets_lru(struct cache *ca)
> {
> ...
> +        int available = 0;
> +        mutex_lock(&ca->set->bucket_lock);

The mutex_lock()/unlock here may introduce a deadlock: by the time
invalidate_buckets() is called, allocator_wait() has already returned with
the mutex bucket_lock held again, so acquiring bucket_lock again here would
self-deadlock.

> +        for_each_bucket(b, ca) {
> +                if (!GC_SECTORS_USED(b))
> +                        unused++;
> +                if (GC_MARK(b) == GC_MARK_RECLAIMABLE)
> +                        available++;
> +                if (GC_MARK(b) == GC_MARK_DIRTY)
> +                        dirty++;
> +                if (GC_MARK(b) == GC_MARK_METADATA)
> +                        meta++;
> +        }
> +        mutex_unlock(&ca->set->bucket_lock);
> +
> -        while (!fifo_full(&ca->free_inc)) {
> +        while (!fifo_full(&ca->free_inc) || available < TARGET_AVAIL_BUCKETS) {

If ca->free_inc is full and you still try to invalidate more candidate
buckets, the next bucket selected by heap_pop will be invalidated in
bch_invalidate_one_bucket() and pushed into ca->free_inc. But ca->free_inc
is already full, so the next time invalidate_buckets_lru() is called, this
already-invalidated bucket will be accessed and checked again in
for_each_bucket(). This is just a waste of CPU cycles. Furthermore,
__bch_invalidate_one_bucket() will increase the bucket's gen number and
its pin counter.
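For reference, here is roughly what those two helpers look like in
drivers/md/bcache/alloc.c (quoted from memory of the v6.x code, with
comments added for this discussion, so please double-check against the
actual tree):

static void __bch_invalidate_one_bucket(struct cache *ca, struct bucket *b)
{
        lockdep_assert_held(&ca->set->bucket_lock);
        BUG_ON(GC_MARK(b) && GC_MARK(b) != GC_MARK_RECLAIMABLE);

        if (GC_SECTORS_USED(b))
                trace_bcache_invalidate(ca, b - ca->buckets);

        /* bump the generation; existing pointers into this bucket go stale */
        bch_inc_gen(ca, b);
        b->prio = INITIAL_PRIO;
        /* pinned buckets are skipped by bch_can_invalidate_bucket() */
        atomic_inc(&b->pin);
}

static void bch_invalidate_one_bucket(struct cache *ca, struct bucket *b)
{
        __bch_invalidate_one_bucket(ca, b);

        /* fifo_push() just returns false if ca->free_inc is already full */
        fifo_push(&ca->free_inc, b - ca->buckets);
}

Note that fifo_push() silently fails when ca->free_inc is already full, so
the gen and pin updates still happen even though the bucket never lands on
the list.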
Doing this without pushing the bucket into ca->free_inc makes me feel
uncomfortable.

>                 if (!heap_pop(&ca->heap, b, bucket_min_cmp)) {
>                         /*
>                          * We don't want to be calling invalidate_buckets()
>                          * multiple times when it can't do anything
>                          */
>                         ca->invalidate_needs_gc = 1;
>                         wake_up_gc(ca->set);
>                         return;
>                 }
> 
>                 bch_invalidate_one_bucket(ca, b);    <<<< this does the work
>         }
> 
> (TARGET_AVAIL_BUCKETS is a placeholder; ultimately it would be a sysfs
> setting, probably a percentage.)
> 
> Coly, would this work?

It should work for some kinds of workloads, but it will draw complaints
from other kinds of workloads.

> Can you think of any serious issues with this (besides the fact that
> for_each_bucket is slow)?

I don't feel this change can help bcache invalidate clean buckets without
extra cost. It is not simple for me to suggest a solution without careful
thought; this is a trade-off between gain and cost...

Thanks.

[snipped]

--
Coly Li