Re: Bcache stuck at writeback of a key, consuming 100% CPU, not possible to detach

On Mon, Aug 31, 2015 at 06:45:31PM +0200, Vojtech Pavlik wrote:
> On Mon, Aug 31, 2015 at 07:04:29AM -0800, Kent Overstreet wrote:
> 
> > I suspect there's two different bugs here.
> > 
> >  - I'm starting to suspect there's a bug in the dirty data accounting, and it's
> >    getting out of sync - i.e. reading 2.8 GB or whatever when it's actually 0.
> >    That would explain it spinning when there actually isn't any work for it to
> >    do.
> 
> That may be the case, but doesn't quite match my observation. Using this
> command line:
> 
> 	echo 2 > writeback_percent; echo 0 > writeback_percent
> 	echo 100000 > writeback_rate; echo none > cache_mode
> 	while true; do
> 		if top -b -n 1 | grep 'R.*bcache_write'; then
> 			date; echo looping
> 			echo writeback > cache_mode; echo 40 > writeback_percent
> 			sleep 1
> 			echo 2 > writeback_percent; echo 0 > writeback_percent
> 			echo 100000 > writeback_rate; echo none > cache_mode
> 			echo fixed; cat /sys/block/bcache0/bcache/dirty_data
> 		fi
> 	done
> 
> I managed to get down to about 200 MB of dirty data reported. If the
> reporting was off by a fixed offset, I wouldn't be getting the 100% CPU
> and running bcache_writeback at 5GB of dirty data already.
> 
> At least unless the accounting of dirty data is very wrong and
> fluctuating.

Huh, that's good to know then.

> >  - with a large enough amount of data, the 30 second writeback_delay may be
> >    insufficient; if it takes longer than that just to scan the entire keyspace
> >    it'll never get a chance to sleep. Try bumping writeback_delay up and see if
> >    that helps.
> 
> That shouldn't be the case when the amount of dirty data is below a
> gigabyte, or is it?

No - it has to scan the entire btree, cached _and_ dirty data - so the scanning
gets expensive when you have lots of clean cached data and very little dirty
data, so it's supposed to ratelimit to no more than one scan every 30 seconds
(IIRC; that algorithm has gone through a couple different iterations). But if
it's taking more than 30 seconds to complete one scan... well, you see the
problem?
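
To put rough numbers on that, here is a toy model of the arithmetic. It is
only a sketch under a simplifying assumption: the thread does one full scan
and then idles for whatever is left of the writeback_delay window. The
function name idle_seconds and the scan times are made up for illustration;
this is not the kernel's actual ratelimiting code.

```shell
#!/bin/sh
# Toy model, NOT the kernel's algorithm: assume one full btree scan per
# cycle, followed by sleeping for the remainder of the delay window.
# Usage: idle_seconds DELAY SCAN_TIME   (both in whole seconds)
idle_seconds() {
    delay=$1
    scan=$2
    if [ "$scan" -ge "$delay" ]; then
        # The scan overran the window, so there is nothing left to sleep:
        # the thread goes straight into the next scan and looks like a spin.
        echo 0
    else
        echo $((delay - scan))
    fi
}

idle_seconds 30 12    # scan fits inside the default 30 second window
idle_seconds 30 45    # scan takes longer than the window
idle_seconds 120 45   # same scan time with a larger delay
```

In this model the middle case never sleeps at all, which is what 100% CPU
in the writeback thread would look like, and raising the delay past the
scan time is what gives it idle time again.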