On Mon, Aug 31, 2015 at 04:39:37PM +0200, Emmanuel Florac wrote:

> > Then I noticed that during those situations where the system was
> > slow, and processes stuck in D, bcache_writeback CPU usage was
> > soaring all the way to saturating a core,
>
> In my experience, bcache_writeback stays in Wait state, therefore
> always saturate a core: any machine I'm running bcache on has a
> constant load of 1.00 even when completely idle.

In this situation, I see it in an "R" state.

> > showing this backtrace,
> > spending time in refill_keybuf_fn():
> <snip>
> > Changing the configuration to writeback_percent=40 helped. For some
> > time at least.
> >
> > When the issue returned, without any further changes to the system, I
> > started investigating deeper. Since writeback_percent was large, also
> > the amount of dirty data was large.
>
> In my case, when dirty data reaches the upper limit (i.e. when the
> amount of dirty data equals the writeback_percent * backing device
> size), and it occurs regularly, the system just freezes...

That may be a similar symptom.

> > Before poking deeper, I decided I
> > want to clear the dirty data entirely. So I set the system to
> > cache_mode=writethrough and watched the dirty data trickle to the
> > backing device.
> >
> > But then it stopped at 2.8G and didn't progress any further. The
> > bcache_writeback thread was at 100% CPU usage again and system was
> > near unusable. Reverting to writeback made the system responsive
> > again.
>
> The bcache_writeback stays at 100% _even_ when in writethrough mode,
> alas. So this looks normal. However dirty_data definitely should drop
> to zero...

This most certainly isn't normal. The ftrace shows it's spinning in a
loop doing nothing useful.

> <snip>
> > I consider this a rather serious bug, even though it is most likely
> > caused by the cache device being corrupted. Any hints?
>
> Did you check what "smartctl -a" has to say about your backing device,
> and maybe your spinning drives too?
> Just in case...

Yes, they're fine. The backing device is a RAID5.

--
Vojtech Pavlik
Director SuSE Labs
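
For reference, the stall described in this thread (dirty_data stopping at a non-zero value while bcache_writeback keeps a core busy) can be checked for by polling sysfs. The following is only a minimal sketch, not something taken from the thread. It assumes the device is bcache0, that dirty_data is exposed at /sys/block/bcache0/bcache/dirty_data, and that the value is printed in human-readable form with 1024-based suffixes such as "2.8G". The helper names parse_size, is_stalled and watch are hypothetical.

```python
import time
from pathlib import Path

# Assumption: the bcache device is bcache0; adjust for your system.
SYSFS_DIR = Path("/sys/block/bcache0/bcache")

# Assumption: dirty_data uses binary (1024-based) unit suffixes,
# e.g. "2.8G" or "512.0k". Values are therefore approximate.
UNITS = {"k": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a string such as '2.8G' to an approximate byte count."""
    text = text.strip()
    if text and text[-1] in UNITS:
        return int(float(text[:-1]) * UNITS[text[-1]])
    return int(float(text))


def is_stalled(samples, min_progress=0):
    """True if dirty data is non-zero and did not shrink over the samples."""
    if len(samples) < 2 or samples[-1] == 0:
        return False
    return samples[0] - samples[-1] <= min_progress


def watch(interval=10.0, count=6):
    """Read dirty_data `count` times, `interval` seconds apart."""
    samples = []
    for i in range(count):
        if i:
            time.sleep(interval)
        samples.append(parse_size((SYSFS_DIR / "dirty_data").read_text()))
    return is_stalled(samples)
```

Calling watch() on the affected machine after switching to cache_mode=writethrough would return True if dirty_data stays put over the sampling window. Because the sysfs value is rounded, slow progress can look like a stall, so a longer window is safer.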