Re: [PATCH v7 4/9] bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags

Michael Lyle <mlyle@xxxxxxxx> · Tue, 27 Feb 2018 10:07:05 -0800

Hi Coly Li--

On 02/27/2018 08:55 AM, Coly Li wrote:
> When too many I/Os fail on the cache device, bch_cache_set_error() is called
> in the error handling code path to retire the whole problematic cache set. If
> new I/O requests keep arriving and taking the dc->count refcount, the cache
> set cannot be retired immediately, which is a problem.
> 
> Furthermore, several kernel threads and self-armed kernel works may still
> be running after bch_cache_set_error() is called. It can take quite a while
> for them to stop, or they may never stop at all. They also prevent the
> cache set from being retired.

It's too bad this is necessary. I wish the IO layer could latch the error
for us in some meaningful way, instead of us having to do it ourselves
(and filesystems, etc., each having to do something similar to avoid
continuously hitting IO timeouts).  That said, the code looks good.
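
For anyone following along, the latch described in the commit message can
be pictured roughly as below. This is only a userspace sketch under my own
assumptions, not the code from the patch: the struct and every sketch_*
name are hypothetical, and C11 atomics stand in for the flag that the patch
keeps in the struct cache_set flags. It shows the three pieces: the error
path sets the flag once, the submission path refuses new I/O so the
refcount can drain, and worker loops poll the flag so they stop on their
own.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct cache_set: only the latch and a
 * refcount are modelled here. */
struct cache_set_sketch {
	atomic_bool io_disabled; /* plays the role of CACHE_SET_IO_DISABLE */
	atomic_int inflight;     /* plays the role of the refcount new I/O takes */
};

static void sketch_init(struct cache_set_sketch *c)
{
	atomic_init(&c->io_disabled, false);
	atomic_init(&c->inflight, 0);
}

/* Error path: latch the flag. Returns true only for the caller that
 * actually set it, so the retire work is started once. */
static bool sketch_set_error(struct cache_set_sketch *c)
{
	return !atomic_exchange(&c->io_disabled, true);
}

/* Submission path: fail new I/O immediately once the flag is latched,
 * instead of taking another reference. (Simplification: the check and the
 * increment are not one atomic step; real code has to close that window.) */
static int sketch_submit_io(struct cache_set_sketch *c)
{
	if (atomic_load(&c->io_disabled))
		return -EIO;
	atomic_fetch_add(&c->inflight, 1);
	return 0;
}

/* Completion path: drop the reference taken at submission. */
static void sketch_complete_io(struct cache_set_sketch *c)
{
	int prev = atomic_fetch_sub(&c->inflight, 1);

	assert(prev > 0); /* completing more I/O than was submitted is a bug */
}

/* Worker threads and self-rearming work check this on every iteration and
 * exit (or decline to re-arm) when it returns false. */
static bool sketch_worker_should_run(struct cache_set_sketch *c)
{
	return !atomic_load(&c->io_disabled);
}

/* The set can be retired once it is latched and in-flight I/O has drained. */
static bool sketch_can_retire(struct cache_set_sketch *c)
{
	return atomic_load(&c->io_disabled) && atomic_load(&c->inflight) == 0;
}
```

The point of the latch is that after sketch_set_error() nothing new can pin
the set, so sketch_can_retire() becomes true as soon as the I/O already in
flight completes.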

Reviewed-by: Michael Lyle <mlyle@xxxxxxxx>