On 2017/7/12 上午8:34, tang.junhui@xxxxxxxxxx wrote: >> If a read bio to cache device gets failed, bcache will try to recovery it >> by forward the read bio to backing device. If backing device responses >> read request successfully then the bio contains data from backing device >> will be returned to uppper layer. >> >> The recovery effort in cached_dev_read_error() is not correct, and there >> is report that corrupted data may returned when a dirty cache device goes >> offline during reading I/O. >> >> For writeback cache mode, before dirty data are wrote back to backing >> device, data blocks on backing device are not updated and consistent. If >> a dirty cache device dropped and a read bio gets failed, bcache will >> return its stale version from backing device. This is mistaken behavior >> that applications don't expected, especially for data base workload. >> >> This patch fixes the issue by only permit recoverable I/O when cached >> device is in writethough mode, and s->recoverable is set. For other cache >> mode, recovery I/O failure by reading backing device does not make sense, >> bache just simply returns -EIO immediately. >> >> Reported-by: Arne Wolf <awolf@xxxxxxxxxx> >> Signed-off-by: Coly Li <colyli@xxxxxxx> >> --- >> drivers/md/bcache/request.c | 5 ++++- >> 1 file changed, 4 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c >> index 019b3df9f1c6..6edacac9b00d 100644 >> --- a/drivers/md/bcache/request.c >> +++ b/drivers/md/bcache/request.c >> @@ -702,8 +702,11 @@ static void cached_dev_read_error(struct closure > *cl) >> { >> struct search *s = container_of(cl, struct search, cl); >> struct bio *bio = &s->bio.bio; >> + struct cached_dev *dc = container_of(s->d, struct > cached_dev, disk); >> + unsigned mode = cache_mode(dc, NULL); >> >> - if (s->recoverable) { >> + if (s->recoverable && >> + (mode == CACHE_MODE_WRITETHROUGH)) { >> /* Retry from the backing device: */ >> trace_bcache_read_retry(s->orig_bio); >> >> -- > > No, Poly, > Cache mode can change dynamically, > So maybe the bcache device is still dirty > when the cache mode is changed to CACHE_MODE_WRITETHROUGH, > so I think this patch can not solve this question fundamentally. Hi Junnhui, Yes I agree with you, if cache mode is switched from writeback to writethrough or writearound mode, and then the cache device is disconnected, this patch does not help us. When I replied question from Kai Krakow, I expressed similar opinion as yours. This patch is a best-effort fix, it is very necessary for data base applications which always use writeback mode and not switch to other mode during all their online time. Thanks. Coly