On 2018/5/8 6:41 AM, Eric Wheeler wrote:
> On Sun, 6 May 2018, Coly Li wrote:
>
>> On 2018/5/6 3:28 AM, Eric Wheeler wrote:
>>> On Sun, 6 May 2018, Coly Li wrote:
>>>
>>>> On 2018/4/27 4:32 AM, Eric Wheeler wrote:
>>>>>> Hi Coly,
>>>>>>
>>>>>> I'm sure you've been busy with the v4.17 merge but I thought I
>>>>>> would check in:
>>>>>>
>>>>>> Have you had a chance to look at this? It is an opportunity to fix this
>>>>>> 4k bug for the future since we can still reproduce the error.
>>>>>>
>>>>>>>>> bcache: bch_count_io_errors() dm-6: IO error on reading from cache, recovering <<<
>>>>>>
>>>>>> What do you think, is there data corruption exposure here since 4.1.49
>>>>>> still has the dirty-cache-recorvery bug?
>>>>>
>>>>> I just noticed that "bcache: only permit to recovery read error when cache
>>>>> device is clean" is in v4.1.49, but would this recover gracefully in the
>>>>> 4k error situation?
>>>>>
>>>>>> Also, would your failure-recovery patch series address this type of
>>>>>> failure?
>>>>
>>>> Hi Eric,
>>>>
>>>> I just find a time slot to compose a patch checking 4K alignment of I/Os
>>>> to backing device. After testing and first glance at the messages, I
>>>> will post it out.
>>>>
>>>
>>> Thank you Coly, I appreciate your help!
>>
>> Hi Eric,
>>
>> Please check the attached patch, it checks bcache key offset, if the
>> offset is not 4KB aligned, a call trace will be printed out.
>>
>> I also run it with my own hardware, here is some information I may share.
>>
>> It seems bcache just tries to cache bio with any offset, no 4K alignment
>> required. When I use fio with directIO, non-4K aligned bio can be sent
>> into bcache code and it is just cached.
>>
>> I can see numerous call trace when I use directIO with 512B/1KB/2KB
>> block size. But if I use 4KB block size in fio, or set block alignment
>> to 4KB, no warning call trace printed, not at all.
>>
>> Then when I set block alignment to 2K in fio, even blocksize is 4KB, I
>> can see non-4k-aligned warning.
>>
>> Therefore I guess the most probably reason is, the upper layer code
>> sends non-4k-aligned bio into bcache code.
>>
>> I also tried to set bcache block size to 4K with make-bcache -w, when
>> fio blocksize >= 4KB, no non-4k-aligned warning. But fio does not work
>> if its blocksize < bcache block size, I am not sure whether setting
>> bcache block size to 4K works to your situation.
>>
>
> Hi Coly,
>
> We are already running 4k blocks with bcache. We tried your new patch, but
> there aren't any new backtraces. This is the only information we get:
>
> [ 1174.675229] sd 0:0:0:1: [sdb] Unaligned block number requested: sector_size=4096, block=15724561783, blk_rq=41
> [ 1174.676077] sd 0:0:0:2: [sdc] Unaligned block number requested: sector_size=4096, block=353041024, blk_rq=23
> [ 1174.676958] bcache: bch_count_io_errors() dm-6: IO error on reading from cache, recovering
> [ 1174.710818] block drbd8065: read: error=-5 s=19281408s
> [ 1174.711654] block drbd8065: Local IO failed in drbd_endio_read_sec_final.
> [ 1175.278348] sd 0:0:0:1: [sdb] Unaligned block number requested: sector_size=4096, block=15725385535, blk_rq=97
> [ 1175.279612] sd 0:0:0:2: [sdc] Unaligned block number requested: sector_size=4096, block=387084760, blk_rq=31
> [ 1175.280834] bcache: bch_count_io_errors() dm-6: IO error on reading from cache, recovering
> [ 1175.282232] block drbd8065: read: error=-5 s=19391360s
> [ 1175.283335] block drbd8065: Local IO failed in drbd_endio_read_sec_final.
>
>
> Note that we run this without your original patch, so the original
> backtrace still stands.
>
> Does this mean there is something in my cache that is less than 4k?

Hi Eric,

The last patch I posted checked LBA alignment for requests to the backing
device. It seems none of the checks found a non-4k-aligned LBA.
Let me compose an updated version that checks 4k alignment of the LBA when
a bio is submitted to the backing device. Then let's see what happens.

Thanks for the information!

Coly Li
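
For anyone following the thread, the alignment condition under discussion can be checked directly against the log lines quoted above. The following is a minimal Python sketch, not the kernel's or the patch's actual check. It assumes that `block` and `blk_rq` in the "Unaligned block number requested" messages are counted in 512-byte sectors, which the log itself does not state. The names `check_alignment`, `parse_line` and `report` are hypothetical helpers introduced only for this illustration.

```python
import re

# Pattern for the "Unaligned block number requested" lines quoted in this thread.
LOG_RE = re.compile(
    r"\[(?P<dev>sd[a-z]+)\] Unaligned block number requested: "
    r"sector_size=(?P<sector_size>\d+), block=(?P<block>\d+), "
    r"blk_rq=(?P<count>\d+)"
)


def check_alignment(block, count, sector_size=4096):
    """Return (start_aligned, length_aligned).

    ASSUMPTION: block and count are in 512-byte units, so a 4096-byte
    device sector spans sector_size // 512 == 8 of them.
    """
    per_sector = sector_size // 512
    return block % per_sector == 0, count % per_sector == 0


def parse_line(line):
    """Extract (dev, sector_size, block, count), or None for other lines."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    return (
        m.group("dev"),
        int(m.group("sector_size")),
        int(m.group("block")),
        int(m.group("count")),
    )


def report(log):
    """Classify every matching line in a multi-line log excerpt."""
    results = []
    for line in log.splitlines():
        parsed = parse_line(line)
        if parsed is None:
            continue  # bcache/drbd lines carry no block/blk_rq fields
        dev, sector_size, block, count = parsed
        start_ok, length_ok = check_alignment(block, count, sector_size)
        results.append((dev, block, count, start_ok, length_ok))
    return results


# Log excerpt copied from the message quoted above.
LOG = """\
[ 1174.675229] sd 0:0:0:1: [sdb] Unaligned block number requested: sector_size=4096, block=15724561783, blk_rq=41
[ 1174.676077] sd 0:0:0:2: [sdc] Unaligned block number requested: sector_size=4096, block=353041024, blk_rq=23
[ 1174.676958] bcache: bch_count_io_errors() dm-6: IO error on reading from cache, recovering
[ 1175.278348] sd 0:0:0:1: [sdb] Unaligned block number requested: sector_size=4096, block=15725385535, blk_rq=97
[ 1175.279612] sd 0:0:0:2: [sdc] Unaligned block number requested: sector_size=4096, block=387084760, blk_rq=31
"""

if __name__ == "__main__":
    for dev, block, count, start_ok, length_ok in report(LOG):
        print(f"{dev}: block={block} start_aligned={start_ok} "
              f"blk_rq={count} length_aligned={length_ok}")
```

Under that 512-byte assumption, the two sdc requests start on a 4K boundary but none of the four requests has a length that is a multiple of 8 sectors, and the two sdb requests are misaligned in both start and length. This only classifies the logged numbers; it says nothing about which layer produced the requests.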