On Sun, 6 May 2018, Coly Li wrote:
> On 2018/5/6 3:28 AM, Eric Wheeler wrote:
> > On Sun, 6 May 2018, Coly Li wrote:
> >
> >> On 2018/4/27 4:32 AM, Eric Wheeler wrote:
> >>>> Hi Coly,
> >>>>
> >>>> I'm sure you've been busy with the v4.17 merge but I thought I
> >>>> would check in:
> >>>>
> >>>> Have you had a chance to look at this? It is an opportunity to fix this
> >>>> 4k bug for the future since we can still reproduce the error.
> >>>>
> >>>>>>> bcache: bch_count_io_errors() dm-6: IO error on reading from cache, recovering <<<
> >>>>
> >>>> What do you think, is there data corruption exposure here since 4.1.49
> >>>> still has the dirty-cache-recorvery bug?
> >>>
> >>> I just noticed that "bcache: only permit to recovery read error when cache
> >>> device is clean" is in v4.1.49, but would this recover gracefully in the
> >>> 4k error situation?
> >>>
> >>>> Also, would your failure-recovery patch series address this type of
> >>>> failure?
> >>
> >> Hi Eric,
> >>
> >> I just find a time slot to compose a patch checking 4K alignment of I/Os
> >> to backing device. After testing and first glance at the messages, I
> >> will post it out.
> >>
> >
> > Thank you Coly, I appreciate your help!
>
> Hi Eric,
>
> Please check the attached patch, it checks bcache key offset, if the
> offset is not 4KB aligned, a call trace will be printed out.
>
> I also run it with my own hardware, here is some information I may share.
>
> It seems bcache just tries to cache bio with any offset, no 4K alignment
> required. When I use fio with directIO, non-4K aligned bio can be sent
> into bcache code and it is just cached.
>
> I can see numerous call trace when I use directIO with 512B/1KB/2KB
> block size. But if I use 4KB block size in fio, or set block alignment
> to 4KB, no warning call trace printed, not at all.
>
> Then when I set block alignment to 2K in fio, even blocksize is 4KB, I
> can see non-4k-aligned warning.
>
> Therefore I guess the most probably reason is, the upper layer code
> sends non-4k-aligned bio into bcache code.
>
> I also tried to set bcache block size to 4K with make-bcache -w, when
> fio blocksize >= 4KB, no non-4k-aligned warning. But fio does not work
> if its blocksize < bcache block size, I am not sure whether setting
> bcache block size to 4K works to your situation.
>

Hi Coly,

We are already running 4k blocks with bcache. We tried your new patch,
but there aren't any new backtraces. This is the only information we get:

[ 1174.675229] sd 0:0:0:1: [sdb] Unaligned block number requested: sector_size=4096, block=15724561783, blk_rq=41
[ 1174.676077] sd 0:0:0:2: [sdc] Unaligned block number requested: sector_size=4096, block=353041024, blk_rq=23
[ 1174.676958] bcache: bch_count_io_errors() dm-6: IO error on reading from cache, recovering
[ 1174.710818] block drbd8065: read: error=-5 s=19281408s
[ 1174.711654] block drbd8065: Local IO failed in drbd_endio_read_sec_final.
[ 1175.278348] sd 0:0:0:1: [sdb] Unaligned block number requested: sector_size=4096, block=15725385535, blk_rq=97
[ 1175.279612] sd 0:0:0:2: [sdc] Unaligned block number requested: sector_size=4096, block=387084760, blk_rq=31
[ 1175.280834] bcache: bch_count_io_errors() dm-6: IO error on reading from cache, recovering
[ 1175.282232] block drbd8065: read: error=-5 s=19391360s
[ 1175.283335] block drbd8065: Local IO failed in drbd_endio_read_sec_final.

Note that we run this without your original patch, so the original
backtrace still stands.

Does this mean there is something in my cache that is less than 4k?

-Eric

--
Eric Wheeler

> Just for your information.
>
> Coly Li
>
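
For anyone following the thread, here is a minimal sketch of the alignment arithmetic behind the "Unaligned block number requested" messages above. It is not part of Coly's patch and not kernel code. It assumes, without confirmation from the thread, that `block` and `blk_rq` in those messages are counted in 512-byte sectors, so a 4096-byte device needs both to be multiples of 8. The helper name `check_line` and the regular expression are hypothetical and written only for this illustration.

```python
import re

# Assumption: block and blk_rq are reported in 512-byte sectors.
KERNEL_SECTOR_BYTES = 512

# Hypothetical pattern matching the sd messages quoted in this thread.
LOG_RE = re.compile(
    r"\[(?P<dev>sd\w+)\] Unaligned block number requested: "
    r"sector_size=(?P<ssz>\d+), block=(?P<block>\d+), blk_rq=(?P<len>\d+)"
)

def check_line(line):
    """Return the start and length remainders for one sd log line.

    A remainder of 0 means that value is aligned to the device's
    logical sector size; returns None for lines that do not match.
    """
    m = LOG_RE.search(line)
    if m is None:
        return None
    per_block = int(m.group("ssz")) // KERNEL_SECTOR_BYTES  # 8 for 4096
    return {
        "dev": m.group("dev"),
        "start_rem": int(m.group("block")) % per_block,
        "len_rem": int(m.group("len")) % per_block,
    }

# The first two sd messages from the log in this mail.
lines = [
    "[ 1174.675229] sd 0:0:0:1: [sdb] Unaligned block number requested: "
    "sector_size=4096, block=15724561783, blk_rq=41",
    "[ 1174.676077] sd 0:0:0:2: [sdc] Unaligned block number requested: "
    "sector_size=4096, block=353041024, blk_rq=23",
]

for line in lines:
    print(check_line(line))
```

Under that assumption, the sdb request starts 7 sectors past a 4K boundary with a length remainder of 1, while the sdc request starts on a 4K boundary but has a length remainder of 7. If the units are something other than 512-byte sectors, these remainders do not apply.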