On 2018/5/9 3:59 PM, Kent Overstreet wrote:
> Have you checked extent merging?
>

Hi Kent,

Not yet. Let me look into it. Thanks for the hint.

Coly Li

> On Wed, May 9, 2018 at 3:36 AM, Coly Li <colyli@xxxxxxx> wrote:
>> On 2018/5/9 12:57 AM, Eric Wheeler wrote:
>>> On Tue, 8 May 2018, Coly Li wrote:
>>
>>
>>>
>>> Hi Coly,
>>>
>>> We did get traces over night, so hopefully these are useful. In summary,
>>> these are the ones that hit:
>>>
>>> check_4k_alignment() KEY_OFFSET(&w->key) is not 4KB aligned
>>> check_4k_alignment() KEY_OFFSET(l) + KEY_SIZE(r) is not 4KB aligned
>>> check_4k_alignment() KEY_START(k) is not 4KB aligned
>>>
>>> The whole dmesg output that we have is here: https://pastebin.com/nuYFi66K
>>>
>>> And some of the traces separated by error message are shown below. The
>>> ones below have a unique backtrace, but they may not cover all unique
>>> backtraces.
>>>
>>> ====================================================================
>>>
>>> Of those that hit, These are the ones that were accompanied by SCSI errors:
>>>
>>> [54947.892574] bcache: check_4k_alignment() KEY_OFFSET(&w->key) is not 4KB aligned: 15724561783
>>> [54947.893173] CPU: 5 PID: 1166 Comm: bcache_writebac Tainted: G O 4.1.49-5.el7.x86_64 #1
>>> [54947.893757] Hardware name: Supermicro X9SCL/X9SCM/X9SCL/X9SCM, BIOS 2.10 01/09/2014
>>> [54947.894323] 0000000000000286 8c136ca15cff4205 ffff8807ebea3d58 ffffffff816ff534
>>> [54947.894907] ffff88080a7b6aa0 ffff88080a7b0000 ffff8807ebea3d68 ffffffffa05beb63
>>> [54947.895515] ffff8807ebea3e08 ffffffffa05be174 00000003a93e4e90 ffff8807ef36c4c0
>>> [54947.896132] Call Trace:
>>> [54947.896705] [<ffffffff816ff534>] dump_stack+0x63/0x81
>>> [54947.897285] [<ffffffffa05beb63>] check_4k_alignment.part.9+0x24/0x26 [bcache]
>>> [54947.897853] [<ffffffffa05be174>] read_dirty+0x444/0x4a0 [bcache]
>>> [54947.898418] [<ffffffffa05be1d0>] ? read_dirty+0x4a0/0x4a0 [bcache]
>>> [54947.898980] [<ffffffffa05be5cc>] bch_writeback_thread+0x3fc/0x4e0 [bcache]
>>> [54947.899544] [<ffffffffa05be1d0>] ? read_dirty+0x4a0/0x4a0 [bcache]
>>> [54947.900121] [<ffffffff810c10d8>] kthread+0xd8/0xf0
>>> [54947.900673] [<ffffffff810c1000>] ? kthread_create_on_node+0x1b0/0x1b0
>>> [54947.901226] [<ffffffff817074d2>] ret_from_fork+0x42/0x70
>>> [54947.901783] [<ffffffff810c1000>] ? kthread_create_on_node+0x1b0/0x1b0
>>> [54947.902401] sd 0:0:0:2: [sdc] Unaligned block number requested: sector_size=4096, block=353041024, blk_rq=23
>>> [54947.903054] bcache: bch_count_io_errors() dm-6: IO error on reading dirty data from cache, recovering
>>> [54947.903874] sd 0:0:0:1: [sdb] Unaligned block number requested: sector_size=4096, block=15724561760, blk_rq=23
>>>
>>>
>>> [54958.301274] bcache: check_4k_alignment() KEY_OFFSET(&w->key) is not 4KB aligned: 15725385535
>>> [54958.301889] CPU: 2 PID: 1166 Comm: bcache_writebac Tainted: G O 4.1.49-5.el7.x86_64 #1
>>> [54958.302532] Hardware name: Supermicro X9SCL/X9SCM/X9SCL/X9SCM, BIOS 2.10 01/09/2014
>>> [54958.303144] 0000000000000286 8c136ca15cff4205 ffff8807ebea3d58 ffffffff816ff534
>>> [54958.303805] ffff88080a7b7dc0 ffff88080a7b0000 ffff8807ebea3d68 ffffffffa05beb63
>>> [54958.304423] ffff8807ebea3e08 ffffffffa05be174 00000003a949ec10 ffff8807ef36c4c0
>>> [54958.305080] Call Trace:
>>> [54958.305728] [<ffffffff816ff534>] dump_stack+0x63/0x81
>>> [54958.306371] [<ffffffffa05beb63>] check_4k_alignment.part.9+0x24/0x26 [bcache]
>>> [54958.307049] [<ffffffffa05be174>] read_dirty+0x444/0x4a0 [bcache]
>>> [54958.307694] [<ffffffffa05be1d0>] ? read_dirty+0x4a0/0x4a0 [bcache]
>>> [54958.308338] [<ffffffffa05be5cc>] bch_writeback_thread+0x3fc/0x4e0 [bcache]
>>> [54958.308986] [<ffffffffa05be1d0>] ? read_dirty+0x4a0/0x4a0 [bcache]
>>> [54958.309631] [<ffffffff810c10d8>] kthread+0xd8/0xf0
>>> [54958.310267] [<ffffffff810c1000>] ? kthread_create_on_node+0x1b0/0x1b0
>>> [54958.310914] [<ffffffff817074d2>] ret_from_fork+0x42/0x70
>>> [54958.311533] [<ffffffff810c1000>] ? kthread_create_on_node+0x1b0/0x1b0
>>> [54958.312265] sd 0:0:0:2: [sdc] Unaligned block number requested: sector_size=4096, block=387084760, blk_rq=31
>>> [54958.313064] bcache: bch_count_io_errors() dm-6: IO error on reading dirty data from cache, recovering
>>> [54958.314154] sd 0:0:0:1: [sdb] Unaligned block number requested: sector_size=4096, block=15725385504, blk_rq=31
>>>
>>
>> Hi Eric,
>>
>> Wow, the above lines are very informative, thanks!
>> I will start to look into what happens here. And at the meantime I will
>> compose another patch which does extra LBA 4k alignment check in
>> make_request() entries, to make sure I don't miss anything.
>>
>> Coly Li
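
For reference, the numbers in the two quoted traces can be checked with a few lines of arithmetic. This is only a sketch under assumptions the thread does not state: that KEY_OFFSET, block and blk_rq are all counted in 512-byte sectors (so 4KB alignment means a multiple of 8), and that a bcache key's KEY_OFFSET marks the end of the extent, with KEY_START = KEY_OFFSET - KEY_SIZE. The names is_4k_aligned, analyze and TRACES are hypothetical helpers, not kernel symbols.

```python
SECTOR_BYTES = 512                        # assumption: all values are 512-byte sectors
SECTORS_PER_4K = 4096 // SECTOR_BYTES     # 8 sectors per 4KB block


def is_4k_aligned(sectors):
    """Return True if a sector count or sector address is a multiple of 4KB."""
    return sectors % SECTORS_PER_4K == 0


# Values copied from the quoted dmesg output:
# (KEY_OFFSET from the bcache message, block and blk_rq from the [sdb] message)
TRACES = [
    (15724561783, 15724561760, 23),
    (15725385535, 15725385504, 31),
]


def analyze(key_offset, block, nr_sectors):
    """Relate the bcache key end offset to the SCSI request start and length."""
    return {
        # Assumption: KEY_OFFSET is the end of the extent, so start + size == end.
        "end_matches_key_offset": block + nr_sectors == key_offset,
        "start_aligned": is_4k_aligned(block),
        "size_aligned": is_4k_aligned(nr_sectors),
        "end_aligned": is_4k_aligned(key_offset),
    }


if __name__ == "__main__":
    for trace in TRACES:
        print(trace, analyze(*trace))
```

Under those assumptions, in both traces block + blk_rq equals the reported KEY_OFFSET, the start sector is a multiple of 8, and the length (23 or 31 sectors) is not. If that reading is right, the misaligned quantity is the extent size rather than its start.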