On Wed, Nov 28, 2018 at 11:37:23AM +0800, chenxiang (M) wrote:
> Hi Lei Ming,
>
> On 2018/11/27 21:08, Ming Lei wrote:
> > On Tue, Nov 27, 2018 at 05:55:45PM +0800, chenxiang (M) wrote:
> > > Hi all,
> > >
> > > There is an issue which may be related to CONFIG_SCSI_MQ_DEFAULT: previously we
> > > developed the DIF/DIX feature on kernel 4.18 (with CONFIG_SCSI_MQ_DEFAULT
> > > disabled by default), and it works well.
> > I guess you are testing hisi_sas_v3_hw, does 4.18 work with
> > 'scsi_mod.use_blk_mq=Y'? If yes, you may run 'git bisect' to figure out
> > which commit is the 1st bad one.
> >
> > > But when we switch to kernel 4.19-rc1 and 4.20-rc1, the call trace
> > > below occurs when running fio; if we disable CONFIG_SCSI_MQ_DEFAULT,
> > > then it works well. Also, if we switch ioengine=libaio to ioengine=psync,
> > > it also seems to work well. Do you have any idea, or have you
> > > encountered a similar issue?
> > I tested scsi-debug via 'dix=1 dif=1', looks everything is fine, are you
> > using direct io or not?
> >
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > fio 2.0.5
> > > Starting 12 processes
> > > [ 629.210506] Unable to handle kernel paging request at virtual address 0000ffff8027e048
> > > [ 629.226373] Mem abort info:
> > > [ 629.231952] ESR = 0x96000006
> > > [ 629.238052] Exception class = DABT (current EL), IL = 32 bits
> > > [ 629.249898] SET = 0, FnV = 0
> > > [ 629.255998] EA = 0, S1PTW = 0
> > > [ 629.262272] Data abort info:
> > > [ 629.268023] ISV = 0, ISS = 0x00000006
> > > [ 629.275690] CM = 0, WnR = 0
> > > [ 629.281617] user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000085c91728
> > > [ 629.294857] [0000ffff8027e048] pgd=00000027a8644003, pud=00000027a85ea003, pmd=0000000000000000
> > > [ 629.312278] Internal error: Oops: 96000006 [#1] PREEMPT SMP
> > > [ 629.323427] Modules linked in: hisi_sas_v3_hw [last unloaded: hisi_sas_v3_hw]
> > > [ 629.337713] CPU: 13 PID: 4465 Comm: fio Not tainted 4.20.0-rc1-15093-ge876dec #1067
> > > [ 629.353040] Hardware name: Huawei D06/D06, BIOS Hisilicon D06 UEFI RC0 - B601 (V6.01) 11/08/2018
> > > [ 629.370633] pstate: 80400009 (Nzcv daif +PAN -UAO)
> > > [ 629.380218] pc : deadline_remove_request+0x2c/0xd0
> > Could you use gdb to find where 'deadline_remove_request+0x2c' points
> > to?
>
> From objdump, 'deadline_remove_request+0x2c' is in the function __list_del
> -> INIT_LIST_HEAD.

You may enable the 'Kernel hacking/Debug linked list manipulation' config
option (CONFIG_DEBUG_LIST) and see what the dumped log is.

Also, it might be related to the following recent report:

https://marc.info/?l=linux-scsi&m=154283686812846&w=2

Thanks,
Ming
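
For anyone reading the quoted trace, the ESR value can be decoded by hand. Below is a minimal Python sketch, assuming the ARMv8-A ESR_EL1 layout for data aborts (EC in bits 31:26, IL in bit 25, ISV in ISS bit 24, WnR in ISS bit 6, fault status code in ISS bits 5:0). The name decode_esr is hypothetical and is not a kernel or library API.

```python
def decode_esr(esr: int) -> dict:
    """Split an ARMv8-A ESR_EL1 value into a few data-abort fields.

    Assumed layout (from the Arm architecture documentation):
      EC  = bits 31:26 (exception class)
      IL  = bit 25     (1 means a 32-bit instruction)
      ISS = bits 24:0  (instruction specific syndrome)
    For data aborts, inside ISS:
      ISV  = bit 24, WnR = bit 6 (1 means write), DFSC = bits 5:0
    """
    iss = esr & 0x1FFFFFF
    return {
        "ec": (esr >> 26) & 0x3F,
        "il": (esr >> 25) & 0x1,
        "iss": iss,
        "isv": (iss >> 24) & 0x1,
        "wnr": (iss >> 6) & 0x1,
        "dfsc": iss & 0x3F,
    }


if __name__ == "__main__":
    # ESR value taken from the oops quoted in this thread.
    for name, value in decode_esr(0x96000006).items():
        print(f"{name} = {value:#x}")
```

For 0x96000006 this yields EC 0x25 (data abort taken at the current exception level), WnR 0 (a read) and fault status 0x06 (level 2 translation fault), which agrees with the 'DABT (current EL)', 'WnR = 0' and 'pmd=0000000000000000' lines in the log.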