Re: DIF/DIX issue related to config CONFIG_SCSI_MQ_DEFAULT

On Wed, Nov 28, 2018 at 11:37:23AM +0800, chenxiang (M) wrote:
> Hi Lei Ming,
> 
> On 2018/11/27 21:08, Ming Lei wrote:
> > On Tue, Nov 27, 2018 at 05:55:45PM +0800, chenxiang (M) wrote:
> > > Hi all,
> > > 
> > > There is an issue which may be related to CONFIG_SCSI_MQ_DEFAULT: we
> > > previously developed the DIF/DIX feature on kernel 4.18 (with
> > > CONFIG_SCSI_MQ_DEFAULT disabled by default), and it worked well.
> > I guess you are testing hisi_sas_v3_hw. Does 4.18 work with
> > 'scsi_mod.use_blk_mq=Y'? If yes, you may run 'git bisect' to figure out
> > which commit is the first bad one.
> > 
> > > But when we switch to kernel 4.19-rc1 and 4.20-rc1, the call trace below
> > > occurs when running fio. If we disable CONFIG_SCSI_MQ_DEFAULT, it works
> > > well. It also seems to work well if we switch from ioengine=libaio to
> > > ioengine=psync. Do you have any idea, or have you encountered a similar
> > > issue?
> > I tested scsi-debug with 'dix=1 dif=1' and everything looks fine. Are you
> > using direct I/O or not?
> > 
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > job1: (g=0): rw=rw, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
> > > fio 2.0.5
> > > Starting 12 processes
> > > [  629.210506] Unable to handle kernel paging request at virtual address
> > > 0000ffff8027e048
> > > [  629.226373] Mem abort info:
> > > [  629.231952]   ESR = 0x96000006
> > > [  629.238052]   Exception class = DABT (current EL), IL = 32 bits
> > > [  629.249898]   SET = 0, FnV = 0
> > > [  629.255998]   EA = 0, S1PTW = 0
> > > [  629.262272] Data abort info:
> > > [  629.268023]   ISV = 0, ISS = 0x00000006
> > > [  629.275690]   CM = 0, WnR = 0
> > > [  629.281617] user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000085c91728
> > > [  629.294857] [0000ffff8027e048] pgd=00000027a8644003,
> > > pud=00000027a85ea003, pmd=0000000000000000
> > > [  629.312278] Internal error: Oops: 96000006 [#1] PREEMPT SMP
> > > [  629.323427] Modules linked in: hisi_sas_v3_hw [last unloaded:
> > > hisi_sas_v3_hw]
> > > [  629.337713] CPU: 13 PID: 4465 Comm: fio Not tainted
> > > 4.20.0-rc1-15093-ge876dec #1067
> > > [  629.353040] Hardware name: Huawei D06/D06, BIOS Hisilicon D06 UEFI RC0 -
> > > B601 (V6.01) 11/08/2018
> > > [  629.370633] pstate: 80400009 (Nzcv daif +PAN -UAO)
> > > [  629.380218] pc : deadline_remove_request+0x2c/0xd0
> > Could you use gdb to find where 'deadline_remove_request+0x2c' points
> > to?
> 
> According to objdump, 'deadline_remove_request+0x2c' falls in the function
> __list_del -> INIT_LIST_HEAD.

You may enable the 'Kernel hacking/Debug linked list manipulation' config
option (CONFIG_DEBUG_LIST) and see what the dumped log shows.

It might also be related to the following recent report:

https://marc.info/?l=linux-scsi&m=154283686812846&w=2

Thanks,
Ming


