RE: IO error on DIF/DIX supported array

Hi Martin,
Any inputs on this one?

Thanks,
~Saurav

> -----Original Message-----
> From: Saurav Kashyap
> Sent: Tuesday, February 7, 2023 4:50 PM
> To: Martin K. Petersen <martin.petersen@xxxxxxxxxx>
> Cc: linux-scsi <linux-scsi@xxxxxxxxxxxxxxx>; Girish Basrur
> <gbasrur@xxxxxxxxxxx>
> Subject: IO error on DIF/DIX supported array
> 
> Hi Martin,
> We have observed I/O failures on a 3PAR array that supports DIF/DIX with
> upstream code. The error is seen only when I/O is done on DM devices; no
> error is observed when I/O is done directly on /dev/sdX.
> I added some prints to understand the problem and found that the
> SCSI_PROT_IP_CHECKSUM flag is not set in scmnd->prot_flags. It should be
> set, because BIP_IP_CHECKSUM should be set on the bio.
> 
> --------------------<START: IO to /dev/sdc>----------------
> [Mon Feb 6 17:54:56 2023] SK: bio_integrity_prep setting IP_CHECKSUM
> bio=ffff976f8d19c300 bip_flags=0x11
> [Mon Feb 6 17:54:56 2023] SK: sd_setup_protect_cmnd setting
> IP_CHECKSUM bio=ffff976f8d19c300 bip_flags=0x11
> [Mon Feb 6 17:54:56 2023] SK: bio_integrity_prep setting IP_CHECKSUM
> bio=ffff976f8d19c300 bip_flags=0x11
> [Mon Feb 6 17:54:56 2023] SK: sd_setup_protect_cmnd setting
> IP_CHECKSUM bio=ffff976f8d19c300 bip_flags=0x11
> -------------------<END: IO to /dev/sdc>-----------------
> 
> ----------------<START: IO to dm-10>---------------------
> [Mon Feb 6 17:55:13 2023] SK: bio_integrity_prep setting IP_CHECKSUM
> bio=ffff976f8d19c300 bip_flags=0x11
> [Mon Feb 6 17:55:13 2023] SK: sd_setup_protect_cmnd else IP_CHECKSUM
> bio=ffff976fa15fa490 bip_flags=0x0
> [Mon Feb 6 17:55:13 2023] dm-10: guard tag error at sector 0 (rcvd 0000, want
> ffff)
> [Mon Feb 6 17:55:13 2023] SK: bio_integrity_prep setting IP_CHECKSUM
> bio=ffff978f0752c180 bip_flags=0x11
> [Mon Feb 6 17:55:13 2023] SK: sd_setup_protect_cmnd else IP_CHECKSUM
> bio=ffff976fc87fef10 bip_flags=0x0
> [Mon Feb 6 17:55:13 2023] dm-10: guard tag error at sector 0 (rcvd 0000, want
> ffff)
> [Mon Feb 6 17:55:13 2023] Buffer I/O error on dev dm-10, logical block 0,
> async page read
> -----------------<END: IO to dm-10>------------------------
> 
> I noticed that the bio pointer changes when I/O is done through a dm device.
> I added more debug prints in bio_clone and bio_integrity_clone and
> concluded that bip_flags is not copied in the bio_integrity_clone
> routine.
> 
> --------------------
> [Tue Feb  7 14:15:47 2023] SK: bio_integrity_prep setting IP_CHECKSUM
> bio=ffff891ecc5fa840 bip_flags=0x11
> [Tue Feb  7 14:15:47 2023] SK: __bio_clone: bio=ffff891ed97b5990
> bio_src=ffff891ecc5fa840
> [Tue Feb  7 14:15:47 2023] SK: bio_integrity_clone: bip=ffff891ecc5fd500
> bip_src=ffff891ecc5fcb40 bip_flags=0x0 src_bip_flags=0x11
> [Tue Feb  7 14:15:47 2023] SK: sd_setup_protect_cmnd else IP_CHECKSUM
> bio=ffff891ed97b5990 bip_flags=0x0
> [Tue Feb  7 14:15:47 2023] dm-3: guard tag error at sector 0 (rcvd 0000, want
> ffff)
> [Tue Feb  7 14:15:47 2023] Buffer I/O error on dev dm-3, logical block 0, async
> page read
> ----------------------------------
> 
> If I add the change below to copy the flags, the following BUG_ON in
> mm/slub.c is reported:
> ------------------<code>-------------
> diff --git a/block/bio-integrity.c b/block/bio-integrity.c
> index 3f5685c00e36..07e7443c7be3 100644
> --- a/block/bio-integrity.c
> +++ b/block/bio-integrity.c
> @@ -418,6 +418,7 @@ int bio_integrity_clone(struct bio *bio, struct bio
> *bio_src,
> 
>         bip->bip_vcnt = bip_src->bip_vcnt;
>         bip->bip_iter = bip_src->bip_iter;
> +       bip->bip_flags = bip_src->bip_flags;
> 
>         return 0;
>  }
> ----------------<code>---------------
> 
> ------------------<BUG_ON>--------------
> [  751.838432] kernel BUG at mm/slub.c:435!
> [  751.838440] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [  751.838443] CPU: 49 PID: 981 Comm: kworker/49:1H Kdump: loaded Not
> tainted 6.2.0-rc1+ #14
> [  751.838447] Hardware name: Dell Inc. PowerEdge R7525/0590KW, BIOS
> 2.5.6 10/06/2021
> [  751.838448] Workqueue: kintegrityd bio_integrity_verify_fn
> [  751.838458] RIP: 0010:__slab_free+0x1ae/0x300
> [  751.838467] Code: 4c 89 e6 48 89 ef 5d 41 5c 41 5d 41 5e 41 5f e9 d8 fb ff ff
> 48 83 c4 60 4c 89 f7 5b 5d 41 5c 41 5d 41 5e 41 5f e9 62 3b 00 00 <0f> 0b 80 4c 24
> 4b 80 e9 ea fe ff ff 4c 89 fa 4d 89 d7 4c 8b 54 24
> [  751.838469] RSP: 0018:ffffbb674fcf7dd0 EFLAGS: 00010246
> [  751.838472] RAX: ffff9c320d3546e0 RBX: ffff9c325302e480 RCX:
> 000000008040003f
> [  751.838473] RDX: ffffffc10e1546c0 RSI: ffffdfb30434d500 RDI:
> ffff9c3200042500
> [  751.838475] RBP: ffff9c3200042500 R08: 0000000000000001 R09:
> ffffffffb4fbf08a
> [  751.838476] R10: ffffbb674fcf7ca0 R11: ffffffffb65e4ac8 R12:
> ffffdfb30434d500
> [  751.838477] R13: ffff9c320d3546c0 R14: ffff9c320d3546c0 R15:
> ffff9c320d3546c0
> [  751.838479] FS:  0000000000000000(0000) GS:ffff9c70ff840000(0000)
> knlGS:0000000000000000
> [  751.838481] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  751.838482] CR2: 00007fe84efedb00 CR3: 000000015472a000 CR4:
> 0000000000350ee0
> [  751.838484] Call Trace:
> [  751.838485]  <TASK>
> [  751.838487]  ? bio_integrity_process+0x14f/0x1c0
> [  751.838494]  ? __pfx_t10_pi_type1_verify_ip+0x10/0x10 [t10_pi]
> [  751.838501]  bio_integrity_free+0xaa/0xb0
> [  751.838504]  bio_integrity_verify_fn+0x40/0x50
> [  751.838507]  process_one_work+0x1e5/0x3b0
> [  751.838513]  ? __pfx_worker_thread+0x10/0x10
> [  751.838515]  worker_thread+0x50/0x3a0
> [  751.838518]  ? __pfx_worker_thread+0x10/0x10
> [  751.838520]  kthread+0xd9/0x100
> [  751.838525]  ? __pfx_kthread+0x10/0x10
> [  751.838528]  ret_from_fork+0x2c/0x50
> [  751.838535]  </TASK>
> ----------------------<BUG_ON>---------------
> 
> Queries:
> 1) Is there a specific reason for not copying bip_flags in the
> bio_integrity_clone function?
> 2) If bip_flags needs to be copied, is there something else that needs to
> be done to avoid the BUG_ON?
> 3) If not, what is the right way to fix the I/O error caused by the
> SCSI_PROT_IP_CHECKSUM flag not being set?
> 
> 
> Thanks,
> ~Saurav




