Hi Martin,

We have observed IO failures on a 3PAR array that supports DIF/DIX with
upstream code. The error is seen only when IO is done on DM devices; no
error is observed when IO is done on /dev/sdX.

I added some prints to understand the problem and found that the
SCSI_PROT_IP_CHECKSUM flag is not set in scmnd->prot_flags. Ideally it
should be set, since BIP_IP_CHECKSUM should be set.

--------------------<START: IO to /dev/sdc>----------------
[Mon Feb 6 17:54:56 2023] SK: bio_integrity_prep setting IP_CHECKSUM bio=ffff976f8d19c300 bip_flags=0x11
[Mon Feb 6 17:54:56 2023] SK: sd_setup_protect_cmnd setting IP_CHECKSUM bio=ffff976f8d19c300 bip_flags=0x11
[Mon Feb 6 17:54:56 2023] SK: bio_integrity_prep setting IP_CHECKSUM bio=ffff976f8d19c300 bip_flags=0x11
[Mon Feb 6 17:54:56 2023] SK: sd_setup_protect_cmnd setting IP_CHECKSUM bio=ffff976f8d19c300 bip_flags=0x11
-------------------<END: IO to /dev/sdc>-----------------

----------------<START: IO to dm-10>---------------------
[Mon Feb 6 17:55:13 2023] SK: bio_integrity_prep setting IP_CHECKSUM bio=ffff976f8d19c300 bip_flags=0x11
[Mon Feb 6 17:55:13 2023] SK: sd_setup_protect_cmnd else IP_CHECKSUM bio=ffff976fa15fa490 bip_flags=0x0
[Mon Feb 6 17:55:13 2023] dm-10: guard tag error at sector 0 (rcvd 0000, want ffff)
[Mon Feb 6 17:55:13 2023] SK: bio_integrity_prep setting IP_CHECKSUM bio=ffff978f0752c180 bip_flags=0x11
[Mon Feb 6 17:55:13 2023] SK: sd_setup_protect_cmnd else IP_CHECKSUM bio=ffff976fc87fef10 bip_flags=0x0
[Mon Feb 6 17:55:13 2023] dm-10: guard tag error at sector 0 (rcvd 0000, want ffff)
[Mon Feb 6 17:55:13 2023] Buffer I/O error on dev dm-10, logical block 0, async page read
-----------------<END: IO to dm-10>------------------------

I noticed that the bio pointer changes when IO is done through a DM
device. I added more debug prints in bio_clone and bio_integrity_clone
and concluded that bip_flags are not copied in the bio_integrity_clone
routine.
--------------------
[Tue Feb 7 14:15:47 2023] SK: bio_integrity_prep setting IP_CHECKSUM bio=ffff891ecc5fa840 bip_flags=0x11
[Tue Feb 7 14:15:47 2023] SK: __bio_clone: bio=ffff891ed97b5990 bio_src=ffff891ecc5fa840
[Tue Feb 7 14:15:47 2023] SK: bio_integrity_clone: bip=ffff891ecc5fd500 bip_src=ffff891ecc5fcb40 bip_flags=0x0 src_bip_flags=0x11
[Tue Feb 7 14:15:47 2023] SK: sd_setup_protect_cmnd else IP_CHECKSUM bio=ffff891ed97b5990 bip_flags=0x0
[Tue Feb 7 14:15:47 2023] dm-3: guard tag error at sector 0 (rcvd 0000, want ffff)
[Tue Feb 7 14:15:47 2023] Buffer I/O error on dev dm-3, logical block 0, async page read
----------------------------------

If I add the change below to copy the flags, the following BUG_ON in
slub.c is reported.

------------------<code>-------------
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 3f5685c00e36..07e7443c7be3 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -418,6 +418,7 @@ int bio_integrity_clone(struct bio *bio, struct bio *bio_src,
 	bip->bip_vcnt = bip_src->bip_vcnt;
 	bip->bip_iter = bip_src->bip_iter;
+	bip->bip_flags = bip_src->bip_flags;
 	return 0;
 }
----------------<code>---------------

------------------<BUG_ON>--------------
[ 751.838432] kernel BUG at mm/slub.c:435!
[ 751.838440] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 751.838443] CPU: 49 PID: 981 Comm: kworker/49:1H Kdump: loaded Not tainted 6.2.0-rc1+ #14
[ 751.838447] Hardware name: Dell Inc. PowerEdge R7525/0590KW, BIOS 2.5.6 10/06/2021
[ 751.838448] Workqueue: kintegrityd bio_integrity_verify_fn
[ 751.838458] RIP: 0010:__slab_free+0x1ae/0x300
[ 751.838467] Code: 4c 89 e6 48 89 ef 5d 41 5c 41 5d 41 5e 41 5f e9 d8 fb ff ff 48 83 c4 60 4c 89 f7 5b 5d 41 5c 41 5d 41 5e 41 5f e9 62 3b 00 00 <0f> 0b 80 4c 24 4b 80 e9 ea fe ff ff 4c 89 fa 4d 89 d7 4c 8b 54 24
[ 751.838469] RSP: 0018:ffffbb674fcf7dd0 EFLAGS: 00010246
[ 751.838472] RAX: ffff9c320d3546e0 RBX: ffff9c325302e480 RCX: 000000008040003f
[ 751.838473] RDX: ffffffc10e1546c0 RSI: ffffdfb30434d500 RDI: ffff9c3200042500
[ 751.838475] RBP: ffff9c3200042500 R08: 0000000000000001 R09: ffffffffb4fbf08a
[ 751.838476] R10: ffffbb674fcf7ca0 R11: ffffffffb65e4ac8 R12: ffffdfb30434d500
[ 751.838477] R13: ffff9c320d3546c0 R14: ffff9c320d3546c0 R15: ffff9c320d3546c0
[ 751.838479] FS: 0000000000000000(0000) GS:ffff9c70ff840000(0000) knlGS:0000000000000000
[ 751.838481] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 751.838482] CR2: 00007fe84efedb00 CR3: 000000015472a000 CR4: 0000000000350ee0
[ 751.838484] Call Trace:
[ 751.838485] <TASK>
[ 751.838487] ? bio_integrity_process+0x14f/0x1c0
[ 751.838494] ? __pfx_t10_pi_type1_verify_ip+0x10/0x10 [t10_pi]
[ 751.838501] bio_integrity_free+0xaa/0xb0
[ 751.838504] bio_integrity_verify_fn+0x40/0x50
[ 751.838507] process_one_work+0x1e5/0x3b0
[ 751.838513] ? __pfx_worker_thread+0x10/0x10
[ 751.838515] worker_thread+0x50/0x3a0
[ 751.838518] ? __pfx_worker_thread+0x10/0x10
[ 751.838520] kthread+0xd9/0x100
[ 751.838525] ? __pfx_kthread+0x10/0x10
[ 751.838528] ret_from_fork+0x2c/0x50
[ 751.838535] </TASK>
----------------------<BUG_ON>---------------

Queries:
1) Is there a specific reason for not copying bip_flags in the
   bio_integrity_clone function?
2) If bip_flags needs to be copied, is there something else that needs
   to be done to avoid the BUG_ON?
3) If not, what would be the right fix for the IO error caused by the
   SCSI_PROT_IP_CHECKSUM flag not being set?

Thanks,
~Saurav