I recently found what I believe is a bug, and I'd appreciate feedback on if that is correct, and if so how to proceed. BACKGROUND Recently Christoph Hellwig sent an email to driver maintainers for drivers that set ".use_clustering" to DISABLE_CLUSTERING in their SCSI Host templates, asking if the setting could be changed to ENABLE_CLUSTERING. As part of answering that question, I set ENABLE_CLUSTERING in drivers/scsi/iscsi_tcp.c and tested both the iscsi initiator and target. As a reminder, setting ENABLE_CLUSTERING means that adjacent bio-s can be merged. This can make IO faster, but it means that drivers must be able to deal with IOs that cross page boundaries, since bio merges can create such IOs. RESULTS The iscsi initiator code can handle ENABLE_CLUSTERING just fine, but the iscsi target code fails. It seems to assume that IOs do *NOT* cross a page boundary. The problem is in iscsi lib/iov_iter.c, in the functions copy_page_to_iter() and page_copy_sane() (see below for how to reproduce): >> static inline bool page_copy_sane(struct page *page, size_t offset, size_t n) >> { >> struct page *head = compound_head(page); >> size_t v = n + offset + page_address(page) - page_address(head); >> >> if (likely(n <= v && v <= (PAGE_SIZE << compound_order(head)))) >> return true; >> WARN_ON(1); >> return false; >> } >> >> size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes, >> struct iov_iter *i) >> { >> if (unlikely(!page_copy_sane(page, offset, bytes))) >> return 0; >> if (i->type & (ITER_BVEC|ITER_KVEC)) { >> void *kaddr = kmap_atomic(page); >> size_t wanted = copy_to_iter(kaddr + offset, bytes, i); >> kunmap_atomic(kaddr); >> return wanted; >> } else if (unlikely(iov_iter_is_discard(i))) >> return bytes; >> else if (likely(!iov_iter_is_pipe(i))) >> return copy_page_to_iter_iovec(page, offset, bytes, i); >> else >> return copy_page_to_iter_pipe(page, offset, bytes, i); >> } Causing the following WARN_ON stack trace (repeatedly): >> ... >> [ 78.644559] WARNING: CPU: 0 PID: 2192 at lib/iov_iter.c:830 copy_page_to_iter+0x1a6/0x2e0 >> [ 78.644561] Modules linked in: iscsi_tcp(E) libiscsi_tcp(E) libiscsi(E) scsi_transport_iscsi(E) rfcomm(E) iscsi_target_mod(E) target_core_pscsi(E) target_core_file(E) target_core_iblock(E) target_core_user(E) uio(E) target_core_mod(E) configfs(E) af_packet(E) iscsi_ibft(E) iscsi_boot_sysfs(E) vmw_vsock_vmci_transport(E) vsock(E) bnep(E) fuse(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) xfs(E) aesni_intel(E) snd_ens1371(E) aes_x86_64(E) snd_ac97_codec(E) crypto_simd(E) cryptd(E) ac97_bus(E) glue_helper(E) snd_rawmidi(E) vmw_balloon(E) snd_seq_device(E) snd_pcm(E) pcspkr(E) snd_timer(E) snd(E) uvcvideo(E) btusb(E) videobuf2_vmalloc(E) btrtl(E) videobuf2_memops(E) videobuf2_v4l2(E) btbcm(E) btintel(E) videodev(E) bluetooth(E) videobuf2_common(E) vmw_vmci(E) ecdh_generic(E) rfkill(E) soundcore(E) mptctl(E) gameport(E) joydev(E) i2c_piix4(E) e1000(E) ac(E) button(E) btrfs(E) libcrc32c(E) xor(E) raid6_pq(E) hid_generic(E) usbhid(E) sr_mod(E) cdrom(E) ata_generic(E) >> [ 78.644583] crc32c_intel(E) serio_raw(E) mptspi(E) scsi_transport_spi(E) mptscsih(E) ata_piix(E) uhci_hcd(E) ehci_pci(E) ehci_hcd(E) vmwgfx(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) usbcore(E) ttm(E) mptbase(E) drm(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) >> [ 78.644593] CPU: 0 PID: 2192 Comm: iscsi_trx Tainted: G E 4.20.0-rc4-1-default+ #1 openSUSE Tumbleweed (unreleased) >> [ 78.644593] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017 >> [ 78.644595] RIP: 0010:copy_page_to_iter+0x1a6/0x2e0 >> [ 78.644596] Code: 0c 24 48 98 49 89 ce 48 89 c2 49 29 c6 48 29 ca 4d 01 f4 48 01 d5 48 85 c0 0f 85 71 ff ff ff 48 85 ed 0f 84 68 ff ff ff eb b2 <0f> 0b 45 31 ed eb 8d bf 01 00 00 00 e8 d9 1e c5 ff 65 4c 8b 34 25 >> [ 78.644596] RSP: 0018:ffffa04b41eefbe0 EFLAGS: 00010206 >> [ 78.644597] RAX: fffff7acc40de180 RBX: ffffa04b41eefd80 RCX: 002ffff80000a014 >> [ 78.644598] RDX: 0000000000002000 RSI: 0000000000001000 RDI: fffff7acc0000000 >> [ 78.644598] RBP: fffff7acc40de180 R08: ffff8f1703786000 R09: 0000000000002000 >> [ 78.644599] R10: 00000000000002c0 R11: ffff8f16b3f31400 R12: 0000000000000000 >> [ 78.644600] R13: 0000000000002000 R14: 0000000000000008 R15: 0000000000003800 >> [ 78.644601] FS: 0000000000000000(0000) GS:ffff8f1739c00000(0000) knlGS:0000000000000000 >> [ 78.644601] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> [ 78.644602] CR2: 00007f0f54772000 CR3: 000000012e670004 CR4: 00000000003606f0 >> [ 78.644624] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >> [ 78.644625] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >> [ 78.644625] Call Trace: >> [ 78.644631] skb_copy_datagram_iter+0x16f/0x270 >> [ 78.644633] tcp_recvmsg+0x223/0xc30 >> [ 78.644637] inet_recvmsg+0x4b/0xe0 >> [ 78.644644] iscsit_do_rx_data.constprop.9+0x89/0x110 [iscsi_target_mod] >> [ 78.644650] rx_data+0x58/0x70 [iscsi_target_mod] >> [ 78.644654] iscsit_get_rx_pdu+0xa7b/0xd20 [iscsi_target_mod] >> [ 78.644657] ? preempt_count_sub+0x43/0x50 >> [ 78.644659] ? _raw_spin_unlock_irq+0x22/0x40 >> [ 78.644663] ? iscsi_target_tx_thread+0x1d0/0x1d0 [iscsi_target_mod] >> [ 78.644667] ? iscsi_target_rx_thread+0x71/0xd0 [iscsi_target_mod] >> [ 78.644670] iscsi_target_rx_thread+0x71/0xd0 [iscsi_target_mod] >> [ 78.644672] kthread+0x116/0x130 >> [ 78.644673] ? kthread_create_worker_on_cpu+0x40/0x40 >> [ 78.644675] ret_from_fork+0x24/0x50 >> [ 78.644678] ---[ end trace a0d6d5ab0e8625ee ]--- >> [ 78.644725] connection1:0: detected conn error (1020) >> ... >> [ 108.859031] connection1:0: detected conn error (1020) >> [ 108.859104] sd 3:0:0:0: [sdc] tag#0 FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK >> [ 108.859106] sd 3:0:0:0: [sdc] tag#0 CDB: Write(10) 2a 00 00 00 11 a2 00 00 50 00 >> [ 108.859107] print_req_error: I/O error, dev sdc, sector 4514 >> ... TO REPRODUCE: I used openSUSE Tumbleweed using a 4.20 kernel, but any recent kernel should display this issue. I used the same system for both the iSCSI initiator (using open-iscsi) and the iscsi target (using targetcli-fb). [I believe you _could_ use a different system for the initiator, but I did not test this.] I set up a file-based iSCSI target of size 5G using the targetcli shell, with an ACL allowing the local iscsi initiator to connect. The backing store file's contents do not matter, but I initialized my file to all zeros. I then connect from the same system using "iscsiadm -m discovery $HOST" to discover the new target, then "iscsiadm -m node -l" to login to the discovered target. [Any connection strategy should work, though.] This should create a new disc device. In my case, it was "/dev/sdc", since I already have two hard discs. I then do writes to the new device using "dd". There must be enough writes, so I wrote 100 64k blocks of zeros: > dd if=/dev/zero of=/dev/sdc bs=64k count=100 Observe "dmesg --follow" output from the kernel to see the errors and stack traces. -- Lee Duncan