BUG in copy_page_to_iter() when iscsi sets ENABLE_CLUSTERING

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I recently found what I believe is a bug, and I'd appreciate feedback
on if that is correct, and if so how to proceed.

BACKGROUND

Recently Christoph Hellwig sent an email to driver maintainers for
drivers that set ".use_clustering" to DISABLE_CLUSTERING in their SCSI
Host templates, asking if the setting could be changed to
ENABLE_CLUSTERING.

As part of answering that question, I set ENABLE_CLUSTERING in
drivers/scsi/iscsi_tcp.c and tested both the iscsi initiator and
target.

As a reminder, setting ENABLE_CLUSTERING means that adjacent bio-s can
be merged. This can make IO faster, but it means that drivers must be
able to deal with IOs that cross page boundaries, since bio merges can
create such IOs.

RESULTS

The iscsi initiator code can handle ENABLE_CLUSTERING just fine, but
the iscsi target code fails. It seems to assume that IOs do *NOT*
cross a page boundary.

The problem is in iscsi lib/iov_iter.c, in the functions
copy_page_to_iter() and page_copy_sane() (see below for how to
reproduce):

>> static inline bool page_copy_sane(struct page *page, size_t offset, size_t n)
>> {
>>         struct page *head = compound_head(page);
>>         size_t v = n + offset + page_address(page) - page_address(head);
>> 
>>         if (likely(n <= v && v <= (PAGE_SIZE << compound_order(head))))
>>                 return true;
>>         WARN_ON(1);
>>         return false;
>> }
>> 
>> size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
>>                          struct iov_iter *i)
>> {
>>         if (unlikely(!page_copy_sane(page, offset, bytes)))
>>                 return 0;
>>         if (i->type & (ITER_BVEC|ITER_KVEC)) {
>>                 void *kaddr = kmap_atomic(page);
>>                 size_t wanted = copy_to_iter(kaddr + offset, bytes, i);
>>                 kunmap_atomic(kaddr);
>>                 return wanted;
>>         } else if (unlikely(iov_iter_is_discard(i)))
>>                 return bytes;
>>         else if (likely(!iov_iter_is_pipe(i)))
>>                 return copy_page_to_iter_iovec(page, offset, bytes, i);
>>         else
>>                 return copy_page_to_iter_pipe(page, offset, bytes, i);
>> }

Causing the following WARN_ON stack trace (repeatedly):

>> ...
>> [   78.644559] WARNING: CPU: 0 PID: 2192 at lib/iov_iter.c:830 copy_page_to_iter+0x1a6/0x2e0
>> [   78.644561] Modules linked in: iscsi_tcp(E) libiscsi_tcp(E) libiscsi(E) scsi_transport_iscsi(E) rfcomm(E) iscsi_target_mod(E) target_core_pscsi(E) target_core_file(E) target_core_iblock(E) target_core_user(E) uio(E) target_core_mod(E) configfs(E) af_packet(E) iscsi_ibft(E) iscsi_boot_sysfs(E) vmw_vsock_vmci_transport(E) vsock(E) bnep(E) fuse(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) xfs(E) aesni_intel(E) snd_ens1371(E) aes_x86_64(E) snd_ac97_codec(E) crypto_simd(E) cryptd(E) ac97_bus(E) glue_helper(E) snd_rawmidi(E) vmw_balloon(E) snd_seq_device(E) snd_pcm(E) pcspkr(E) snd_timer(E) snd(E) uvcvideo(E) btusb(E) videobuf2_vmalloc(E) btrtl(E) videobuf2_memops(E) videobuf2_v4l2(E) btbcm(E) btintel(E) videodev(E) bluetooth(E) videobuf2_common(E) vmw_vmci(E) ecdh_generic(E) rfkill(E) soundcore(E) mptctl(E) gameport(E) joydev(E) i2c_piix4(E) e1000(E) ac(E) button(E) btrfs(E) libcrc32c(E) xor(E) raid6_pq(E) hid_generic(E) usbhid(E) sr_mod(E) cdrom(E) ata_generic(E)
>> [   78.644583]  crc32c_intel(E) serio_raw(E) mptspi(E) scsi_transport_spi(E) mptscsih(E) ata_piix(E) uhci_hcd(E) ehci_pci(E) ehci_hcd(E) vmwgfx(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) usbcore(E) ttm(E) mptbase(E) drm(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E)
>> [   78.644593] CPU: 0 PID: 2192 Comm: iscsi_trx Tainted: G            E     4.20.0-rc4-1-default+ #1 openSUSE Tumbleweed (unreleased)
>> [   78.644593] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
>> [   78.644595] RIP: 0010:copy_page_to_iter+0x1a6/0x2e0
>> [   78.644596] Code: 0c 24 48 98 49 89 ce 48 89 c2 49 29 c6 48 29 ca 4d 01 f4 48 01 d5 48 85 c0 0f 85 71 ff ff ff 48 85 ed 0f 84 68 ff ff ff eb b2 <0f> 0b 45 31 ed eb 8d bf 01 00 00 00 e8 d9 1e c5 ff 65 4c 8b 34 25
>> [   78.644596] RSP: 0018:ffffa04b41eefbe0 EFLAGS: 00010206
>> [   78.644597] RAX: fffff7acc40de180 RBX: ffffa04b41eefd80 RCX: 002ffff80000a014
>> [   78.644598] RDX: 0000000000002000 RSI: 0000000000001000 RDI: fffff7acc0000000
>> [   78.644598] RBP: fffff7acc40de180 R08: ffff8f1703786000 R09: 0000000000002000
>> [   78.644599] R10: 00000000000002c0 R11: ffff8f16b3f31400 R12: 0000000000000000
>> [   78.644600] R13: 0000000000002000 R14: 0000000000000008 R15: 0000000000003800
>> [   78.644601] FS:  0000000000000000(0000) GS:ffff8f1739c00000(0000) knlGS:0000000000000000
>> [   78.644601] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [   78.644602] CR2: 00007f0f54772000 CR3: 000000012e670004 CR4: 00000000003606f0
>> [   78.644624] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [   78.644625] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [   78.644625] Call Trace:
>> [   78.644631]  skb_copy_datagram_iter+0x16f/0x270
>> [   78.644633]  tcp_recvmsg+0x223/0xc30
>> [   78.644637]  inet_recvmsg+0x4b/0xe0
>> [   78.644644]  iscsit_do_rx_data.constprop.9+0x89/0x110 [iscsi_target_mod]
>> [   78.644650]  rx_data+0x58/0x70 [iscsi_target_mod]
>> [   78.644654]  iscsit_get_rx_pdu+0xa7b/0xd20 [iscsi_target_mod]
>> [   78.644657]  ? preempt_count_sub+0x43/0x50
>> [   78.644659]  ? _raw_spin_unlock_irq+0x22/0x40
>> [   78.644663]  ? iscsi_target_tx_thread+0x1d0/0x1d0 [iscsi_target_mod]
>> [   78.644667]  ? iscsi_target_rx_thread+0x71/0xd0 [iscsi_target_mod]
>> [   78.644670]  iscsi_target_rx_thread+0x71/0xd0 [iscsi_target_mod]
>> [   78.644672]  kthread+0x116/0x130
>> [   78.644673]  ? kthread_create_worker_on_cpu+0x40/0x40
>> [   78.644675]  ret_from_fork+0x24/0x50
>> [   78.644678] ---[ end trace a0d6d5ab0e8625ee ]---
>> [   78.644725]  connection1:0: detected conn error (1020)
>> ...
>> [  108.859031]  connection1:0: detected conn error (1020)
>> [  108.859104] sd 3:0:0:0: [sdc] tag#0 FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
>> [  108.859106] sd 3:0:0:0: [sdc] tag#0 CDB: Write(10) 2a 00 00 00 11 a2 00 00 50 00
>> [  108.859107] print_req_error: I/O error, dev sdc, sector 4514
>> ...


TO REPRODUCE:

I used openSUSE Tumbleweed using a 4.20 kernel, but any recent kernel
should display this issue.

I used the same system for both the iSCSI initiator (using open-iscsi)
and the iscsi target (using targetcli-fb). [I believe you _could_ use
a different system for the initiator, but I did not test this.]

I set up a file-based iSCSI target of size 5G using the targetcli
shell, with an ACL allowing the local iscsi initiator to connect. The
backing store file's contents do not matter, but I initialized my file
to all zeros.

I then connect from the same system using "iscsiadm -m discovery
$HOST" to discover the new target, then "iscsiadm -m node -l" to login
to the discovered target. [Any connection strategy should work,
though.] This should create a new disc device. In my case, it was
"/dev/sdc", since I already have two hard discs.

I then do writes to the new device using "dd". There must be enough
writes, so I wrote 100 64k blocks of zeros:

> dd if=/dev/zero of=/dev/sdc bs=64k count=100

Observe "dmesg --follow" output from the kernel to see the errors and
stack traces.
--
Lee Duncan



[Index of Archives]     [Linux SCSI]     [Kernel Newbies]     [Linux SCSI Target Infrastructure]     [Share Photos]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Device Mapper]

  Powered by Linux