On Fri, Mar 27, 2020 at 11:04 AM 陈安庆 <chenanqing@xxxxxxxx> wrote: > > > On Fri, Mar 27, 2020 at 9:36 AM <chenanqing@xxxxxxxx> wrote: > > > > From: Chen Anqing <chenanqing@xxxxxxxx> > > To: Ilya Dryomov <idryomov@xxxxxxxxx> > > Cc: Jeff Layton <jlayton@xxxxxxxxxx>, > > Sage Weil <sage@xxxxxxxxxx>, > > Jakub Kicinski <kuba@xxxxxxxxxx>, > > ceph-devel@xxxxxxxxxxxxxxx, > > netdev@xxxxxxxxxxxxxxx, > > linux-kernel@xxxxxxxxxxxxxxx, > > chenanqing@xxxxxxxx > > Subject: [PATCH] libceph: we should take compound page into account > > also > > Date: Fri, 27 Mar 2020 04:36:30 -0400 > > Message-Id: <20200327083630.36296-1-chenanqing@xxxxxxxx> > > X-Mailer: git-send-email 2.18.2 > > > > the patch is occur at a real crash,which slab is come from a compound > > page,so we need take the compound page into account also. > > fixed commit 7e241f647dc7 ("libceph: fall back to sendmsg for slab pages")' > > > > Signed-off-by: Chen Anqing <chenanqing@xxxxxxxx> > > --- > > net/ceph/messenger.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c index > > f8ca5edc5f2c..e08c1c334cd9 100644 > > --- a/net/ceph/messenger.c > > +++ b/net/ceph/messenger.c > > @@ -582,7 +582,7 @@ static int ceph_tcp_sendpage(struct socket *sock, struct page *page, > > * coalescing neighboring slab objects into a single frag which > > * triggers one of hardened usercopy checks. > > */ > > - if (page_count(page) >= 1 && !PageSlab(page)) > > + if (page_count(page) >= 1 && !PageSlab(compound_head(page))) > > sendpage = sock->ops->sendpage; > > else > > sendpage = sock_no_sendpage; > > >Hi Chen, > > >AFAICT compound pages should already be taken into account, because PageSlab is defined as: > > > > __PAGEFLAG(Slab, slab, PF_NO_TAIL) > > > > #define __PAGEFLAG(uname, lname, policy) \ > > TESTPAGEFLAG(uname, lname, policy) \ > > __SETPAGEFLAG(uname, lname, policy) \ > > __CLEARPAGEFLAG(uname, lname, policy) > > > > #define TESTPAGEFLAG(uname, lname, policy) \ > > static __always_inline int Page##uname(struct page *page) \ > > { return test_bit(PG_##lname, &policy(page, 0)->flags); } > > > > and PF_NO_TAIL policy is defined as: > > > #define PF_NO_TAIL(page, enforce) ({ \ > > VM_BUG_ON_PGFLAGS(enforce && PageTail(page), page); \ > > PF_POISONED_CHECK(compound_head(page)); }) > > > So compound_head() is called behind the scenes. > > >Could you please explain what crash did you observe in more detail? > >Perhaps you backported this patch to an older kernel? > > >Thanks, > > > Ilya > Hi llya, > thank you for you reply so quickly. > I have apply the patch in my server ,it's work fine ,so I thought it should be pushed to the community, but I Know nothing about the > PageSlab has been changed ,because I use the 3.10.0- all the time ,sorry for that. i also send patch to scsi group also. No problem at all, pushing fixes upstream is a good thing! So yes, if you are backporting to 3.10 on your own, you need this patch. FWIW libceph has been patched in kernel-3.10.0-957.10.1.el7. Can't say anything about tgtd though. > > my crash is writed below: > > [85774.558604] usercopy: kernel memory exposure attempt detected from ffff9cba0bf75400 (kmalloc-512) (1024 bytes) > [85774.559261] ------------[ cut here ]------------ > [85774.559839] kernel BUG at mm/usercopy.c:72! > [85774.560367] invalid opcode: 0000 [#1] SMP > [85774.560879] Modules linked in: cmac arc4 md4 nls_utf8 cifs ccm dns_resolver xfs iscsi_tcp libiscsi_tcp libiscsi iptable_raw iptable_mangle sch_sfq sch_htb scsi_transport_iscsi veth ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_comment xt_mark iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter xt_conntrack nf_nat nf_conntrack br_netfilter bridge stp llc dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c loop bonding fuse sunrpc dm_mirror dm_region_hash dm_log dm_mod dell_smbios dell_wmi_descriptor iTCO_wdt iTCO_vendor_support dcdbas skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_ssif sg pcspkr mei_me lpc_ich i2c_i801 mei wmi ipmi_si > [85774.564343] ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm megaraid_sas drm ixgbe ahci igb drm_panel_orientation_quirks libahci mdio libata ptp pps_core dca i2c_algo_bit nfit libnvdimm > [85774.566809] CPU: 9 PID: 28054 Comm: tgtd Kdump: loaded Not tainted 3.10.0-957.27.2.el7.x86_64 #1 > [85774.567446] Hardware name: Dell Inc. PowerEdge R740/0YNX56, BIOS 2.4.8 11/26/2019 > [85774.568094] task: ffff9cb12e1e0000 ti: ffff9cb124224000 task.ti: ffff9cb124224000 > [85774.568754] RIP: 0010:[<ffffffff9803f557>] [<ffffffff9803f557>] __check_object_size+0x87/0x250 > [85774.569419] RSP: 0018:ffff9cb124227b98 EFLAGS: 00010246 > [85774.570072] RAX: 0000000000000062 RBX: ffff9cba0bf75400 RCX: 0000000000000000 > [85774.570723] RDX: 0000000000000000 RSI: ffff9cc13bf13898 RDI: ffff9cc13bf13898 > [85774.571372] RBP: ffff9cb124227bb8 R08: 0000000000000000 R09: ffff9cb1313e6f00 > [85774.572017] R10: 000000000003bc95 R11: 0000000000000001 R12: 0000000000000400 > [85774.572669] R13: 0000000000000001 R14: ffff9cba0bf75800 R15: 0000000000000400 > [85774.573325] FS: 00007f41a122a740(0000) GS:ffff9cc13bf00000(0000) knlGS:0000000000000000 > [85774.573994] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [85774.574655] CR2: 0000000003236fe0 CR3: 0000001023138000 CR4: 00000000007607e0 > [85774.575314] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [85774.575964] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [85774.576609] PKRU: 55555554 > [85774.577242] Call Trace: > [85774.577880] [<ffffffff9818dd9d>] memcpy_toiovec+0x4d/0xb0 > [85774.578531] [<ffffffff9842c858>] skb_copy_datagram_iovec+0x128/0x280 > [85774.579190] [<ffffffff9849372a>] tcp_recvmsg+0x22a/0xb30 > [85774.579838] [<ffffffff984c2340>] inet_recvmsg+0x80/0xb0 > [85774.580474] [<ffffffff9841a6ec>] sock_aio_read.part.9+0x14c/0x170 > [85774.581097] [<ffffffff9841a731>] sock_aio_read+0x21/0x30 > [85774.581714] [<ffffffff98041b33>] do_sync_read+0x93/0xe0 > [85774.582328] [<ffffffff98042615>] vfs_read+0x145/0x170 > [85774.582934] [<ffffffff9804342f>] SyS_read+0x7f/0xf0 > [85774.583543] [<ffffffff98576ddb>] system_call_fastpath+0x22/0x27 > [85774.584158] Code: 45 d1 48 c7 c6 05 c3 87 98 48 c7 c1 f6 57 88 98 48 0f 45 f1 49 89 c0 4d 89 e1 48 89 d9 48 c7 c7 00 27 88 98 31 c0 e8 30 e4 51 00 <0f> 0b 0f 1f 80 00 00 00 00 48 c7 c0 00 00 e0 97 4c 39 f0 73 0d > [85774.585495] RIP [<ffffffff9803f557>] __check_object_size+0x87/0x250 > [85774.586135] RSP <ffff9cb124227b98> > > crash> dis -l skb_copy_datagram_iovec > /usr/src/debug/kernel-3.10.0-957.27.2.el7/linux-3.10.0-957.27.2.el7.x86_64/net/core/datagram.c: 395 > 0xffffffff9842c730 <skb_copy_datagram_iovec>: nopl 0x0(%rax,%rax,1) [FTRACE NOP] > 0xffffffff9842c735 <skb_copy_datagram_iovec+5>: push %rbp > 0xffffffff9842c736 <skb_copy_datagram_iovec+6>: mov %rsp,%rbp > 0xffffffff9842c739 <skb_copy_datagram_iovec+9>: push %r15 > 0xffffffff9842c73b <skb_copy_datagram_iovec+11>: push %r14 > 0xffffffff9842c73d <skb_copy_datagram_iovec+13>: mov %rdi,%r14 > 0xffffffff9842c740 <skb_copy_datagram_iovec+16>: push %r13 > 0xffffffff9842c742 <skb_copy_datagram_iovec+18>: push %r12 > 0xffffffff9842c744 <skb_copy_datagram_iovec+20>: mov %esi,%r12d > 0xffffffff9842c747 <skb_copy_datagram_iovec+23>: push %rbx----------------:skb is store in rbx and I get it from stack > 0xffffffff9842c748 <skb_copy_datagram_iovec+24>: mov %ecx,%ebx > 0xffffffff9842c74a <skb_copy_datagram_iovec+26>: sub $0x28,%rsp > > crash> sk_buff.len ffff9cb0e3b388f8 > len = 1024 > > crash> sk_buff.data_len ffff9cb0e3b388f8 > data_len = 1024 > > crash> sk_buff.head ffff9cb0e3b388f8 > head = 0xffff9cbf9c679400 "" > > crash> sk_buff.end ffff9cb0e3b388f8 -x > end = 0x2c0 > crash> px 0xffff9cbf9c679400 + 0x2c0 > $5 = 0xffff9cbf9c6796c0 > > crash> skb_shared_info 0xffff9cbf9c6796c0 > struct skb_shared_info { > nr_frags = 1 '\001', > tx_flags = 32 ' ', > gso_size = 0, > gso_segs = 1, > gso_type = 0, > frag_list = 0x0, > hwtstamps = { > > > crash> skb_shared_info.frags 0xffff9cbf9c6796c0 > frags = {{ > page = { > p = 0xfffff597a42fdd40-----it the page which store the data > }, > page_offset = 1024, > size = 1024 > }, { > > crash> kmem ffff9cba0bf75400------------the address which is reported in the bugon line, > CACHE NAME OBJSIZE ALLOCATED TOTAL SLABS SSIZE > ffff9ca27fc07600 kmalloc-512 512 242454 464512 7258 32k > SLAB MEMORY NODE TOTAL ALLOCATED FREE > fffff597a42fdc00 ffff9cba0bf70000 1 64 59 5 > FREE / [ALLOCATED] > [ffff9cba0bf75400] > > PAGE PHYSICAL MAPPING INDEX CNT FLAGS > fffff597a42fdd40 190bf75000 0 0 0 6fffff00008000 tail > > and I found the page is not the head page ,it is a tail page of compound page. > > so I search the linux code ,I get the patch: > commit 7e241f647dc7 ("libceph: fall back to sendmsg for slab pages")' > commit 08b11eaccfcf ("scsi: libiscsi: fall back to sendmsg for slab pages"). Thanks, Ilya