On Tue, Mar 29, 2016 at 12:57 PM, Ming Lei <tom.leiming@xxxxxxxxx> wrote:
> Hi Naveen,
>
> Thanks for reporting the issue.
>
> On Tue, Mar 29, 2016 at 1:21 AM, Naveen Kaje <nkaje@xxxxxxxxxxxxxx> wrote:
>> Hello,
>> I am seeing a crash with Kernel 4.3 and 4.5 based builds on QDF2432. A
>> similar crash is also being reported on Ubuntu Launchpad.
>> https://bugs.launchpad.net/ubuntu/+bug/1546439
>> Crash Details from QDF2432
>> https://bugs.launchpad.net/ubuntu/+bug/1546439/comments/11.
>> The earlier comments on this issue are reported from different platforms.
>> The issue reproduces with effort fairly consistently when a disk is
>> repeatedly formatted with Parted. A script to reproduce this crash is
>> attached.
>>
>> The crash is due to dereferencing an invalid pointer in blk_account_io_start
>> -> hd_struct_try_get -> percpu_ref_tryget_live, which gets an invalid
>> percpu_count.
>>
>> When I reverted this patch,
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/include/linux/genhd.h?id=6c71013ecb7e2bddbed9f5b95e7aed22c491daa9
>> the crash did not reproduce (on both 4.3 and 4.5 based builds).
>> From the data gathered, I suspect corruption of percpu_count.
>
> Even the commit 6c71013ecb7e2(block: partition: convert percpu ref) is
> reverted, I still can see the following crash[1] with your test
> script.

Hi Naveen,

It looks like the disk device leak exists only in the -next kernel; there is
no such issue with the Linus tree.

So could you test the patch attached to my last email to see whether it
fixes your issue?

Thanks,

>
> But there is one issue with that commit: initialization of percpu_ref
> of partition
> should have been done before sending out KOBJ_ADD, which may cause
> userspace to read partition table. The attached patch should fix this issue.
> You may try this patch to see if it can make any difference in your test.
>
>>
>> Is this issue reported elsewhere? and do we know a fix? Please let me know
>> how I can help further to fix this crash.
>>
>> Thanks,
>> Naveen
>>
>
> [1] crash log with 6c71013ecb7e2 reverted
>
> ming@ming:~/git/vm-test$ sudo ./parted.sh /dev/nvme0n1
> [ 65.032376] ------------[ cut here ]------------
> [ 65.033231] kernel BUG at mm/percpu.c:692!
> [ 65.033637] invalid opcode: 0000 [#1] PREEMPT SMP
> [ 65.034144] Dumping ftrace buffer:
> [ 65.034496] (ftrace buffer empty)
> [ 65.034851] Modules linked in: xt_conntrack ipt_REJECT
> nf_reject_ipv4 ebtable_filter ebtables ip6table_filter ip6_tables
> xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4
> iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat
> nf_conntrack xt_tcpudp bridge stp llc iptable_filter ip_tables
> x_tables iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nbd
> btrfs xor raid6_pq psmouse multipath bcache ahci libahci libata nvme
> nvme_core nd_pmem nd_btt serio_raw 8250_fintek null_blk configs
> autofs4
> [ 65.036006] CPU: 0 PID: 2614 Comm: parted Tainted: G W
> 4.6.0-rc1-next-20160327+ #1855
> [ 65.036006] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> BIOS rel-1.9.0-0-g01a84be-prebuilt.qemu-project.org 04/01/2014
> [ 65.036006] task: ffff8800799c0c40 ti: ffff88026b744000 task.ti:
> ffff88026b744000
> [ 65.036006] RIP: 0010:[<ffffffff8117cf3b>] [<ffffffff8117cf3b>]
> pcpu_free_area+0x19b/0x1a0
> [ 65.036006] RSP: 0018:ffff88026b747cc8 EFLAGS: 00010097
> [ 65.036006] RAX: 0000000000000799 RBX: 0000000000000799 RCX: 0000000000000799
> [ 65.036006] RDX: 0000000000000010 RSI: 0000000000009641 RDI: ffffc900000e4000
> [ 65.036006] RBP: ffff88026b747cf8 R08: 0000000000009640 R09: 0000000000000ff0
> [ 65.036006] R10: ffff88027fc18a20 R11: ffff88027aa88540 R12: 0000000000000012
> [ 65.036006] R13: ffff88027b2e9a00 R14: ffff88026b747d0c R15: 0000000000080000
> [ 65.036006] FS: 00007f6c04f9b840(0000) GS:ffff88027fc00000(0000)
> knlGS:0000000000000000
> [ 65.036006] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 65.036006] CR2: 000000000178b600 CR3: 000000026b657000 CR4: 00000000000006f0
> [ 65.036006] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 65.036006] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 65.036006] Stack:
> [ 65.036006] 0000001081084473 ffffe8ffffc09640 ffff88027b2e9a00
> 0000000000000282
> [ 65.036006] ffff88027ffc7e40 0000000000080000 ffff88026b747d38
> ffffffff8117d486
> [ 65.036006] ffff880279490800 ffff880279490870 ffff880279490800
> ffff8802788e7a80
> [ 65.036006] Call Trace:
> [ 65.036006] [<ffffffff8117d486>] free_percpu+0x96/0x180
> [ 65.036006] [<ffffffff8138df96>] disk_release+0x76/0xd0
> [ 65.036006] [<ffffffff8149f722>] device_release+0x32/0xa0
> [ 65.036006] [<ffffffff813aa1d7>] kobject_cleanup+0x77/0x190
> [ 65.036006] [<ffffffff813aa085>] kobject_put+0x25/0x50
> [ 65.036006] [<ffffffff8138cc37>] put_disk+0x17/0x20
> [ 65.036006] [<ffffffff8120bc6b>] __blkdev_put+0x1eb/0x2b0
> [ 65.036006] [<ffffffff8120c57e>] blkdev_put+0x4e/0x120
> [ 65.036006] [<ffffffff8120c705>] blkdev_close+0x25/0x30
> [ 65.036006] [<ffffffff811d3486>] __fput+0xd6/0x210
> [ 65.036006] [<ffffffff811d35fe>] ____fput+0xe/0x10
> [ 65.036006] [<ffffffff8107d487>] task_work_run+0x77/0x90
> [ 65.036006] [<ffffffff8105c271>] exit_to_usermode_loop+0x73/0x98
> [ 65.036006] [<ffffffff8100295d>] syscall_return_slowpath+0x3d/0x50
> [ 65.036006] [<ffffffff8169a2ba>] entry_SYSCALL_64_fastpath+0xa2/0xa4
> [ 65.036006] Code: e0 41 8d 44 24 fe 89 c2 85 c0 b8 01 00 00 00 0f
> 4f c2 89 45 d4 e9 ab fe ff ff 8b 05 10 cb b7 00 83 e8 01 89 45 d4 e9
> 9a fe ff ff <0f> 0b 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 41 56 49 89 d6
> 41 55
> [ 65.036006] RIP [<ffffffff8117cf3b>] pcpu_free_area+0x19b/0x1a0
> [ 65.036006] RSP <ffff88026b747cc8>
> [ 65.036006] ---[ end trace 31dba1d1f5b04248 ]---
>
> --
> Ming Lei

--
Ming Lei
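
The ordering concern in the quoted message (the partition's percpu_ref
should be initialized before KOBJ_ADD is sent, because the event may prompt
userspace to read the partition table) can be illustrated with a toy
userspace model. This is only a sketch under stated assumptions, not kernel
code and not the attached patch: Partition, init_ref, try_get, add_partition
and userspace_reader are hypothetical names, and the synchronous callback
stands in for the worst case where userspace reacts to the event immediately.

```python
class Partition:
    """Toy stand-in for a partition object (hypothetical, not hd_struct)."""

    def __init__(self, name):
        self.name = name
        # Stands in for the percpu_ref; None means "not initialized yet".
        self.ref = None

    def init_ref(self):
        # One initial reference, and the ref is marked live.
        self.ref = {"count": 1, "live": True}

    def try_get(self):
        # Models a "tryget": only meaningful once the ref is initialized.
        if self.ref is None:
            raise RuntimeError(f"{self.name}: ref used before initialization")
        if not self.ref["live"]:
            return False
        self.ref["count"] += 1
        return True


def userspace_reader(part):
    # Hypothetical listener that issues I/O as soon as it sees the event.
    part.try_get()


def add_partition(name, on_add_event, init_first):
    """Create a partition and announce it.

    on_add_event models userspace reacting to the ADD event.
    init_first selects the ordering: initialize the ref before or after
    the announcement.
    """
    part = Partition(name)
    if init_first:
        part.init_ref()      # safe order: initialize, then announce
        on_add_event(part)
    else:
        on_add_event(part)   # unsafe order: announce, then initialize
        part.init_ref()
    return part
```

With init_first=True the reader takes a reference normally. With
init_first=False the reader touches the ref before it exists, which this
model reports as a RuntimeError instead of dereferencing an invalid pointer.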