Re: Bug Report: Crash due to invalid virtual address when running Parted

Hi Ming,
Thanks for the patch. With it applied, I no longer see the crash; I ran the script for over 5000 iterations. It seems to me that your fix is still relevant to Linus' tree, since it fixes the way the kobject is initialized and avoids a race condition. Please feel free to use my results. I can provide more information if needed.

Thanks,
Naveen

On 3/29/2016 2:51 AM, Ming Lei wrote:
On Tue, Mar 29, 2016 at 12:57 PM, Ming Lei <tom.leiming@xxxxxxxxx> wrote:
Hi Naveen,

Thanks for reporting the issue.

On Tue, Mar 29, 2016 at 1:21 AM, Naveen Kaje <nkaje@xxxxxxxxxxxxxx> wrote:
Hello,
I am seeing a crash with kernel 4.3- and 4.5-based builds on QDF2432. A
similar crash has also been reported on Ubuntu Launchpad:
https://bugs.launchpad.net/ubuntu/+bug/1546439
Crash details from QDF2432:
https://bugs.launchpad.net/ubuntu/+bug/1546439/comments/11
The earlier comments on that bug were reported from different platforms.
With some effort, the issue reproduces fairly consistently when a disk is
repeatedly formatted with Parted. A script to reproduce this crash is
attached.

The crash is due to dereferencing an invalid pointer in blk_account_io_start
-> hd_struct_try_get -> percpu_ref_tryget_live, which gets an invalid
percpu_count.

When I reverted this patch,
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/include/linux/genhd.h?id=6c71013ecb7e2bddbed9f5b95e7aed22c491daa9
the crash did not reproduce (on both 4.3 and 4.5 based builds).
From the data gathered, I suspect corruption of percpu_count.

Even with commit 6c71013ecb7e2 (block: partition: convert percpu ref)
reverted, I can still see the following crash [1] with your test
script.
Hi Naveen,

It looks like the disk device leak exists only in the -next kernel; there is
no such issue with Linus' tree.

So could you test the patch attached to my last email to see whether it fixes
your issue?

Thanks,

But there is one issue with that commit: the percpu_ref of a partition
should be initialized before KOBJ_ADD is sent out, because that event may
cause userspace to read the partition table. The attached patch should fix
this issue. You may try it to see whether it makes any difference in your
test.
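
To illustrate the ordering problem described above, here is a minimal user-space sketch in C. It is not the attached patch and not kernel code: every name in it (part_model, part_model_tryget, add_partition_announce_first, add_partition_init_first) is hypothetical, a plain heap-allocated long stands in for the per-CPU counter, and a callback stands in for userspace reacting to the KOBJ_ADD event by issuing I/O.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a partition and its reference counter. */
struct part_model {
    long *ref_count;  /* models the per-CPU counter; NULL until initialized */
    bool announced;   /* models the KOBJ_ADD event having been sent */
};

/* Callback type modeling a listener that reacts to the announcement. */
typedef bool (*listener_fn)(struct part_model *);

/*
 * Models an I/O path taking a reference on the partition.
 * The NULL check exists only so the model can report the problem
 * instead of crashing on an uninitialized counter.
 */
static bool part_model_tryget(struct part_model *p)
{
    if (p->ref_count == NULL)
        return false;
    ++*p->ref_count;
    return true;
}

/* Allocates the counter and sets the initial reference. Returns 0 on success. */
static int part_model_init_ref(struct part_model *p)
{
    p->ref_count = calloc(1, sizeof(*p->ref_count));
    if (p->ref_count == NULL)
        return -1;
    *p->ref_count = 1;
    return 0;
}

/* Marks the partition as announced and lets the listener react right away. */
static bool part_model_announce(struct part_model *p, listener_fn listener)
{
    p->announced = true;
    return listener(p);
}

/* Problematic order: the listener can run before the counter exists. */
static bool add_partition_announce_first(struct part_model *p, listener_fn l)
{
    bool io_ok = part_model_announce(p, l);
    (void)part_model_init_ref(p);
    return io_ok;
}

/* Intended order: the counter is ready before anyone is told about it. */
static bool add_partition_init_first(struct part_model *p, listener_fn l)
{
    if (part_model_init_ref(p) != 0)
        return false;
    return part_model_announce(p, l);
}
```

With part_model_tryget passed as the listener, add_partition_announce_first reports a failed reference grab because the counter is not yet allocated when the listener runs, while add_partition_init_first succeeds. In a real kernel the window is a race rather than a deterministic failure, so this sketch shows only the ordering, not the timing.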

Has this issue been reported elsewhere, and is there a known fix? Please let
me know how I can help further in fixing this crash.

Thanks,
Naveen

[1] crash log with 6c71013ecb7e2 reverted

ming@ming:~/git/vm-test$ sudo ./parted.sh /dev/nvme0n1
[   65.032376] ------------[ cut here ]------------
[   65.033231] kernel BUG at mm/percpu.c:692!
[   65.033637] invalid opcode: 0000 [#1] PREEMPT SMP
[   65.034144] Dumping ftrace buffer:
[   65.034496]    (ftrace buffer empty)
[   65.034851] Modules linked in: xt_conntrack ipt_REJECT
nf_reject_ipv4 ebtable_filter ebtables ip6table_filter ip6_tables
xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4
iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat
nf_conntrack xt_tcpudp bridge stp llc iptable_filter ip_tables
x_tables iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nbd
btrfs xor raid6_pq psmouse multipath bcache ahci libahci libata nvme
nvme_core nd_pmem nd_btt serio_raw 8250_fintek null_blk configs
autofs4
[   65.036006] CPU: 0 PID: 2614 Comm: parted Tainted: G        W
4.6.0-rc1-next-20160327+ #1855
[   65.036006] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
BIOS rel-1.9.0-0-g01a84be-prebuilt.qemu-project.org 04/01/2014
[   65.036006] task: ffff8800799c0c40 ti: ffff88026b744000 task.ti:
ffff88026b744000
[   65.036006] RIP: 0010:[<ffffffff8117cf3b>]  [<ffffffff8117cf3b>]
pcpu_free_area+0x19b/0x1a0
[   65.036006] RSP: 0018:ffff88026b747cc8  EFLAGS: 00010097
[   65.036006] RAX: 0000000000000799 RBX: 0000000000000799 RCX: 0000000000000799
[   65.036006] RDX: 0000000000000010 RSI: 0000000000009641 RDI: ffffc900000e4000
[   65.036006] RBP: ffff88026b747cf8 R08: 0000000000009640 R09: 0000000000000ff0
[   65.036006] R10: ffff88027fc18a20 R11: ffff88027aa88540 R12: 0000000000000012
[   65.036006] R13: ffff88027b2e9a00 R14: ffff88026b747d0c R15: 0000000000080000
[   65.036006] FS:  00007f6c04f9b840(0000) GS:ffff88027fc00000(0000)
knlGS:0000000000000000
[   65.036006] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   65.036006] CR2: 000000000178b600 CR3: 000000026b657000 CR4: 00000000000006f0
[   65.036006] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   65.036006] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   65.036006] Stack:
[   65.036006]  0000001081084473 ffffe8ffffc09640 ffff88027b2e9a00
0000000000000282
[   65.036006]  ffff88027ffc7e40 0000000000080000 ffff88026b747d38
ffffffff8117d486
[   65.036006]  ffff880279490800 ffff880279490870 ffff880279490800
ffff8802788e7a80
[   65.036006] Call Trace:
[   65.036006]  [<ffffffff8117d486>] free_percpu+0x96/0x180
[   65.036006]  [<ffffffff8138df96>] disk_release+0x76/0xd0
[   65.036006]  [<ffffffff8149f722>] device_release+0x32/0xa0
[   65.036006]  [<ffffffff813aa1d7>] kobject_cleanup+0x77/0x190
[   65.036006]  [<ffffffff813aa085>] kobject_put+0x25/0x50
[   65.036006]  [<ffffffff8138cc37>] put_disk+0x17/0x20
[   65.036006]  [<ffffffff8120bc6b>] __blkdev_put+0x1eb/0x2b0
[   65.036006]  [<ffffffff8120c57e>] blkdev_put+0x4e/0x120
[   65.036006]  [<ffffffff8120c705>] blkdev_close+0x25/0x30
[   65.036006]  [<ffffffff811d3486>] __fput+0xd6/0x210
[   65.036006]  [<ffffffff811d35fe>] ____fput+0xe/0x10
[   65.036006]  [<ffffffff8107d487>] task_work_run+0x77/0x90
[   65.036006]  [<ffffffff8105c271>] exit_to_usermode_loop+0x73/0x98
[   65.036006]  [<ffffffff8100295d>] syscall_return_slowpath+0x3d/0x50
[   65.036006]  [<ffffffff8169a2ba>] entry_SYSCALL_64_fastpath+0xa2/0xa4
[   65.036006] Code: e0 41 8d 44 24 fe 89 c2 85 c0 b8 01 00 00 00 0f
4f c2 89 45 d4 e9 ab fe ff ff 8b 05 10 cb b7 00 83 e8 01 89 45 d4 e9
9a fe ff ff <0f> 0b 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 41 56 49 89 d6
41 55
[   65.036006] RIP  [<ffffffff8117cf3b>] pcpu_free_area+0x19b/0x1a0
[   65.036006]  RSP <ffff88026b747cc8>
[   65.036006] ---[ end trace 31dba1d1f5b04248 ]---

--
Ming Lei


