On Wed, Jan 09, 2019 at 02:32:52AM +0100, Jiri Kosina wrote:
> On Tue, 8 Jan 2019, Pandruvada, Srinivas wrote:
>
> > Hi Peter,
> >
> > We are observing some random crashes in the latest kernel in the v5.0-
> > rc1 kernel ( < 5% of times). This is pointing to a line 196, which is
> > added by commit "HID: core: store the collections as a basic tree".
> > We can't reproduce by reverting this and dependent patches.
> >
> >
> > [ 6.980413] BUG: unable to handle kernel paging request at
> > 0000000100000019
> > [ 6.980441] #PF error: [normal kernel read fault]
> > [ 6.980455] PGD 0 P4D 0
> > [ 6.980466] Oops: 0000 [#1] SMP PTI
> > [ 6.980478] CPU: 7 PID: 288 Comm: systemd-udevd Not tainted 5.0.0-
> > rc1-intel-next+ #1
> > [ 6.980498] Hardware name: Intel Corporation Kabylake Client
> > platform/Kabylake R DDR4 RVP, BIOS KBLSE2R1.R00.X127.P00.1804200616
> > 04/20/2018
> > [ 6.980529] RIP: 0010:hid_parser_main+0xd2/0x300 [hid]
> > [ 6.980544] Code: 5d 41 5e 41 5f 5d c3 8b 83 e8 80 01 00 85 c0 0f 84
> > e8 01 00 00 83 e8 01 89 83 e8 80 01 00 48 8b 83 f0 80 01 00 48 85 c0 74
> > 0a <48> 8b 00 48 89 83 f0 80 01 00 45 31 e4 eb ad 31 f6 48 89 df e8 f5
> > [ 6.980584] RSP: 0018:ffffba03413a7930 EFLAGS: 00010202
> > [ 6.980599] RAX: 0000000100000019 RBX: ffffba03413c9000 RCX:
> > 000000000000000c
> > [ 6.980618] RDX: 0000000000000000 RSI: ffffba03413a7978 RDI:
> > ffffba03413c9000
> > [ 6.980637] RBP: ffffba03413a7958 R08: 000000000000000c R09:
> > ffffba03413c90cc
> > [ 6.980655] R10: ffff9ed0a73b8000 R11: ffff9ed0a7e28000 R12:
> > ffff9ed0a7e28000
> > [ 6.980674] R13: ffffba03413c9000 R14: ffff9ed0a6e70ded R15:
> > ffff9ed0a6d39918
> > [ 6.980692] FS: 00007f92f5113680(0000) GS:ffff9ed0b6bc0000(0000)
> > knlGS:0000000000000000
> > [ 6.980712] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 6.980728] CR2: 0000000100000019 CR3: 00000002a6cc4004 CR4:
> > 00000000003606e0
> > [ 6.980746] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> > 0000000000000000
> > [ 6.980765] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> > 0000000000000400
> > [ 6.980783] Call Trace:
> > [ 6.980795] hid_open_report+0x127/0x2b0 [hid]
> > [ 6.980811] sensor_hub_probe+0x83/0x420 [hid_sensor_hub]
> > [ 6.980827] ? hid_match_id+0x2f/0x50 [hid]
> > [ 6.980842] hid_device_probe+0x10c/0x170 [hid]
> > [ 6.980858] really_probe+0x22c/0x410
> > [ 6.980876] driver_probe_device+0x11a/0x140
> > [ 6.980895] __device_attach_driver+0x8f/0x100
> > [ 6.980917] ? __driver_attach+0x120/0x120
> > [ 6.980951] bus_for_each_drv+0x69/0xb0
> > [ 6.980971] ? hid_destroy_device+0x60/0x60 [hid]
> > [ 6.980993] __device_attach+0xdd/0x160
> > [ 6.981012] ? __hid_register_driver+0x80/0x80 [hid]
> > [ 6.981034] ? hid_destroy_device+0x60/0x60 [hid]
> > [ 6.981067] device_attach+0x10/0x20
> > [ 6.981080] bus_rescan_devices_helper+0x47/0x80
> > [ 6.981094] device_reprobe+0x59/0x80
> > [ 6.981107] __hid_bus_reprobe_drivers+0x63/0x70 [hid]
> > [ 6.981123] bus_for_each_dev+0x6a/0xc0
> > [ 6.981136] __hid_bus_driver_added+0x2c/0x40 [hid]
> > [ 6.981150] bus_for_each_drv+0x69/0xb0
> > [ 6.981163] __hid_register_driver+0x6f/0x80 [hid]
> > [ 6.981178] ? 0xffffffffc0129000
> > [ 6.981190] sensor_hub_driver_init+0x23/0x1000 [hid_sensor_hub]
> > [ 6.981208] do_one_initcall+0x52/0x1d7
> > [ 6.981222] ? _cond_resched+0x1a/0x50
> > [ 6.981235] ? kmem_cache_alloc_trace+0x170/0x1d0
> > [ 6.981250] do_init_module+0x5f/0x221
> > [ 6.981263] load_module+0x264b/0x2ae0
> > [ 6.981277] __do_sys_finit_module+0xe5/0x120
> > [ 6.981290] ? __do_sys_finit_module+0xe5/0x120
> > [ 6.981305] __x64_sys_finit_module+0x1a/0x20
> > [ 6.981320] do_syscall_64+0x5a/0x140
> > [ 6.981332] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > [ 6.981347] RIP: 0033:0x7f92f4c1d839
> > [ 6.981360] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00
> > 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24
> > 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8
> > 64 89 01 48
> > [ 6.981399] RSP: 002b:00007ffd6be39ff8 EFLAGS: 00000246 ORIG_RAX:
> > 0000000000000139
> > [ 6.981419] RAX: ffffffffffffffda RBX: 000055bd0d69d940 RCX:
> > 00007f92f4c1d839
> > [ 6.981437] RDX: 0000000000000000 RSI: 00007f92f48fc0e5 RDI:
> > 0000000000000005
> > [ 6.981455] RBP: 00007f92f48fc0e5 R08: 0000000000000000 R09:
> > 00007ffd6be3a110
> > [ 6.981473] R10: 0000000000000005 R11: 0000000000000246 R12:
> > 0000000000000000
> > [ 6.981492] R13: 000055bd0d697db0 R14: 0000000000020000 R15:
> > 000055bd0d69d940
> > [ 6.981510] Modules linked in: hid_sensor_hub(+) intel_ishtp_hid
> > hid_generic intel_ish_ipc ahci sdhci_pci cqhci e1000e usbhid libahci
> > sdhci hid intel_ishtp wmi pinctrl_sunrisepoint pinctrl_intel
> > [ 6.981553] CR2: 0000000100000019
> > [ 6.981565] ---[ end trace fbbd8d33ebb5ae31 ]---
> > [ 6.981580] RIP: 0010:hid_parser_main+0xd2/0x300 [hid]
> > [ 6.981595] Code: 5d 41 5e 41 5f 5d c3 8b 83 e8 80 01 00 85 c0 0f 84
> > e8 01 00 00 83 e8 01 89 83 e8 80 01 00 48 8b 83 f0 80 01 00 48 85 c0 74
> > 0a <48> 8b 00 48 89 83 f0 80 01 00 45 31 e4 eb ad 31 f6 48 89 df e8 f5
> > [ 6.981635] RSP: 0018:ffffba03413a7930 EFLAGS: 00010202
> > [ 6.981649] RAX: 0000000100000019 RBX: ffffba03413c9000 RCX:
> > 000000000000000c
> > [ 6.981668] RDX: 0000000000000000 RSI: ffffba03413a7978 RDI:
> > ffffba03413c9000
> > [ 6.981686] RBP: ffffba03413a7958 R08: 000000000000000c R09:
> > ffffba03413c90cc
> > [ 6.981705] R10: ffff9ed0a73b8000 R11: ffff9ed0a7e28000 R12:
> > ffff9ed0a7e28000
> > [ 6.981723] R13: ffffba03413c9000 R14: ffff9ed0a6e70ded R15:
> > ffff9ed0a6d39918
> > [ 6.981742] FS: 00007f92f5113680(0000) GS:ffff9ed0b6bc0000(0000)
> > knlGS:0000000000000000
> > [ 6.981762] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 6.981778] CR2: 0000000100000019 CR3: 00000002a6cc4004 CR4:
> > 00000000003606e0
> > [ 6.981796] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> > 0000000000000000
> > [ 6.981815] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> > 0000000000000400
> > [ 6.997410] EXT4-fs (nvme0n1p2): mounted filesystem with ordered
> > data mode. Opts: (null)
>
> This is by no way meant as a final fix, but just to verify whether this is
> the race condition I think it is -- does the patch below cure the
> symptoms?

Close enough :) See my patch in the other email; we crossed streams here.

The problem, in addition to the stale active_collection, is that each
collection->parent still points into the previous memory region, which
has since been freed. So we could either reset all parent pointers on
realloc (doable with a bit of pointer maths) or switch to indices as in
that patch. Uglier, but safer.
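
To make the pointer-maths option concrete, here is a minimal userspace
sketch, not kernel code. The struct layout, the `collection_array` wrapper,
and the names `grow_and_rebase()` and `demo_rebase()` are all assumptions
made for illustration; calloc()/free() stand in for the kernel allocator.
Each parent pointer is converted to an offset within the old block and
re-applied to the new block before the old block is released.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for struct hid_collection; NOT the kernel layout. */
struct collection {
	struct collection *parent;	/* NULL for a top-level collection */
	unsigned int usage;
};

/* Hypothetical wrapper standing in for the device's collection array. */
struct collection_array {
	struct collection *items;
	size_t used;
	size_t size;
};

/*
 * Double the array and rebase every parent pointer into the new block.
 * Returns 0 on success, -1 on allocation failure (array left untouched).
 */
static int grow_and_rebase(struct collection_array *arr)
{
	struct collection *old = arr->items;
	struct collection *grown;
	size_t i;

	assert(arr->size > 0);

	grown = calloc(arr->size * 2, sizeof(*grown));
	if (!grown)
		return -1;

	memcpy(grown, old, arr->used * sizeof(*grown));

	/* Copied parents still point into 'old': turn each into an index. */
	for (i = 0; i < arr->used; i++) {
		if (grown[i].parent)
			grown[i].parent = grown + (grown[i].parent - old);
	}

	arr->items = grown;
	arr->size *= 2;
	free(old);	/* safe only now: nothing refers to the old block */
	return 0;
}

/* Builds a parent/child pair, grows the array, checks the link survived. */
static int demo_rebase(void)
{
	struct collection_array arr = { NULL, 0, 2 };
	int ok;

	arr.items = calloc(arr.size, sizeof(*arr.items));
	if (!arr.items)
		return -1;

	arr.items[0].usage = 0x20;
	arr.items[1].usage = 0x21;
	arr.items[1].parent = &arr.items[0];
	arr.used = 2;

	if (grow_and_rebase(&arr)) {
		free(arr.items);
		return -1;
	}

	ok = arr.size == 4 &&
	     arr.items[0].parent == NULL &&
	     arr.items[1].parent == &arr.items[0];
	free(arr.items);
	return ok ? 0 : 1;
}
```

The index variant would store something like a parent index (with a
sentinel for "no parent") instead of a pointer, so growing the array needs
no fixup pass at all. That is the trade-off mentioned above: slightly
uglier lookups, but nothing can dangle after a reallocation.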

Cheers,
   Peter

> diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
> index f41d5fe51abe..b0fc95e6a0b7 100644
> --- a/drivers/hid/hid-core.c
> +++ b/drivers/hid/hid-core.c
> @@ -123,7 +123,7 @@ static struct hid_field *hid_register_field(struct hid_report *report, unsigned
>
>  static int open_collection(struct hid_parser *parser, unsigned type)
>  {
> -	struct hid_collection *collection;
> +	struct hid_collection *collection, *old_collection;
>  	unsigned usage;
>
>  	usage = parser->local.usage[0];
> @@ -159,9 +159,11 @@ static int open_collection(struct hid_parser *parser, unsigned type)
>  		memset(collection + parser->device->collection_size, 0,
>  			sizeof(struct hid_collection) *
>  			parser->device->collection_size);
> -		kfree(parser->device->collection);
> +		old_collection = parser->device->collection;
>  		parser->device->collection = collection;
>  		parser->device->collection_size *= 2;
> +		parser->active_collection = collection;
> +		kfree(old_collection);
>  	}
>
>  	parser->collection_stack[parser->collection_stack_ptr++] =
>
> --
> Jiri Kosina
> SUSE Labs
>