On Tue, Mar 15, 2016 at 11:09 PM, Ming Lei <ming.lei@xxxxxxxxxxxxx> wrote: > Hi, > > On Mon, Mar 14, 2016 at 8:00 PM, <clingutla@xxxxxxxxxxxxxx> wrote: >> Hello, >> >> I see possible race in _request_firmware_load function, on which I wanted to >> take your opinion. >> >> When system is going to low power mode, device_cache_fw_images() is called >> from pm notifier which schedules asynchronous workers to cache devices >> firmware. >> If more than 2 async requests falls under user helper mode, then there is a >> use after free of "firmware" directory kobject. > > If two requests are for getting same firmware image, only one can be sent > out and be completed with user helper, please see fw_lookup_and_allocate_buf(), > so the race shouldn't have been triggered in this case. > > So I guess your test must be involved with at least two different requests. > >> >> The race will hit, when one asynchronous worker calls device_del() while >> other asynchronous worker calls get_device_parent() in >> _request_firmware_load(). >> After loading firmware, first worker calls device_del(), and "firmware" >> directory ref count is 1, so cleanup_glue_dir() calls kobject_put() with >> gdp_mutex lock held. >> Meanwhile, second async worker calls device_add(), and get_device_parent() >> is still able to find the firmware directory kobject in ksets after getting >> gdp_mutex. >> This kobject gets added as parent to the second device kobject. >> By the time of accessing the parent kobject, its ref count was 0 and prints >> below warning stack followed by crash. > > The trick here is just that we don't use 'struct device' in > device_cache_fw_images(), > then the firmware device's parent is null during calling device_add() from > _request_firmware_load(), but this should be allowed by driver core. > > From firmware class view, both device_add()/device_del() is run with pair, > and driver core still complains during get_device_parent(), so something > inside driver core should be be wrong, IMO. Thinking about the issue further, I believe it is a false positive which is caused by CONFIG_DEBUG_KOBJECT_RELEASE, follows the story: 1) device_add() path get_device_parent() ...... mutex_lock(&gdp_mutex); /* find our class-directory at the parent and reference it */ spin_lock(&dev->class->p->glue_dirs.list_lock); list_for_each_entry(k, &dev->class->p->glue_dirs.list, entry) if (k->parent == parent_kobj) { kobj = kobject_get(k); break; } spin_unlock(&dev->class->p->glue_dirs.list_lock); if (kobj) { mutex_unlock(&gdp_mutex); return kobj; } k = class_dir_create_and_add(dev->class, parent_kobj); mutex_unlock(&gdp_mutex); ...... 2) device_del() path cleanup_device_parent() ->cleanup_glue_dir() ->mutex_lock(&gdp_mutex); ->kobject_put(glue_dir); ->kobject_cleanup() ->kobject_del() ->kobj_kset_leave() ->mutex_unlock(&gdp_mutex); So gdp_mutex is held wrt. both looking up/adding the parent kobject and removing the parent kobject. Actually with this lock, both glur_dir kobject's referece counting and joining/leaving kset are run atomically. The race you found is triggered by CONFIG_DEBUG_KOBJECT_RELEASE, which causes kobject_cleanup() to be run in workqueue context and the lock of gdp_mutex can't be hold in that situation anymore. That is the root cause for your race. That also said DEBUG_KOBJECT_RELEASE might cause trouble in some situations. Thanks, Ming > >> >> [21147.455038] ------------[ cut here ]------------ >> [21147.458910] WARNING: at <...>/kernel/include/linux/kref.h:47 >> kobject_get+0x50/0x68() >> [21147.481199] Modules linked in: core_ctl(PO) qdrbg_module(O) >> qcrypto_module(O) >> [21147.481256] CPU: 2 PID: 23935 Comm: kworker/u16:8 Tainted: P W O >> 3.10.84-g0957845-00427-g56a05c2 #1 >> [21147.481284] Workqueue: events_unbound async_run_entry_fn >> [21147.481296] Call trace: >> [21147.481315] [<ffffffc000206b60>] dump_backtrace+0x0/0x270 >> [21147.481331] [<ffffffc000206de0>] show_stack+0x10/0x1c >> [21147.481353] [<ffffffc000d569ec>] dump_stack+0x1c/0x28 >> [21147.481370] [<ffffffc00021cc18>] warn_slowpath_common+0x74/0x9c >> [21147.481383] [<ffffffc00021ce4c>] warn_slowpath_null+0x14/0x20 >> [21147.481397] [<ffffffc000456b10>] kobject_get+0x4c/0x68 >> [21147.481416] [<ffffffc000609f64>] get_device_parent+0xa0/0x194 >> [21147.481430] [<ffffffc00060a3f4>] device_add+0x100/0x600 >> [21147.481446] [<ffffffc00061c390>] _request_firmware+0x8b4/0xa80 >> [21147.481459] [<ffffffc00061c58c>] request_firmware+0x30/0x3c >> [21147.481473] [<ffffffc00061c5ec>] cache_firmware+0x54/0xb0 >> [21147.481490] [<ffffffc00061c65c>] __async_dev_cache_fw_image+0x14/0x54 >> [21147.481505] [<ffffffc000243794>] async_run_entry_fn+0x6c/0x12c >> [21147.481521] [<ffffffc000237e88>] process_one_work+0x264/0x3dc >> [21147.481535] [<ffffffc0002392c0>] worker_thread+0x1f0/0x340 >> [21147.481553] [<ffffffc00023e500>] kthread+0xac/0xb8 >> [21147.481563] ---[ end trace dabc98ea48b8ba59 ]--- >> >> [21147.486436] Unable to handle kernel paging request at virtual address >> 0x6b6b6b6b6b6b6bcb >> [21147.493816] pgd = ffffffc0a982d000 >> [21147.498337] [6b6b6b6b6b6b6bcb] *pgd=0000000000000000 >> [21147.502287] Internal error: Oops: 96000004 [#1] PREEMPT SMP >> [21147.507826] Modules linked in: core_ctl(PO) qdrbg_module(O) >> qcrypto_module(O) >> [21147.514951] CPU: 2 PID: 23935 Comm: kworker/u16:8 Tainted: P W O >> 3.10.84-g0957845-00427-g56a05c2 #1 >> [21147.524686] Workqueue: events_unbound async_run_entry_fn >> [21147.529965] task: ffffffc008138000 ti: ffffffc007880000 task.ti: >> ffffffc007880000 >> [21147.537438] PC is at sysfs_create_dir+0x34/0xd4 >> [21147.541949] LR is at kobject_add_internal+0x120/0x24c >> [21147.546979] pc : [<ffffffc000361fa8>] lr : [<ffffffc000456f5c>] pstate: >> 80000145 >> [21147.996295] Process kworker/u16:8 (pid: 23935, stack limit = >> 0xffffffc007880058) >> [21148.003673] Call trace: >> [21148.006117] [<ffffffc000361fa8>] sysfs_create_dir+0x34/0xd4 >> [21148.011663] [<ffffffc000456f58>] kobject_add_internal+0x11c/0x24c >> [21148.017737] [<ffffffc0004573ec>] kobject_add+0xc8/0xe4 >> [21148.022863] [<ffffffc00060a410>] device_add+0x11c/0x600 >> [21148.028069] [<ffffffc00061c390>] _request_firmware+0x8b4/0xa80 >> [21148.033886] [<ffffffc00061c58c>] request_firmware+0x30/0x3c >> [21148.039439] [<ffffffc00061c5ec>] cache_firmware+0x54/0xb0 >> [21148.044822] [<ffffffc00061c65c>] __async_dev_cache_fw_image+0x14/0x54 >> [21148.051249] [<ffffffc000243794>] async_run_entry_fn+0x6c/0x12c >> [21148.057064] [<ffffffc000237e88>] process_one_work+0x264/0x3dc >> [21148.062792] [<ffffffc0002392c0>] worker_thread+0x1f0/0x340 >> [21148.068263] [<ffffffc00023e500>] kthread+0xac/0xb8 >> [21148.073039] Code: b5000094 14000026 d000ac54 9125a294 (7940c281) >> [21148.080312] ---[ end trace dabc98ea48b8ba5a ]--- >> >> Below is the callstack where the parent is freed from the slub_debug info. >> 0xFFFFFFC000608D74 \\vmlinux\base/core\class_dir_release+0x0C >> 0xFFFFFFC0002FE6BC \\vmlinux\slub\free_debug_processing\fail+0x114 >> 0xFFFFFFC0002FE7A0 \\vmlinux\slub\__slab_free+0x44 >> 0xFFFFFFC0002FEE60 \\vmlinux\slub\kfree+0x1F8 >> 0xFFFFFFC000608D70 \\vmlinux\base/core\class_dir_release+0x8 >> 0xFFFFFFC000456D70 \\vmlinux\kobject\kobject_release+0x134 >> 0xFFFFFFC000456E10 \\vmlinux\kobject\kobject_put+0x58 >> 0xFFFFFFC000456C28 \\vmlinux\kobject\kobject_del+0x64 >> 0xFFFFFFC000609D78 \\vmlinux\base/core\device_del+0x150 >> 0xFFFFFFC00061C524 \\vmlinux\firmware_class\_request_firmware\out+0x71C >> 0xFFFFFFC00061C58C \\vmlinux\firmware_class\request_firmware+0x30 >> 0xFFFFFFC00061C5EC \\vmlinux\firmware_class\cache_firmware+0x54 >> 0xFFFFFFC00061C65C \\vmlinux\firmware_class\__async_dev_cache_fw_image+0x14 >> 0xFFFFFFC000243794 \\vmlinux\async\async_run_entry_fn+0x6C >> 0xFFFFFFC000237E88 \\vmlinux\workqueue\process_one_work+0x264 >> 0xFFFFFFC0002392C0 \\vmlinux\workqueue\worker_thread\recheck+0x1A0 >> 0xFFFFFFC00023E500 \\vmlinux\kthread\kthread+0xAC >> >> For similar type of issue, I see there is an existing fix: "sysfs: driver >> core: Fix glue dir race condition by gdp_mutex". >> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/base/core.c?id=e4a60d139060975eb956717e4f63ae348d4d8cc5 >> >> Still I am able to reproduce the issue. I have verified this race on >> kernels: 3.10, 3.18 and 4.4. >> >> I followed below procedure to reproduce the issue: >> 1. Enable "CONFIG_DEBUG_KOBJECT_RELEASE" >> 2. Use test_firmware, modified the test for async calls >> 3. Replace WARN_ON with BUG_ON in kref_get(). >> I ran below script from shell: >> count=0 >> while [ 1 ] >> do >> echo 3 > /sys/devices/virtual/misc/test_firmware/trigger_ufw >> if [ $? -ne 0 ]; then >> print "Exiting.. " >> exit 1 >> fi >> count=$(($count +1)) >> echo "count $count" >> done >> >> With below patch, I could reproduce the issue in the 2nd iteration. > > I will run your test script to see if there is new findings. > >> >> -------------------------8<---------------------------------------------- >> >> To reproduce the race condition, use firmware test and >> CONFIG_DEBUG_KOBJECT_RELEASE=y, which delays kobject >> releases. So that we can catch the bug easily. >> >> Enabled below flags in kernel config file: >> >> +CONFIG_DEBUG_KOBJECT_RELEASE=y >> +CONFIG_TEST_FIRMWARE=y > > Looks I can't reproduce the issue with your approach. > >> >> >> diff --git a/include/linux/kref.h b/include/linux/kref.h >> index 484604d..727fb24 100644 >> --- a/include/linux/kref.h >> +++ b/include/linux/kref.h >> @@ -44,7 +44,7 @@ static inline void kref_get(struct kref *kref) >> * condition when this kref is freeing by some other thread right >> now. >> * In this case one should use kref_get_unless_zero() >> */ >> - WARN_ON_ONCE(atomic_inc_return(&kref->refcount) < 2); >> + BUG_ON(atomic_inc_return(&kref->refcount) < 2); >> } >> >> /** >> diff --git a/lib/test_firmware.c b/lib/test_firmware.c >> old mode 100644 >> new mode 100755 >> index 86374c1..14c9598 >> --- a/lib/test_firmware.c >> +++ b/lib/test_firmware.c >> @@ -18,6 +18,7 @@ >> #include <linux/miscdevice.h> >> #include <linux/slab.h> >> #include <linux/uaccess.h> >> +#include <linux/async.h> >> >> + >> +static ASYNC_DOMAIN_EXCLUSIVE(ufw_domain); >> + >> +static void __async_ufw_work(void *name, >> + async_cookie_t cookie) >> +{ >> + const struct firmware *fw; >> + const char *fw_name = name; >> + int ret; >> + >> + pr_err("requesting %s\n", fw_name); >> + >> + ret = request_firmware(&fw, fw_name, NULL); >> + pr_err("loaded: %zu\n", fw ? fw->size : 0); >> + if (!ret) >> + kfree(fw); >> + >> +} >> + >> +static char *name[] = { "fw1.bin", "fw2.bin"}; >> + >> +static ssize_t trigger_ufw_store(struct device *dev, >> + struct device_attribute *attr, >> + const char *buf, size_t count) >> +{ >> + int rc, i, c=0; >> + u32 start; >> + char *fw_name; >> + >> + rc = kstrtou32(buf, 0, &start); >> + if (rc){ >> + pr_err("Invalid option\n"); >> + return rc; >> + } >> + pr_err(" no of iterations %d\n", start); >> + >> + for (i=0; i< start; i++) >> + { >> + if (c >= 2) >> + c = 0; >> + fw_name = name[c++]; >> + async_schedule_domain(__async_ufw_work, (void *)fw_name, >> &ufw_domain); >> + } >> + async_synchronize_full_domain(&ufw_domain); >> + return count; >> +} >> >> static DEVICE_ATTR_WO(trigger_request); >> +static DEVICE_ATTR_WO(trigger_ufw); >> >> static int __init test_firmware_init(void) >> { >> @@ -92,6 +140,12 @@ static int __init test_firmware_init(void) >> goto dereg; >> } >> >> + rc = device_create_file(test_fw_misc_device.this_device, >> + &dev_attr_trigger_ufw); >> + if (rc) { >> + pr_err("could not create sysfs interface: %d\n", rc); >> + goto dereg; >> + } >> pr_warn("interface ready\n"); >> >> return 0; >> @@ -106,7 +160,9 @@ static void __exit test_firmware_exit(void) >> { >> release_firmware(test_firmware); >> device_remove_file(test_fw_misc_device.this_device, >> - &dev_attr_trigger_request); >> + &dev_attr_trigger_request); >> + device_remove_file(test_fw_misc_device.this_device, >> + &dev_attr_trigger_ufw); >> misc_deregister(&test_fw_misc_device); >> pr_warn("removed interface\n"); >> } >> >> ----------------------------8<-------------------------------------------- >> >> Thanks and regards, >> Chandrasekhar L. >> -- >> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of >> Code Aurora Forum, >> a Linux Foundation Collaborative Project. -- To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html