Bug report: Unable to handle kernel paging request at virtual address 00000000c0000010

Hi everybody,

There is a bug in the mainline code (https://kernel.googlesource.com/pub/scm/linux/kernel/git/stable/linux.git -b master).

The call trace of the bug is as follows:
refcount_t: addition on 0; use-after-free.
  WARNING: CPU: 1 PID: 342 at lib/refcount.c:25 refcount_warn_saturate+0xa0/0x148
  Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.79 12/28/2022
  Workqueue: events cpuset_hotplug_workfn
  Call trace:
   refcount_warn_saturate+0xa0/0x148
   __refcount_add.constprop.0+0x5c/0x80
   css_task_iter_advance_css_set+0xd8/0x210
   css_task_iter_advance+0xa8/0x120
   css_task_iter_next+0x94/0x158
   update_tasks_root_domain+0x58/0x98
   rebuild_root_domains+0xa0/0x1b0
   rebuild_sched_domains_locked+0x144/0x188
   cpuset_hotplug_workfn+0x138/0x5a0
   process_one_work+0x1e8/0x448
   worker_thread+0x228/0x3e0
   kthread+0xe0/0xf0
   ret_from_fork+0x10/0x20
  ---[ end trace 0000000000000000 ]---
  ------------[ cut here ]------------
  refcount_t: underflow; use-after-free.
  WARNING: CPU: 1 PID: 342 at lib/refcount.c:28 refcount_warn_saturate+0xf4/0x148
  Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.79 12/28/2022
  Workqueue: events cpuset_hotplug_workfn
  Call trace:
   refcount_warn_saturate+0xf4/0x148
   put_css_set_locked+0x80/0x98
   css_task_iter_end+0x70/0x160
   update_tasks_root_domain+0x68/0x98
   rebuild_root_domains+0xa0/0x1b0
   rebuild_sched_domains_locked+0x144/0x188
   cpuset_hotplug_workfn+0x138/0x5a0
   process_one_work+0x1e8/0x448
   worker_thread+0x228/0x3e0
   kthread+0xe0/0xf0
   ret_from_fork+0x10/0x20
  ---[ end trace 0000000000000000 ]---
  process 10324 (cpuhotplug_do_s) no longer affine to cpu1
  psci: CPU1 killed (polled 0 ms)
  Unable to handle kernel paging request at virtual address 00000000c0000010
  Internal error: Oops: 0000000096000004 [#1] SMP
  Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.79 12/28/2022
  Workqueue: cgroup_destroy css_free_rwork_fn
  Call trace:
   cgroup_apply_control_disable+0xb0/0x1f8
   rebind_subsystems+0x20c/0x548
   cgroup_destroy_root+0x64/0x240
   css_free_rwork_fn+0x18c/0x1a8
   process_one_work+0x1e8/0x448
   worker_thread+0x178/0x3e0
   kthread+0xe0/0xf0
   ret_from_fork+0x10/0x20
  Code: 91012842 8b020f62 f9400453 b4000293 (f9400a60)
  SMP: stopping secondary CPUs
  Starting crashdump kernel...
This bug occurs in concurrent scenarios. During CPU hotplug, update_tasks_root_domain()
iterates over all tasks in the cpuset (root domain). The code is as follows:
static void update_tasks_root_domain(struct cpuset *cs)
{
         struct css_task_iter it;
         struct task_struct *task;

         css_task_iter_start(&cs->css, 0, &it); // hold css_set_lock in css_task_iter_start
                 ... //nolock time1: don't hold css_set_lock
         while ((task = css_task_iter_next(&it))) // hold css_set_lock in css_task_iter_next
                 dl_add_task_root_domain(task); //nolock time2: don't hold css_set_lock

         css_task_iter_end(&it);
}
css_task_iter traverses the cgroup's e_csets list, and it->cset_head records the head of
the e_csets list that is being traversed. css_set_lock is held inside css_task_iter_start()
and inside css_task_iter_next(), but it is not held between those calls, as marked by
"nolock time1" and "nolock time2" in the code comments above.

While update_tasks_root_domain() is not holding css_set_lock, it->cur_cset (the current css_set)
can be moved to another list, for example by:
int rebind_subsystems(struct cgroup_root *dst_root, u16 ss_mask)
{
     ...
     spin_lock_irq(&css_set_lock);
     hash_for_each(css_set_table, i, cset, hlist)
         list_move_tail(&cset->e_cset_node[ss->id], &dcgrp->e_csets[ss->id]);
     spin_unlock_irq(&css_set_lock);
     ...
}
When that happens, the bug is triggered. The sequence is as follows:

#1> In css_task_iter_start(), it->cset_head = &css->cgroup->e_csets[css->ss->id], i.e. the head of list A (headA).
#2> In css_task_iter_next(&it), it->cur_cset = nodeA, and a task is returned.
#3> nodeA is moved to list B, for example by list_move_tail(nodeA, listB) in rebind_subsystems(), so nodeA->next == headB.
#4> In the next css_task_iter_next() call, new = nodeA->next == headB.
#5> headB is not a valid css_set, but new != it->cset_head (headA), so headB is treated as a valid css_set.
#6> get_css_set(headB) is called, which triggers the refcount warning.

The following change increases the probability of triggering this bug:
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index e4ca2dd2b764..120e0c23517f 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -66,6 +66,7 @@
  #include <linux/mutex.h>
  #include <linux/cgroup.h>
  #include <linux/wait.h>
+#include <linux/delay.h>

  DEFINE_STATIC_KEY_FALSE(cpusets_pre_enable_key);
  DEFINE_STATIC_KEY_FALSE(cpusets_enabled_key);
@@ -1073,8 +1074,10 @@ static void update_tasks_root_domain(struct cpuset *cs)

         css_task_iter_start(&cs->css, 0, &it);

-       while ((task = css_task_iter_next(&it)))
+       while ((task = css_task_iter_next(&it))) {
+               udelay(1000 * 10);
                 dl_add_task_root_domain(task);
+       }

         css_task_iter_end(&it);
  }

We can trigger this bug with the LTP test cases (https://github.com/linux-test-project/ltp/blob/master/runtest/controllers):

step 1: create a process that runs the following test cases:
cpuhotplug02 cpuhotplug02.sh -c 1 -l 1
cpuhotplug03 cpuhotplug03.sh -c 1 -l 1
cpuhotplug04 cpuhotplug04.sh -l 1
cpuhotplug05 cpuhotplug05.sh -c 1 -l 1 -d /tmp
cpuhotplug06 cpuhotplug06.sh -c 1 -l 1
cpuhotplug07 cpuhotplug07.sh -c 1 -l 1 -d /usr/src/linux

step 2: create another process that runs the following test cases:
cpuset_base_ops cpuset_base_ops_testset.sh
cpuset_inherit cpuset_inherit_testset.sh
cpuset_exclusive cpuset_exclusive_test.sh
cpuset_hierarchy cpuset_hierarchy_test.sh
cpuset_syscall cpuset_syscall_testset.sh
cpuset_sched_domains cpuset_sched_domains_test.sh
cpuset_load_balance cpuset_load_balance_test.sh
cpuset_hotplug cpuset_hotplug_test.sh
cpuset_memory cpuset_memory_testset.sh

Looking forward to your reply.

Thanks.




