Re: [PATCH cgroup/for-3.15-fixes 2/2] cgroup_freezer: replace freezer->lock with freezer_mutex

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> After 96d365e0b86e ("cgroup: make css_set_lock a rwsem and rename it
> to css_set_rwsem"), css task iterators requires sleepable context as
> it may block on css_set_rwsem.  I missed that cgroup_freezer was
> iterating tasks under IRQ-safe spinlock freezer->lock.  This leads to
> errors like the following on freezer state reads and transitions.
> 
>   BUG: sleeping function called from invalid context at /work
>  /os/work/kernel/locking/rwsem.c:20
>   in_atomic(): 0, irqs_disabled(): 0, pid: 462, name: bash
>   5 locks held by bash/462:
>    #0:  (sb_writers#7){.+.+.+}, at: [<ffffffff811f0843>] vfs_write+0x1a3/0x1c0
>    #1:  (&of->mutex){+.+.+.}, at: [<ffffffff8126d78b>] kernfs_fop_write+0xbb/0x170
>    #2:  (s_active#70){.+.+.+}, at: [<ffffffff8126d793>] kernfs_fop_write+0xc3/0x170
>    #3:  (freezer_mutex){+.+...}, at: [<ffffffff81135981>] freezer_write+0x61/0x1e0
>    #4:  (rcu_read_lock){......}, at: [<ffffffff81135973>] freezer_write+0x53/0x1e0
>   Preemption disabled at:[<ffffffff81104404>] console_unlock+0x1e4/0x460
> 
>   CPU: 3 PID: 462 Comm: bash Not tainted 3.15.0-rc1-work+ #10
>   Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>    ffff88000916a6d0 ffff88000e0a3da0 ffffffff81cf8c96 0000000000000000
>    ffff88000e0a3dc8 ffffffff810cf4f2 ffffffff82388040 ffff880013aaf740
>    0000000000000002 ffff88000e0a3de8 ffffffff81d05974 0000000000000246
>   Call Trace:
>    [<ffffffff81cf8c96>] dump_stack+0x4e/0x7a
>    [<ffffffff810cf4f2>] __might_sleep+0x162/0x260
>    [<ffffffff81d05974>] down_read+0x24/0x60
>    [<ffffffff81133e87>] css_task_iter_start+0x27/0x70
>    [<ffffffff8113584d>] freezer_apply_state+0x5d/0x130
>    [<ffffffff81135a16>] freezer_write+0xf6/0x1e0
>    [<ffffffff8112eb88>] cgroup_file_write+0xd8/0x230
>    [<ffffffff8126d7b7>] kernfs_fop_write+0xe7/0x170
>    [<ffffffff811f0756>] vfs_write+0xb6/0x1c0
>    [<ffffffff811f121d>] SyS_write+0x4d/0xc0
>    [<ffffffff81d08292>] system_call_fastpath+0x16/0x1b
> 
> freezer->lock used to be used in hot paths but that time is long gone
> and there's no reason for the lock to be IRQ-safe spinlock or even
> per-cgroup.  In fact, given the fact that a cgroup may contain large
> number of tasks, it's not a good idea to iterate over them while
> holding IRQ-safe spinlock.
> 
> Let's simplify locking by replacing per-cgroup freezer->lock with
> global freezer_mutex.  This also makes the comments explaining the
> intricacies of policy inheritance and the locking around it as the
> states are protected by a common mutex.
> 
> The conversion is mostly straight-forward.  The followings are worth
> mentioning.
> 
> * freezer_css_online() no longer needs double locking.
> 
> * freezer_attach() now performs propagation simply while holding
>   freezer_mutex.  update_if_frozen() race no longer exists and the
>   comment is removed.
> 
> * freezer_fork() now tests whether the task is in root cgroup using
>   the new task_css_is_root() without doing rcu_read_lock/unlock().  If
>   not, it grabs freezer_mutex and performs the operation.
> 
> * freezer_read() and freezer_change_state() grab freezer_mutex across
>   the whole operation and pin the css while iterating so that each
>   descendant processing happens in sleepable context.
> 
> Fixes: 96d365e0b86e ("cgroup: make css_set_lock a rwsem and rename it to css_set_rwsem")
> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>

Acked-by: Li Zefan <lizefan@xxxxxxxxxx>

--
To unsubscribe from this list: send the line "unsubscribe cgroups" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [Monitors]

  Powered by Linux