Re: [PATCH block/for-linus] blkcg: fix use-after-free in __blkg_release_rcu() by making blkcg_gq refcnt an atomic_t

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 19 Jun 2014 17:42:57 -0400
Tejun Heo <tj@xxxxxxxxxx> wrote:

> Hello,
> 
> So, this patch should do.  Joe, Vivek, can one of you guys please
> verify that the oops goes away with this patch?

Tejun -- thanks for fixing!  Looks good here, no issues running w/slub
debug enabled.

-- Joe


> Jens, the original thread can be read at
> 
>   http://thread.gmane.org/gmane.linux.kernel/1720729
> 
> The fix converts blkg->refcnt from int to atomic_t.  It does some
> overhead but it should be minute compared to everything else which is
> going on and the involved cacheline bouncing, so I think it's highly
> unlikely to cause any noticeable difference.  Also, the refcnt in
> question should be converted to a perpcu_ref for blk-mq anyway, so the
> atomic_t is likely to go away pretty soon anyway.
> 
> Thanks.
> 
> ------- 8< -------
> __blkg_release_rcu() may be invoked after the associated request_queue
> is released with a RCU grace period inbetween.  As such, the function
> and callbacks invoked from it must not dereference the associated
> request_queue.  This is clearly indicated in the comment above the
> function.
> 
> Unfortunately, while trying to fix a different issue, 2a4fd070ee85
> ("blkcg: move bulk of blkcg_gq release operations to the RCU
> callback") ignored this and added [un]locking of @blkg->q->queue_lock
> to __blkg_release_rcu().  This of course can cause oops as the
> request_queue may be long gone by the time this code gets executed.
> 
>   general protection fault: 0000 [#1] SMP
>   CPU: 21 PID: 30 Comm: rcuos/21 Not tainted 3.15.0 #1
>   Hardware name: Stratus ftServer 6400/G7LAZ, BIOS BIOS Version 6.3:57 12/25/2013
>   task: ffff880854021de0 ti: ffff88085403c000 task.ti: ffff88085403c000
>   RIP: 0010:[<ffffffff8162e9e5>]  [<ffffffff8162e9e5>] _raw_spin_lock_irq+0x15/0x60
>   RSP: 0018:ffff88085403fdf0  EFLAGS: 00010086
>   RAX: 0000000000020000 RBX: 0000000000000010 RCX: 0000000000000000
>   RDX: 000060ef80008248 RSI: 0000000000000286 RDI: 6b6b6b6b6b6b6b6b
>   RBP: ffff88085403fdf0 R08: 0000000000000286 R09: 0000000000009f39
>   R10: 0000000000020001 R11: 0000000000020001 R12: ffff88103c17a130
>   R13: ffff88103c17a080 R14: 0000000000000000 R15: 0000000000000000
>   FS:  0000000000000000(0000) GS:ffff88107fca0000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 00000000006e5ab8 CR3: 000000000193d000 CR4: 00000000000407e0
>   Stack:
>    ffff88085403fe18 ffffffff812cbfc2 ffff88103c17a130 0000000000000000
>    ffff88103c17a130 ffff88085403fec0 ffffffff810d1d28 ffff880854021de0
>    ffff880854021de0 ffff88107fcaec58 ffff88085403fe80 ffff88107fcaec30
>   Call Trace:
>    [<ffffffff812cbfc2>] __blkg_release_rcu+0x72/0x150
>    [<ffffffff810d1d28>] rcu_nocb_kthread+0x1e8/0x300
>    [<ffffffff81091d81>] kthread+0xe1/0x100
>    [<ffffffff8163813c>] ret_from_fork+0x7c/0xb0
>   Code: ff 47 04 48 8b 7d 08 be 00 02 00 00 e8 55 48 a4 ff 5d c3 0f 1f 00 66 66 66 66 90 55 48 89 e5
>   +fa 66 66 90 66 66 90 b8 00 00 02 00 <f0> 0f c1 07 89 c2 c1 ea 10 66 39 c2 75 02 5d c3 83 e2 fe 0f
>   +b7
>   RIP  [<ffffffff8162e9e5>] _raw_spin_lock_irq+0x15/0x60
>    RSP <ffff88085403fdf0>
> 
> The request_queue locking was added because blkcg_gq->refcnt is an int
> protected with the queue lock and __blkg_release_rcu() needs to put
> the parent.  Let's fix it by making blkcg_gq->refcnt an atomic_t and
> dropping queue locking in the function.
> 
> Given the general heavy weight of the current request_queue and blkcg
> operations, this is unlikely to cause any noticeable overhead.
> Moreover, blkcg_gq->refcnt is likely to be converted to percpu_ref in
> the near future, so whatever (most likely negligible) overhead it may
> add is temporary.
> 
> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
> Reported-by: Joe Lawrence <joe.lawrence@xxxxxxxxxxx>
> Cc: Vivek Goyal <vgoyal@xxxxxxxxxx>
> Link: http://lkml.kernel.org/g/alpine.DEB.2.02.1406081816540.17948@xxxxxxxxxxxxxxxxxxxxxxxxxxxx
> Cc: stable@xxxxxxxxxxxxxxx
> ---
>  block/blk-cgroup.c |    7 ++-----
>  block/blk-cgroup.h |   17 +++++++----------
>  2 files changed, 9 insertions(+), 15 deletions(-)
> 
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -80,7 +80,7 @@ static struct blkcg_gq *blkg_alloc(struc
>  	blkg->q = q;
>  	INIT_LIST_HEAD(&blkg->q_node);
>  	blkg->blkcg = blkcg;
> -	blkg->refcnt = 1;
> +	atomic_set(&blkg->refcnt, 1);
>  
>  	/* root blkg uses @q->root_rl, init rl only for !root blkgs */
>  	if (blkcg != &blkcg_root) {
> @@ -399,11 +399,8 @@ void __blkg_release_rcu(struct rcu_head
>  
>  	/* release the blkcg and parent blkg refs this blkg has been holding */
>  	css_put(&blkg->blkcg->css);
> -	if (blkg->parent) {
> -		spin_lock_irq(blkg->q->queue_lock);
> +	if (blkg->parent)
>  		blkg_put(blkg->parent);
> -		spin_unlock_irq(blkg->q->queue_lock);
> -	}
>  
>  	blkg_free(blkg);
>  }
> --- a/block/blk-cgroup.h
> +++ b/block/blk-cgroup.h
> @@ -18,6 +18,7 @@
>  #include <linux/seq_file.h>
>  #include <linux/radix-tree.h>
>  #include <linux/blkdev.h>
> +#include <linux/atomic.h>
>  
>  /* Max limits for throttle policy */
>  #define THROTL_IOPS_MAX		UINT_MAX
> @@ -104,7 +105,7 @@ struct blkcg_gq {
>  	struct request_list		rl;
>  
>  	/* reference count */
> -	int				refcnt;
> +	atomic_t			refcnt;
>  
>  	/* is this blkg online? protected by both blkcg and q locks */
>  	bool				online;
> @@ -257,13 +258,12 @@ static inline int blkg_path(struct blkcg
>   * blkg_get - get a blkg reference
>   * @blkg: blkg to get
>   *
> - * The caller should be holding queue_lock and an existing reference.
> + * The caller should be holding an existing reference.
>   */
>  static inline void blkg_get(struct blkcg_gq *blkg)
>  {
> -	lockdep_assert_held(blkg->q->queue_lock);
> -	WARN_ON_ONCE(!blkg->refcnt);
> -	blkg->refcnt++;
> +	WARN_ON_ONCE(atomic_read(&blkg->refcnt) <= 0);
> +	atomic_inc(&blkg->refcnt);
>  }
>  
>  void __blkg_release_rcu(struct rcu_head *rcu);
> @@ -271,14 +271,11 @@ void __blkg_release_rcu(struct rcu_head
>  /**
>   * blkg_put - put a blkg reference
>   * @blkg: blkg to put
> - *
> - * The caller should be holding queue_lock.
>   */
>  static inline void blkg_put(struct blkcg_gq *blkg)
>  {
> -	lockdep_assert_held(blkg->q->queue_lock);
> -	WARN_ON_ONCE(blkg->refcnt <= 0);
> -	if (!--blkg->refcnt)
> +	WARN_ON_ONCE(atomic_read(&blkg->refcnt) <= 0);
> +	if (atomic_dec_and_test(&blkg->refcnt))
>  		call_rcu(&blkg->rcu_head, __blkg_release_rcu);
>  }
>  

--
To unsubscribe from this list: send the line "unsubscribe cgroups" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [Monitors]

  Powered by Linux