Patch "bpf: Disable preemption when increasing per-cpu map_locked" has been added to the 6.0-stable tree

This is a note to let you know that I've just added the patch titled

    bpf: Disable preemption when increasing per-cpu map_locked

to the 6.0-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     bpf-disable-preemption-when-increasing-per-cpu-map_l.patch
and it can be found in the queue-6.0 subdirectory.

If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 6bfab3bfec3c315cf1aa2520d8efa9cddcb4dfca
Author: Hou Tao <houtao1@xxxxxxxxxx>
Date:   Wed Aug 31 12:26:27 2022 +0800

    bpf: Disable preemption when increasing per-cpu map_locked
    
    [ Upstream commit 2775da21628738ce073a3a6a806adcbaada0f091 ]
    
    Per-cpu htab->map_locked is used to prohibit concurrent accesses from
    both NMI and non-NMI contexts. But since commit 74d862b682f5
    ("sched: Make migrate_disable/enable() independent of RT"), code running
    under migrate_disable() is also preemptible in the CONFIG_PREEMPT case,
    so map_locked now also rejects concurrent updates from normal contexts
    (e.g. userspace processes) unexpectedly, as shown below:
    
    process A                      process B
    
    htab_map_update_elem()
      htab_lock_bucket()
        migrate_disable()
        /* return 1 */
        __this_cpu_inc_return()
        /* preempted by B */
    
                                   htab_map_update_elem()
                                     /* the same bucket as A */
                                     htab_lock_bucket()
                                       migrate_disable()
                                       /* return 2, so lock fails */
                                       __this_cpu_inc_return()
                                       return -EBUSY
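
    To make the failing check concrete, here is a small standalone userspace
    C model of the map_locked logic (illustrative only; the names below are
    made up, and the plain int merely stands in for this CPU's per-cpu
    counter as seen by __this_cpu_inc_return()):

    #include <stdio.h>
    #include <errno.h>

    /* Stand-in for this CPU's htab->map_locked[hash] counter. */
    static int map_locked;

    /* Mimics htab_lock_bucket()'s check: bump the counter, require it to be 1. */
    static int model_lock_bucket(void)
    {
            if (++map_locked != 1) {
                    --map_locked;
                    return -EBUSY;
            }
            return 0;
    }

    static void model_unlock_bucket(void)
    {
            --map_locked;
    }

    int main(void)
    {
            int ret_a, ret_b;

            /* Process A: the counter becomes 1, so the "lock" succeeds. */
            ret_a = model_lock_bucket();
            /* A is preempted; B runs on the same CPU against the same bucket:
             * the counter would become 2, so B gets -EBUSY.
             */
            ret_b = model_lock_bucket();

            printf("A: %d, B: %d\n", ret_a, ret_b);

            model_unlock_bucket();
            return 0;
    }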
    
    A fix that seems feasible is to use in_nmi() in htab_lock_bucket() and
    check the value of map_locked only in NMI context. But that would
    re-introduce a deadlock on the bucket lock if htab_lock_bucket() is
    re-entered through a non-tracing program (e.g. an fentry program).
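
    A rough sketch of that rejected variant (hypothetical kernel-style code,
    not part of this patch and not compilable on its own) would only honor
    the counter in NMI context:

        int val = __this_cpu_inc_return(*(htab->map_locked[hash]));

        /* Only NMI context is rejected on re-entry; a non-tracing BPF
         * program (e.g. fentry) re-entering from process context falls
         * through and deadlocks on the bucket lock below.
         */
        if (in_nmi() && unlikely(val != 1)) {
                __this_cpu_dec(*(htab->map_locked[hash]));
                migrate_enable();
                return -EBUSY;
        }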
    
    One cannot simply use preempt_disable() to fix this issue either: when
    htab_use_raw_lock() is false, the bucket lock is a spinlock that can
    sleep, and a sleepable lock cannot be taken with preemption disabled.
    
    Therefore, disable preemption when the bucket lock is a raw spinlock,
    keep using migrate_disable() when it is a sleepable spinlock, and defer
    fixing the remaining concurrent updates until the kernel has its own
    BPF memory allocator.
    
    Fixes: 74d862b682f5 ("sched: Make migrate_disable/enable() independent of RT")
    Reviewed-by: Hao Luo <haoluo@xxxxxxxxxx>
    Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx>
    Link: https://lore.kernel.org/r/20220831042629.130006-2-houtao@xxxxxxxxxxxxxxx
    Signed-off-by: Martin KaFai Lau <martin.lau@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 6c530a5e560a..ad09da139589 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -162,17 +162,25 @@ static inline int htab_lock_bucket(const struct bpf_htab *htab,
 				   unsigned long *pflags)
 {
 	unsigned long flags;
+	bool use_raw_lock;
 
 	hash = hash & HASHTAB_MAP_LOCK_MASK;
 
-	migrate_disable();
+	use_raw_lock = htab_use_raw_lock(htab);
+	if (use_raw_lock)
+		preempt_disable();
+	else
+		migrate_disable();
 	if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
 		__this_cpu_dec(*(htab->map_locked[hash]));
-		migrate_enable();
+		if (use_raw_lock)
+			preempt_enable();
+		else
+			migrate_enable();
 		return -EBUSY;
 	}
 
-	if (htab_use_raw_lock(htab))
+	if (use_raw_lock)
 		raw_spin_lock_irqsave(&b->raw_lock, flags);
 	else
 		spin_lock_irqsave(&b->lock, flags);
@@ -185,13 +193,18 @@ static inline void htab_unlock_bucket(const struct bpf_htab *htab,
 				      struct bucket *b, u32 hash,
 				      unsigned long flags)
 {
+	bool use_raw_lock = htab_use_raw_lock(htab);
+
 	hash = hash & HASHTAB_MAP_LOCK_MASK;
-	if (htab_use_raw_lock(htab))
+	if (use_raw_lock)
 		raw_spin_unlock_irqrestore(&b->raw_lock, flags);
 	else
 		spin_unlock_irqrestore(&b->lock, flags);
 	__this_cpu_dec(*(htab->map_locked[hash]));
-	migrate_enable();
+	if (use_raw_lock)
+		preempt_enable();
+	else
+		migrate_enable();
 }
 
 static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node);


