[PATCH bpf-next 0/7] Free htab element out of bucket lock

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Hou Tao <houtao1@xxxxxxxxxx>

Hi,

The patch set continues the previous work [1] to move all the freeings
of htab elements out of bucket lock. One motivation for the patch set is
the locking problem reported by Sebastian [2]: the freeing of bpf_timer
under PREEMPT_RT may acquire a spin-lock (namely softirq_expiry_lock).
However the freeing procedure for htab element has already held a
raw-spin-lock (namely bucket lock), and it will trigger the warning:
"BUG: scheduling while atomic" as demonstrated by the selftests patch.
Another motivation is to reduce the locked scope of bucket lock.

The patch set is structured as follows:

* Patch #1 moves the element freeing out of lock for
  htab_lru_map_delete_node()
* Patch #2~#3 move the element freeing out of lock for
  __htab_map_lookup_and_delete_elem()
* Patch #4~#6 move the element freeing out of lock for
  htab_map_update_elem()
* Patch #7 adds a selftest for the locking problem

The changes for htab_map_update_elem() require some explanation. The
reason that the previous work [1] can't move the element freeing out of
the bucket lock for preallocated hash table is due to ->extra_elems
optimization. When alloc_htab_elem() returns, the existed-old element
has already been stashed in per-cpu ->extra_elems. To handle that, patch
#5~#7 break the reuse of ->extra_elems and the refill of ->extra_elems
into two independent steps, do resue with bucket lock being held and do
refill after unlocking the bucket lock. The downside is that concurrent
updates on the same CPU may need to pop free element from per-cpu list
instead of reusing ->extra_elems directly, but I think such case will be
rare.

Please see individual patches for more details. Comments are always
welcome.

[1]: https://lore.kernel.org/bpf/20241106063542.357743-1-houtao@xxxxxxxxxxxxxxx
[2]: https://lore.kernel.org/bpf/20241106084527.4gPrMnHt@xxxxxxxxxxxxx

Hou Tao (7):
  bpf: Free special fields after unlock in htab_lru_map_delete_node()
  bpf: Bail out early in __htab_map_lookup_and_delete_elem()
  bpf: Free element after unlock in __htab_map_lookup_and_delete_elem()
  bpf: Support refilling extra_elems in free_htab_elem()
  bpf: Factor out the element allocation for pre-allocated htab
  bpf: Free element after unlock for pre-allocated htab
  selftests/bpf: Add test case for the freeing of bpf_timer

 kernel/bpf/hashtab.c                          | 170 ++++++++++--------
 .../selftests/bpf/prog_tests/free_timer.c     | 165 +++++++++++++++++
 .../testing/selftests/bpf/progs/free_timer.c  |  71 ++++++++
 3 files changed, 332 insertions(+), 74 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/free_timer.c
 create mode 100644 tools/testing/selftests/bpf/progs/free_timer.c

-- 
2.29.2





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux