> Sorry, I know nothing about the ThreadSanitizer and related annotation,
> could you provide some information about it, thanks.

Documentation/dev-tools/kcsan.rst

> > Would be good to have some commentary why doing so
> > many write operations with merely a rcu_read_lock as protection is safe.
> > It might be safer to put some write type operations under a real lock.
> > Also it is unclear how the RCU grace period for utasks is enforced.
>
> You are right, but I think using atomic refcount routine might be a more
> suitable approach for this scenario. The slot_ret field of utask instance

Does it really all need to be lockless? Perhaps you could make only the
common case lockless and take a lock only when the list overflows. That
would avoid a lot of races and might be good enough for performance.

If you really want a complex lockless scheme, you need very clear
documentation in comments and commit logs at least.

Also there should be a test case that stresses the various cases.

I would just use a lock.

> is used to track the status of insn_slot. slot_ret supports three values.
> A value of 2 means the utask associated insn_slot is currently in use by
> uprobe. A value of 1 means the slot is not being used by uprobe. A value
> of 0 means the slot has been reclaimed. So in some term, the atomic refcount
> routine test_and_put_task_slot() also avoid the racing when writing to
> the utask instance, providing additional status information about insn_slot.
>
> BTW, You reminded me that since it might recycle the slot after deleting the
> utask from the garbage collection list, so it's necessary to use
> test_and_put_task_slot() to avoid the racing on the stale utask. The correct
> code might be something like this:
>
> @@ -1771,16 +1783,16 @@ static void xol_free_insn_slot(struct task_struct *tsk)
>
> 	spin_lock_irqsave(&area->list_lock, flags);
> 	list_del_rcu(&tsk->utask->gc);
> +	/* Ensure the slot is not in use or reclaimed on other CPU */
> +	if (test_and_put_task_slot(tsk->utask)) {
> +		clear_bit(tsk->utask->insn_slot, area->bitmap);
> +		atomic_dec(&area->slot_count);
> +		tsk->utask->insn_slot = UINSNS_PER_PAGE;
> +		get_task_slot(tsk->utask);
> +	}

I would have expected you to add a

	if (racing) bail out, assume the other CPU will do the work

type check, but that doesn't seem to be what the code is doing.

-Andi
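
For illustration only, the "if (racing) bail out" check mentioned above could be sketched roughly as follows. This is a userspace approximation using C11 atomics, not the kernel's atomic_t API and not code from the patch. The names demo_utask, SLOT_*, try_reclaim_slot() and free_slot() are hypothetical; the three states simply mirror the 0/1/2 values described in the quoted text.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical states, mirroring the three values described in the thread. */
enum { SLOT_RECLAIMED = 0, SLOT_IDLE = 1, SLOT_IN_USE = 2 };

/* Hypothetical stand-in for the per-task state; not the kernel's struct. */
struct demo_utask {
	atomic_int slot_ref;	/* one of the SLOT_* states */
	int insn_slot;		/* slot index, -1 once released */
};

/*
 * Try to move the slot from IDLE to RECLAIMED. Only one caller can win
 * the compare-and-swap; any other caller observes a different state
 * (in use, or already reclaimed) and gets false.
 */
static bool try_reclaim_slot(struct demo_utask *t)
{
	int expected = SLOT_IDLE;

	return atomic_compare_exchange_strong(&t->slot_ref, &expected,
					      SLOT_RECLAIMED);
}

/* Release the slot, or bail out if some other context owns the transition. */
static bool free_slot(struct demo_utask *t, int *slots_in_use)
{
	if (!try_reclaim_slot(t))
		return false;	/* racing: assume the other CPU does the work */

	t->insn_slot = -1;
	(*slots_in_use)--;
	return true;
}
```

A caller that gets false from free_slot() touches nothing else, so the bookkeeping is updated only by whichever context won the state transition.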