Re: [PATCH bpf 05/11] bpf: Add bpf_map_of_map_fd_{get,put}_ptr() helpers

Hi Paul,

On 11/10/2023 9:45 AM, Paul E. McKenney wrote:
> On Fri, Nov 10, 2023 at 09:06:56AM +0800, Hou Tao wrote:
>> Hi,
>>
>> On 11/10/2023 3:55 AM, Paul E. McKenney wrote:
>>> On Thu, Nov 09, 2023 at 07:55:50AM -0800, Alexei Starovoitov wrote:
>>>> On Wed, Nov 8, 2023 at 11:26 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>>>>> Hi,
>>>>>
>>>>> On 11/9/2023 2:36 PM, Martin KaFai Lau wrote:
>>>>>> On 11/7/23 6:06 AM, Hou Tao wrote:
>>>>>>> From: Hou Tao <houtao1@xxxxxxxxxx>
>>>>>>>
>>>>>>> bpf_map_of_map_fd_get_ptr() will convert the map fd to the pointer
>>>>>>> saved in map-in-map. bpf_map_of_map_fd_put_ptr() will release the
>>>>>>> pointer saved in map-in-map. These two helpers will be used by the
>>>>>>> following patches to fix the use-after-free problems for map-in-map.
>>>>>>>
>>>>>>> Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx>
>>>>>>> ---
>>>>>>>   kernel/bpf/map_in_map.c | 51 +++++++++++++++++++++++++++++++++++++++++
>>>>>>>   kernel/bpf/map_in_map.h | 11 +++++++--
>>>>>>>   2 files changed, 60 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>>
>>>>> SNIP
>>>>>>> +void bpf_map_of_map_fd_put_ptr(void *ptr, bool need_defer)
>>>>>>> +{
>>>>>>> +    struct bpf_inner_map_element *element = ptr;
>>>>>>> +
>>>>>>> +    /* Do bpf_map_put() after a RCU grace period and a tasks trace
>>>>>>> +     * RCU grace period, so it is certain that the bpf program which is
>>>>>>> +     * manipulating the map now has exited when bpf_map_put() is called.
>>>>>>> +     */
>>>>>>> +    if (need_defer)
>>>>>> "need_defer" should only happen from the syscall cmd? Instead of
>>>>>> adding rcu_head to each element, how about
>>>>>> "synchronize_rcu_mult(call_rcu, call_rcu_tasks)" here?
>>>>> No. I tried that method before, but it didn't work due to a dead-lock
>>>>> (I will mention that in the commit message in v2). The reason is that a
>>>>> bpf syscall program may also do a map update through the sys_bpf helper.
>>>>> Because a bpf syscall program runs in a sleepable context with
>>>>> rcu_read_lock_trace() held, calling synchronize_rcu_mult(call_rcu,
>>>>> call_rcu_tasks) there leads to a dead-lock.
>>>> Dead-lock? why?
>>>>
>>>> I think it's legal to do call_rcu_tasks_trace() while inside RCU CS
>>>> or RCU tasks trace CS.
>>> Just confirming that this is the case.  If invoking call_rcu_tasks_trace()
>>> under either rcu_read_lock() or rcu_read_lock_trace() deadlocks,
>>> then there is a bug that needs fixing.  ;-)
>> The dead-lock case is calling synchronize_rcu_mult(call_rcu,
>> call_rcu_tasks_trace) under rcu_read_lock_trace(), and I think that is
>> expected. Calling call_rcu_tasks_trace() with rcu_read_lock_trace()
>> held is OK.
> Very good, you are quite right.  In this particular case, deadlock is
> expected behavior.
>
> The problem here is that synchronize_rcu_mult() doesn't just invoke its
> arguments; instead, it also waits for all of the corresponding grace
> periods to complete.  But if you call this while under the protection of
> rcu_read_lock_trace(), then synchronize_rcu_mult(call_rcu_tasks_trace)
> cannot return until the corresponding rcu_read_unlock_trace() is
> reached, but that rcu_read_unlock_trace() cannot be reached until after
> synchronize_rcu_mult(call_rcu_tasks_trace) returns.
>
> (I did leave out the call_rcu argument because it does not participate
> in this particular deadlock.)

Got it. Thanks for the detailed explanation.
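
To summarize the problematic ordering for the archive, here is a minimal
sketch (illustrative only, not code from this series; the function name
deadlock_example() is made up):

/* Minimal sketch of the deadlock, assuming the update path is reached
 * from a sleepable bpf syscall program. deadlock_example() is a
 * hypothetical name used only for illustration.
 */
#include <linux/rcupdate_trace.h>	/* rcu_read_lock_trace() */
#include <linux/rcupdate_wait.h>	/* synchronize_rcu_mult() */

static void deadlock_example(void)
{
	/* A sleepable bpf syscall program runs inside a tasks-trace
	 * RCU read-side critical section.
	 */
	rcu_read_lock_trace();

	/* If the map-in-map update path reached from here waited for
	 * the grace periods synchronously, the tasks-trace grace
	 * period could not end before the rcu_read_unlock_trace()
	 * below, which in turn would never be reached: dead-lock.
	 */
	synchronize_rcu_mult(call_rcu, call_rcu_tasks_trace);

	rcu_read_unlock_trace();
}

That is why the series defers bpf_map_put() through callbacks instead of
waiting for the grace periods synchronously.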
>
> 							Thanx, Paul
>




