Re: [RFC PATCH bpf-next 0/6] bpf: Handle reuse in bpf memory alloc

On 2/14/23 8:02 PM, Hou Tao wrote:
>> For local storage, when its owner (sk/task/inode/cgrp) is going away, the
>> memory can be reused immediately. No rcu gp is needed.
> Now it seems it will wait for RCU GP and I think it is still necessary, because
> when the process exits, other processes may still access the local storage
> through pidfd or task_struct of the exited process.

When the owner (sk/task/cgrp...) is going away, it has reached refcnt 0 and will be kfree()'d immediately afterwards. E.g. bpf_sk_storage_free() is called just before the sk is kfree()'d. No bpf prog should have a hold on this sk. The same should go for the task.

The current rcu gp wait during bpf_{sk,task,cgrp...}_storage_free() is there because of the race with the map destruction in bpf_local_storage_map_free().


>> The local storage delete case (eg. bpf_sk_storage_delete) is the only one that
>> needs to be freed by tasks_trace gp because another bpf prog (reader) may be
>> under the rcu_read_lock_trace(). I think the idea (BPF_REUSE_AFTER_RCU_GP) on
>> allowing reuse after vanilla rcu gp and free (if needed) after tasks_trace gp
>> can be extended to the local storage delete case. I think we can extend the
>> assumption that "sleepable progs (reader) can use explicit bpf_rcu_read_lock()
>> when they want to avoid uaf" to bpf_{sk,task,inode,cgrp}_storage_get() also.
>
> It seems bpf_rcu_read_lock() & bpf_rcu_read_unlock() will be used to protect not
> only bpf_task_storage_get(), but also the dereferences of the returned local
> storage ptr, right? I think qp-trie may also need this.

I think bpf_rcu_read_lock() is primarily for the bpf prog.

The internals of bpf_{sk,task,...}_storage_get() are easier to handle. They will probably need to take their own rcu_read_lock() instead of depending on the bpf prog calling bpf_rcu_read_lock(), because the bpf prog may decide that uaf is fine.

>> I also need the GFP_ZERO in bpf_mem_alloc, so will work on the GFP_ZERO and
>> the BPF_REUSE_AFTER_RCU_GP idea.  Probably will get the GFP_ZERO out first.
> I will continue to work on this patchset for GFP_ZERO and reuse flag. Do you mean
> that you want to work together to implement BPF_REUSE_AFTER_RCU_GP? How do we
> cooperate to accomplish that?

Please submit the GFP_ZERO patch first. Kumar and I can use it immediately.

I have been hacking to make bpf's memalloc safe for bpf_{sk,task,cgrp..}_storage_delete(), and this safe-on-reuse piece still needs work. The whole thing is getting pretty long, so my current plan is to put the safe-on-reuse piece aside for now, focus back on the immediate goal and make the common case deadlock free first. That means bpf_*_storage_get(BPF_*_STORAGE_GET_F_CREATE) and bpf_*_storage_free() will use bpf_mem_cache_{alloc,free}. bpf_*_storage_delete() will stay as-is and go through call_rcu_tasks_trace() for now, since delete is not the common use case.

In parallel, if you can post the BPF_REUSE_AFTER_RCU_GP work, we can discuss based on that. It should speed up the progress. If I finish the immediate goal for local storage and this piece is still pending, I will ping you first. Thoughts?
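
To make the BPF_REUSE_AFTER_RCU_GP discussion a bit more concrete, here is a rough single-threaded toy model of the rule quoted above: a deleted element may be reused once a vanilla rcu gp has passed, and may be freed back only once a tasks_trace gp has passed. This is not kernel code and not a proposed implementation. Every name in it (toy_gp, toy_elem, toy_delete, toy_can_reuse, toy_can_free) is made up for illustration, and the grace periods are plain counters that the caller bumps by hand.

```c
#include <stdbool.h>

/* Toy model only: counts of completed grace periods, advanced by hand. */
struct toy_gp {
	unsigned long rcu;		/* completed vanilla rcu gps */
	unsigned long tasks_trace;	/* completed tasks_trace gps */
};

/* Hypothetical element tracked by the toy allocator. */
struct toy_elem {
	bool deleted;		/* true once toy_delete() has been called */
	struct toy_gp snap;	/* gp counters sampled at delete time */
};

/* Mark the element deleted and remember which gps were current. */
static void toy_delete(struct toy_elem *e, const struct toy_gp *now)
{
	e->deleted = true;
	e->snap = *now;
}

/* Reuse is allowed once at least one vanilla rcu gp has elapsed. */
static bool toy_can_reuse(const struct toy_elem *e, const struct toy_gp *now)
{
	return e->deleted && now->rcu > e->snap.rcu;
}

/*
 * Freeing back to the system is allowed only once a tasks_trace gp has
 * elapsed, because a sleepable reader may still hold a pointer.
 * Assumption of this sketch: the tasks_trace counter alone gates free.
 */
static bool toy_can_free(const struct toy_elem *e, const struct toy_gp *now)
{
	return e->deleted && now->tasks_trace > e->snap.tasks_trace;
}
```

In this model, an element deleted at counters {0, 0} becomes reusable when the rcu counter reaches 1, but it stays non-freeable until the tasks_trace counter also advances.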



