Re: [PATCH v3 bpf-next 1/3] bpf: Add bpf_iter_cpumask kfuncs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On 1/19/2024 6:27 AM, Yonghong Song wrote:
>
> On 1/16/24 6:48 PM, Yafang Shao wrote:
>> Add three new kfuncs for bpf_iter_cpumask.
>> - bpf_iter_cpumask_new
>>    It is defined with KF_RCU_PROTECTED and KF_RCU.
>>    KF_RCU_PROTECTED is defined because we must use it under the
>>    protection of RCU.
>>    KF_RCU is defined because the cpumask must be a RCU trusted pointer
>>    such as task->cpus_ptr.
>
> I am not sure whether we need both or not.
>
> KF_RCU_PROTECTED means the function call needs within the rcu cs.
> KF_RCU means the argument usage needs within the rcu cs.
> We only need one of them (preferrably KF_RCU).
>
>> - bpf_iter_cpumask_next
>> - bpf_iter_cpumask_destroy
>>
>> These new kfuncs facilitate the iteration of percpu data, such as
>> runqueues, psi_cgroup_cpu, and more.
>>
>> Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
>> ---
>>   kernel/bpf/cpumask.c | 69 ++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 69 insertions(+)
>>
>> diff --git a/kernel/bpf/cpumask.c b/kernel/bpf/cpumask.c
>> index 2e73533a3811..1840e48e6142 100644
>> --- a/kernel/bpf/cpumask.c
>> +++ b/kernel/bpf/cpumask.c
>> @@ -422,6 +422,72 @@ __bpf_kfunc u32 bpf_cpumask_weight(const struct
>> cpumask *cpumask)
>>       return cpumask_weight(cpumask);
>>   }
>>   +struct bpf_iter_cpumask {
>> +    __u64 __opaque[2];
>> +} __aligned(8);
>> +
>> +struct bpf_iter_cpumask_kern {
>> +    const struct cpumask *mask;
>> +    int cpu;
>> +} __aligned(8);
>> +
>> +/**
>> + * bpf_iter_cpumask_new() - Create a new bpf_iter_cpumask for a
>> specified cpumask
>> + * @it: The new bpf_iter_cpumask to be created.
>> + * @mask: The cpumask to be iterated over.
>> + *
>> + * This function initializes a new bpf_iter_cpumask structure for
>> iterating over
>> + * the specified CPU mask. It assigns the provided cpumask to the
>> newly created
>> + * bpf_iter_cpumask @it for subsequent iteration operations.
>> + *
>> + * On success, 0 is returen. On failure, ERR is returned.
>> + */
>> +__bpf_kfunc int bpf_iter_cpumask_new(struct bpf_iter_cpumask *it,
>> const struct cpumask *mask)
>> +{
>> +    struct bpf_iter_cpumask_kern *kit = (void *)it;
>> +
>> +    BUILD_BUG_ON(sizeof(struct bpf_iter_cpumask_kern) >
>> sizeof(struct bpf_iter_cpumask));
>> +    BUILD_BUG_ON(__alignof__(struct bpf_iter_cpumask_kern) !=
>> +             __alignof__(struct bpf_iter_cpumask));
>> +
>> +    kit->mask = mask;
>> +    kit->cpu = -1;
>> +    return 0;
>> +}
>
> We have problem here. Let us say bpf_iter_cpumask_new() is called
> inside rcu cs.
> Once the control goes out of rcu cs, 'mask' could be freed, right?
> Or you require bpf_iter_cpumask_next() needs to be in the same rcu cs
> as bpf_iter_cpumask_new(). But such a requirement seems odd.

So the case is possible when using bpf_iter_cpumask_new() and
bpf_iter_cpumask_next() in sleepable program and these two kfuncs are
used in two different rcu_read_lock/rcu_read_unlock code blocks, right ?
>
> I think we can do things similar to bpf_iter_task_vma. You can
> allocate memory
> with bpf_mem_alloc() in bpf_iter_cpumask_new() to keep a copy of mask.
> This
> way, you do not need to worry about potential use-after-free issue.
> The memory can be freed with bpf_iter_cpumask_destroy().
>
>> +
>> +/**
>> + * bpf_iter_cpumask_next() - Get the next CPU in a bpf_iter_cpumask
>> + * @it: The bpf_iter_cpumask
>> + *
>> + * This function retrieves a pointer to the number of the next CPU
>> within the
>> + * specified bpf_iter_cpumask. It allows sequential access to CPUs
>> within the
>> + * cpumask. If there are no further CPUs available, it returns NULL.
>> + *
>> + * Returns a pointer to the number of the next CPU in the cpumask or
>> NULL if no
>> + * further CPUs.
>> + */
>> +__bpf_kfunc int *bpf_iter_cpumask_next(struct bpf_iter_cpumask *it)
>> +{
>> +    struct bpf_iter_cpumask_kern *kit = (void *)it;
>> +    const struct cpumask *mask = kit->mask;
>> +    int cpu;
>> +
>> +    cpu = cpumask_next(kit->cpu, mask);
>> +    if (cpu >= nr_cpu_ids)
>> +        return NULL;
>> +
>> +    kit->cpu = cpu;
>> +    return &kit->cpu;
>> +}
>> +
>> +/**
>> + * bpf_iter_cpumask_destroy() - Destroy a bpf_iter_cpumask
>> + * @it: The bpf_iter_cpumask to be destroyed.
>> + */
>> +__bpf_kfunc void bpf_iter_cpumask_destroy(struct bpf_iter_cpumask *it)
>> +{
>> +}
>> +
>>   __bpf_kfunc_end_defs();
>>     BTF_SET8_START(cpumask_kfunc_btf_ids)
>> @@ -450,6 +516,9 @@ BTF_ID_FLAGS(func, bpf_cpumask_copy, KF_RCU)
>>   BTF_ID_FLAGS(func, bpf_cpumask_any_distribute, KF_RCU)
>>   BTF_ID_FLAGS(func, bpf_cpumask_any_and_distribute, KF_RCU)
>>   BTF_ID_FLAGS(func, bpf_cpumask_weight, KF_RCU)
>> +BTF_ID_FLAGS(func, bpf_iter_cpumask_new, KF_ITER_NEW |
>> KF_RCU_PROTECTED | KF_RCU)
>> +BTF_ID_FLAGS(func, bpf_iter_cpumask_next, KF_ITER_NEXT | KF_RET_NULL)
>> +BTF_ID_FLAGS(func, bpf_iter_cpumask_destroy, KF_ITER_DESTROY)
>>   BTF_SET8_END(cpumask_kfunc_btf_ids)
>>     static const struct btf_kfunc_id_set cpumask_kfunc_set = {
>
> .





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux