Yonghong Song wrote: > > > On 11/21/22 2:56 PM, Martin KaFai Lau wrote: > > On 11/21/22 12:01 PM, Yonghong Song wrote: > >> > >> > >> On 11/21/22 11:41 AM, Martin KaFai Lau wrote: > >>> On 11/21/22 9:05 AM, Yonghong Song wrote: > >>>> @@ -4704,6 +4715,15 @@ static int check_ptr_to_btf_access(struct > >>>> bpf_verifier_env *env, > >>>> return -EACCES; > >>>> } > >>>> + /* Access rcu protected memory */ > >>>> + if ((reg->type & MEM_RCU) && env->prog->aux->sleepable && > >>>> + !env->cur_state->active_rcu_lock) { > >>>> + verbose(env, > >>>> + "R%d is ptr_%s access rcu-protected memory with off=%d, > >>>> not rcu protected\n", > >>>> + regno, tname, off); > >>>> + return -EACCES; > >>>> + } > >>>> + > >>>> if (env->ops->btf_struct_access && !type_is_alloc(reg->type)) { > >>>> if (!btf_is_kernel(reg->btf)) { > >>>> verbose(env, "verifier internal error: reg->btf must > >>>> be kernel btf\n"); > >>>> @@ -4731,12 +4751,27 @@ static int check_ptr_to_btf_access(struct > >>>> bpf_verifier_env *env, > >>>> if (ret < 0) > >>>> return ret; > >>>> + /* The value is a rcu pointer. The load needs to be in a rcu > >>>> lock region, > >>>> + * similar to rcu_dereference(). > >>>> + */ > >>>> + if ((flag & MEM_RCU) && env->prog->aux->sleepable && > >>>> !env->cur_state->active_rcu_lock) { > >>>> + verbose(env, > >>>> + "R%d is rcu dereference ptr_%s with off=%d, not in > >>>> rcu_read_lock region\n", > >>>> + regno, tname, off); > >>>> + return -EACCES; > >>>> + } > >>> > >>> Would this make the existing rdonly use case fail? > >>> > >>> SEC("fentry.s/" SYS_PREFIX "sys_getpgid") > >>> int task_real_parent(void *ctx) > >>> { > >>> struct task_struct *task, *real_parent; > >>> > >>> task = bpf_get_current_task_btf(); > >>> real_parent = task->real_parent; > >>> bpf_printk("pid %u\n", real_parent->pid); > >>> return 0; > >>> } > >> > >> Right, it will fail. To fix the issue, user can do > >> bpf_rcu_read_lock(); > >> real_parent = task->real_parent; > >> bpf_printk("pid %u\n", real_parent->pid); > >> bpf_rcu_read_unlock(); > >> > >> But this raised a good question. How do we deal with > >> legacy sleepable programs with newly-added rcu tagging > >> capabilities. > >> > >> My current option is to error out if rcu usage is not right. > >> But this might break existing sleepable programs. > >> > >> Another option intends to not break existing, like above, > >> codes. In this case, MEM_RCU will not tagged if it is > >> not inside bpf_rcu_read_lock() region. > > > > hmm.... it is to make MEM_RCU to mean a reg is protected by the current > > active_rcu_lock or not? > > Yes, for example, in 'real_parent = task->real_parent' where > 'real_parent' in task_struct is tagged with __rcu in the struct > definition. So the 'real_parent' variable in the above assignment > will be tagged with MEM_RCU. > > > > >> In this case, the above non-rcu-protected code should work. And the > >> following should work as well although it is a little > >> bit awkward. > >> real_parent = task->real_parent; // real_parent not tagged with rcu > >> bpf_rcu_read_lock(); > >> bpf_printk("pid %u\n", real_parent->pid); > >> bpf_rcu_read_unlock(); > > > > I think it should be fine. bpf_rcu_read_lock() just not useful in this > > example but nothing break or crash. Also, after bpf_rcu_read_unlock(), > > real_parent will continue to be readable because the MEM_RCU is not set? > > That is correct. the variable real_parent is not tagged with MEM_RCU > and it will stay that way for the rest of its life cycle. > > With new PTR_TRUSTED mechanism, real_parent will be marked as normal > PTR_TO_BTF_ID and it is not marked as PTR_UNTRUSTED for backward > compatibility. So in the above code, real_parent->pid is just a normal > load (not related to rcu/trusted/untrusted). People may think it > is okay, but actually it does not okay. Verifier could add more state > to issue proper warnings, but I am not sure whether it is worthwhile > or not. As you mentioned, nothing breaks. It is just the current > existing way. So we should be able to live with this. > > > > > On top of the active_rcu_lock, should MEM_RCU be set only when it is > > dereferenced from a PTR_TRUSTED ptr (or with ref_obj_id != 0)? > > I didn't consider PTR_TRUSTED because it is just introduced yesterday... > > My current implementation inherits the old ptr_to_btf_id way where by > default any ptr_to_btf_id is trusted. But since we have PTR_TRUSTED > we should be able to use it for a stronger guarantee. > > > I am thinking about the following more common case: > > > > /* bpf_get_current_task_btf() may need to be changed > > * to set PTR_TRUSTED at the retval? > > */ > > /* task: PTR_TO_BTF_ID | PTR_TRUSTED */ > > task = bpf_get_current_task_btf(); > > > > bpf_rcu_read_lock(); > > > > /* real_parent: PTR_TO_BTF_ID | PTR_TRUSTED | MEM_RCU */ > > real_parent = task->real_parent; > > > > /* bpf_task_acquire() needs to change to use > > refcount_inc_not_zero */ > > real_parent = bpf_task_acquire(real_parent); > > > > bpf_rcu_read_unlock(); > > > > /* real_parent is accessible here (after checking NULL) and > > * can be passed to kfunc > > */ > > > > Yes, the above is a typical use case. Or alternatively after > real_parent = task->real_parent; > /* use real_parent inside the bpf_rcu_read_lock() region */ > > I will try to utilize PTR_TRUSTED concept in the next revision. Also perhaps interesting is when task is read out of a map with reference already pinned. I think you should clear the MEM_RCU tag on all referenced objects?