On 11/21/22 11:41 AM, Martin KaFai Lau wrote:
On 11/21/22 9:05 AM, Yonghong Song wrote:
@@ -4704,6 +4715,15 @@ static int check_ptr_to_btf_access(struct
bpf_verifier_env *env,
return -EACCES;
}
+ /* Access rcu protected memory */
+ if ((reg->type & MEM_RCU) && env->prog->aux->sleepable &&
+ !env->cur_state->active_rcu_lock) {
+ verbose(env,
+ "R%d is ptr_%s access rcu-protected memory with off=%d,
not rcu protected\n",
+ regno, tname, off);
+ return -EACCES;
+ }
+
if (env->ops->btf_struct_access && !type_is_alloc(reg->type)) {
if (!btf_is_kernel(reg->btf)) {
verbose(env, "verifier internal error: reg->btf must be
kernel btf\n");
@@ -4731,12 +4751,27 @@ static int check_ptr_to_btf_access(struct
bpf_verifier_env *env,
if (ret < 0)
return ret;
+ /* The value is a rcu pointer. The load needs to be in a rcu lock
region,
+ * similar to rcu_dereference().
+ */
+ if ((flag & MEM_RCU) && env->prog->aux->sleepable &&
!env->cur_state->active_rcu_lock) {
+ verbose(env,
+ "R%d is rcu dereference ptr_%s with off=%d, not in
rcu_read_lock region\n",
+ regno, tname, off);
+ return -EACCES;
+ }
Would this make the existing rdonly use case fail?
SEC("fentry.s/" SYS_PREFIX "sys_getpgid")
int task_real_parent(void *ctx)
{
struct task_struct *task, *real_parent;
task = bpf_get_current_task_btf();
real_parent = task->real_parent;
bpf_printk("pid %u\n", real_parent->pid);
return 0;
}
Right, it will fail. To fix the issue, user can do
bpf_rcu_read_lock();
real_parent = task->real_parent;
bpf_printk("pid %u\n", real_parent->pid);
bpf_rcu_read_unlock();
But this raised a good question. How do we deal with
legacy sleepable programs with newly-added rcu tagging
capabilities.
My current option is to error out if rcu usage is not right.
But this might break existing sleepable programs.
Another option intends to not break existing, like above,
codes. In this case, MEM_RCU will not tagged if it is
not inside bpf_rcu_read_lock() region. In this case,
the above non-rcu-protected code should work. And the
following should work as well although it is a little
bit awkward.
real_parent = task->real_parent; // real_parent not tagged with rcu
bpf_rcu_read_lock();
bpf_printk("pid %u\n", real_parent->pid);
bpf_rcu_read_unlock();
Maybe we should take the second choice in the above instead?