On Wed, Mar 16, 2022 at 7:23 PM Martin KaFai Lau <kafai@xxxxxx> wrote: > > On Wed, Mar 16, 2022 at 12:54:00PM -0700, Joanne Koong wrote: > > From: Joanne Koong <joannelkoong@xxxxxxxxx> > > > > Currently, local storage memory can only be allocated atomically > > (GFP_ATOMIC). This restriction is too strict for sleepable bpf > > programs. > > > > In this patch, the verifier detects whether the program is sleepable, > > and passes the corresponding GFP_KERNEL or GFP_ATOMIC flag as a > > 5th argument to bpf_task/sk/inode_storage_get. This flag will propagate > > down to the local storage functions that allocate memory. > > > > Please note that bpf_task/sk/inode_storage_update_elem functions are > > invoked by userspace applications through syscalls. Preemption is > > disabled before bpf_task/sk/inode_storage_update_elem is called, which > > means they will always have to allocate memory atomically. > > > > The existing local storage selftests cover both the GFP_ATOMIC and the > > GFP_KERNEL cases in bpf_local_storage_update. > > > > v2 <- v1: > > * Allocate the memory before/after the raw_spin_lock_irqsave, depending > > on the gfp flags > > * Rename mem_flags to gfp_flags > > * Reword the comment "*mem_flags* is set by the bpf verifier" to > > "*gfp_flags* is a hidden argument provided by the verifier" > > * Add a sentence to the commit message about existing local storage > > selftests covering both the GFP_ATOMIC and GFP_KERNEL paths in > > bpf_local_storage_update. > > [ ... ] > > > struct bpf_local_storage_data * > > bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, > > - void *value, u64 map_flags) > > + void *value, u64 map_flags, gfp_t gfp_flags) > > { > > struct bpf_local_storage_data *old_sdata = NULL; > > - struct bpf_local_storage_elem *selem; > > + struct bpf_local_storage_elem *selem = NULL; > > struct bpf_local_storage *local_storage; > > unsigned long flags; > > int err; > > @@ -365,6 +366,9 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, > > !map_value_has_spin_lock(&smap->map))) > > return ERR_PTR(-EINVAL); > > > > + if (gfp_flags == GFP_KERNEL && (map_flags & ~BPF_F_LOCK) != BPF_NOEXIST) > > + return ERR_PTR(-EINVAL); > > + > > local_storage = rcu_dereference_check(*owner_storage(smap, owner), > > bpf_rcu_lock_held()); > > if (!local_storage || hlist_empty(&local_storage->list)) { > > @@ -373,11 +377,11 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, > > if (err) > > return ERR_PTR(err); > > > > - selem = bpf_selem_alloc(smap, owner, value, true); > > + selem = bpf_selem_alloc(smap, owner, value, true, gfp_flags); > > if (!selem) > > return ERR_PTR(-ENOMEM); > > > > - err = bpf_local_storage_alloc(owner, smap, selem); > > + err = bpf_local_storage_alloc(owner, smap, selem, gfp_flags); > > if (err) { > > kfree(selem); > > mem_uncharge(smap, owner, smap->elem_size); > > @@ -404,6 +408,12 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, > > } > > } > > > > + if (gfp_flags == GFP_KERNEL) { > > + selem = bpf_selem_alloc(smap, owner, value, true, gfp_flags); > I think this new path is not executed by the existing > "progs/local_storage.c" test which has sleepable lsm prog. At least a second > BPF_MAP_TYPE_TASK_STORAGE map (or SK_STORAGE) is needed? > or there is other selftest covering this new path that I missed? Thanks for your feedback. I think I'm misunderstanding how the progs/local_storage.c test and/or local storage works then. Would you mind explaining why a second map is needed? This is my (faulty) understanding of what is happening: 1) In "local_storage.c" in "SEC("lsm.s/inode_rename")" there is a call to bpf_inode_storage_get with the BPF_LOCAL_STORAGE_GET_F_CREATE flag set, which will call into bpf_local_storage_update (which will create the local storage + the selem, and put it in the RCU for that inode_storage_map) 2) Then, further down in the "local_storage.c" file in "SEC("lsm.s/bprm_committed_creds")", there is another call to bpf_inode_storage_get on the same inode_storage_map but on a different inode, also with the BPF_LOCAL_STORAGE_GET_F_CREATE flag set. This will also call into bpf_local_storage_update. 3) In bpf_local_storage_update from the call in #2, it sees that there is a local storage associated with this map in the RCU, it tries to look for the inode but doesn't find it, so it needs to allocate with GFP_KERNEL a new selem and then update with the new selem. I just tried out some debug statements locally to test, and it looks like my analysis of #3 is wrong. I will debug this some more tomorrow > > Others lgtm. > > Acked-by: Martin KaFai Lau <kafai@xxxxxx> > > > + if (!selem) > > + return ERR_PTR(-ENOMEM); > > + } > > + > > raw_spin_lock_irqsave(&local_storage->lock, flags); > > > > /* Recheck local_storage->list under local_storage->lock */