On Thu, Jan 26, 2023 at 1:34 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > On Thu, 2023-01-26 at 11:50 -0800, Andrii Nakryiko wrote: > > On Tue, Jan 24, 2023 at 2:04 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > > > > > This is a followup for [1], adds an overview for the register liveness > > > tracking, covers the following points: > > > - why register liveness tracking is useful; > > > - how register parentage chains are constructed; > > > - how liveness marks are applied using the parentage chains. > > > > > > [1] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@xxxxxxxxxxxxxx/ > > > > > > Signed-off-by: Eduard Zingerman <eddyz87@xxxxxxxxx> > > > --- > > > Documentation/bpf/verifier.rst | 237 +++++++++++++++++++++++++++++++++ > > > 1 file changed, 237 insertions(+) > > > [...] > > > +Register parent chains > > > +~~~~~~~~~~~~~~~~~~~~~~ > > > + > > > +In order to propagate information between parent and child states register > > > +parentage chain is established. Each register or spilled stack slot is linked to > > > +a corresponding register or stack spill slot in it's parent state via a > > > +``->parent`` pointer. This link is established upon state creation in function > > > +``is_state_visited()`` and is never changed. > > > + > > > +The rules for correspondence between registers / stack spill slots are as > > > +follows: > > > + > > > +* For current stack frame registers and stack spill slots of the new state are > > > + linked to the registers and stack spill slots of the parent state with the > > > + same indices. > > > + > > > +* For outer stack frames only caller saved registers (r6-r9) and stack spill > > > + slots are linked to the registers and stack spill slots of the parent state > > > + with the same indices. > > > + > > > +This could be illustrated by the following diagram (arrows stand for > > > +``->parent`` pointers):: > > > + > > > + ... ; Frame #0, some instructions > > > + --- checkpoint #0 --- > > > + 1 : r6 = 42 ; Frame #0 > > > + --- checkpoint #1 --- > > > + 2 : call foo() ; Frame #0 > > > + ... ; Frame #1, instructions from foo() > > > + --- checkpoint #2 --- > > > + ... ; Frame #1, instructions from foo() > > > + --- checkpoint #3 --- > > > + exit ; Frame #1, return from foo() > > > + 3 : r1 = r6 ; Frame #0 <- current state > > > + > > > + +--------------------------+--------------------------+ > > > + | Frame #0 | Frame #1 | > > > + Checkpoint +--------------------------+--------------------------+ > > > + #0 | r0-r5 | r6-r9 | fp-8 ... | > > > + +--------------------------+ > > > + ^ ^ ^ > > > + | | | > > > + Checkpoint +--------------------------+ > > > + #1 | r0-r5 | r6-r9 | fp-8 ... | > > > + +--------------------------+ > > > + ^ ^ > > > + | | > > > + Checkpoint +--------------------------+--------------------------+ > > > + #2 | r0-r5 | r6-r9 | fp-8 ... | r0-r5 | r6-r9 | fp-8 ... | > > > + +--------------------------+--------------------------+ > > > > wouldn't r6-r9 in frame #1 point to r6-r9 of frame #0 in parent state? > > Or do they point to r6-r9 of frame #0 in the same checkpoint? Either > > way, should we clarify that part on the diagram somehow? > > This is the loop at the end of is_state_visited() that sets up the > register links (initially the links are zero because of the kzalloc() > call used for memory allocation): > > cur->parent = new; > ... > for (j = 0; j <= cur->curframe; j++) { > for (i = j < cur->curframe ? BPF_REG_6 : 0; i < BPF_REG_FP; i++) > cur->frame[j]->regs[i].parent = &new->frame[j]->regs[i]; > ... > } > > The `new` refers to `struct bpf_verifier_state` instance that would > remain in cache, the `cur` refers to the state that would be used to > continue verification. > > Note that all links in this loop are setup only for identical indices, > so registers from frame #1 can't point to registers from frame #0. > > As far as I understand the following happens (please correct me if I'm wrong): > - when checkpoint #1 is created from the state derived from > checkpoint #0, the cur->curframe is 0; > - when `call foo` is processed a new instance of `struct bpf_func_state` > is allocated to represent frame #1, all parent pointers within this > instance are set to null (see call `kzalloc(*callee)` in __check_func_call()); > - checkpoint #2 is derived from a state described above, thus frame #1 > r0-r9 parent links are null. > > I'll add a note on the diagram that the links are null. yes, you are right, they will be null and thinking about this a bit more there is no logical connection between r6-r9 of parent and child functions. It's just a requirement that child function saves/restores their values if they are used within that function. > > > > > > > + ^ ^ ^ ^ ^ > > > + | | | | | > > > + Checkpoint +--------------------------+--------------------------+ > > > + #3 | r0-r5 | r6-r9 | fp-8 ... | r0-r5 | r6-r9 | fp-8 ... | > > > + +--------------------------+--------------------------+ > > > + ^ ^ > > > + | | > > > + Current +--------------------------+ > > > + state | r0-r5 | r6-r9 | fp-8 ... | > > > + +--------------------------+ > > > + \ > > > + r6 read mark is propagated via > > > + these links all the way up to > > > + checkpoint #1. > > > + > > > +Liveness marks tracking > > > +~~~~~~~~~~~~~~~~~~~~~~~ > > > + > > > +For each processed instruction verifier propagates information about reads up > > > +the parentage chain and saves information about writes in the current state. > > > +The information about reads is propagated by function ``mark_reg_read()`` which > > > +could be summarized as follows:: > > > + > > > + mark_reg_read(struct bpf_reg_state *state): > > > + parent = state->parent > > > + while parent: > > > + if parent->live & REG_LIVE_WRITTEN: > > > + break > > > + if parent->live & REG_LIVE_READ64: > > > + break > > > + parent->live |= REG_LIVE_READ64 > > > + state = parent > > > + parent = state->parent > > > + > > > +Note: details about REG_LIVE_READ32 are omitted. > > > + > > > +Also note: the read marks are applied to the *parent* state while write marks > > > +are applied to the *current* state. > > > + > > > +Because stack writes could have different sizes ``REG_LIVE_WRITTEN`` marks are > > > +applied conservatively: stack spills are marked as written only if write size > > > +corresponds to the size of the register, see function ``save_register_state()`` > > > +for an example. > > > > so this paragraph mentions a very subtle interaction for stack slots > > that only get partial register spill and keep other bytes as > > STACK_MISC. I don't know how much into details we should go her, but > > maybe calling out explicitly that this has implications when we mix > > register spill and scalar data in a single 8-byte stack slot would be > > useful? > > I can add the following text: > > --- 8< ----------------------------- > > Consider the following example:: > > 0: (*u64)(r10 - 8) = 0 > --- checkpoint #0 --- > 1: (*u32)(r10 - 8) = 1 > 2: r1 = (*u32)(r10 - 8) > > Because write size at instruction (1) is smaller than register size > the write mark will not be added to fp-8 slot when (1) is verified, > thus the fp-8 read at instruction (2) would propagate read mark for > fp-8 up to checkpoint #0. > > ----------------------------- >8 --- > > wdyt? Looks good to me, thanks. > > > [...] > > > +Read marks propagation for cache hits > > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > + > > > +Another important point is handling of read marks when a previously verified > > > +state is found in the states cache. All read marks present on registers and > > > +stack spills of the cached state must be propagated over the parentage chain of > > > +the current state. Function ``propagate_liveness()`` handles this case. > > > + > > > +For example, consider the following state parentage chain:: > > > + > > > + +---+ > > > + A ---> | B | ---> exit > > > + | | > > > + C ---> | D | > > > + +---+ > > > + > > > +* Chain ``A -> B -> exit`` is verified first; > > > +* State ``B`` has read marks for registers ``r1`` and ``r2``; > > > +* State ``D`` is considered equivalent to state ``B``; > > > + > > > +* Read marks for ``r1`` and ``r2`` have to be added in state ``C``, otherwise > > > + state ``C`` might get mistakenly marked as equivalent to some future state > > > + ``E`` with different values for registers in question. > > > > So I'm a bit confused by this section. We are talking about read marks > > in B and D, and then imply that C has to get it as well, but we don't > > mention what happens in A. The implication here is probably that read > > marks that were bubbled up into A should be bubbled up into C, right? > > But that's not very obvious? Could there be cases where read marks in > > B are screened in A and never make into it? In that case similar thing > > will presumably happen for C? > > I agree it might be confusing, what about the following text instead: > > --- 8< ----------------------------- > > For example, consider the following state parentage chain (S is a > starting state, A-E are derived states, -> arrows show which state is > derived from which):: > > r1 read > <------------- A[r1] == 0 > +---+ C[r1] == 0 > S ---> A ---> | B | ---> exit E[r1] == 1 > | | | > ` ---> C ---> | D | > | +---+ > ` ---> E > ^ > ^ |___ suppose all these > | states are at insn #Y > suppose all these > states are at insn #X > > * Chain of states ``S -> A -> B -> exit`` is verified first. > > * While ``B -> exit`` is verified register ``r1`` is read and this nit: gmail complains about missing comma between "verified register", and it does make reading this easier > read mark is propagated up to state ``A``. > > * When chain of states ``C -> D`` is verified the state ``D`` turns > out to be equivalent to state ``B``. > > * The read mark for ``r1`` has to be propagated to state ``C``, > otherwise state ``C`` might get mistakenly marked as equivalent to > state ``E`` even though values for register ``r1`` differ between > ``C`` and ``E``. > > ----------------------------- >8 --- > > The reason I don't want to put an example program here is that example > programs I can come up with are even more confusing than explanation above. No, this is perfect and very clear, IMO. Just add the comma ;) > > > I'd also call out conceptual similarity of this after the fact > > propagation of liveness marks to precision marks. It's the same idea > > and need. > > Could you please suggest wording for this point? I don't have an easy and succinct way to explain this, tbh. This is not strictly necessary for understanding, so let's not add anything. > > > > > > + > > > Understanding eBPF verifier messages > > > ==================================== > > > > > > -- > > > 2.39.0 > > > >