On Wed, 2023-02-01 at 11:00 +0000, Edward Cree wrote: > On 31/01/2023 18:11, Eduard Zingerman wrote: > > This is a followup for [1], adds an overview for the register liveness > > tracking, covers the following points: > > - why register liveness tracking is useful; > > - how register parentage chains are constructed; > > - how liveness marks are applied using the parentage chains. > > > > [1] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@xxxxxxxxxxxxxx/ > > > > Signed-off-by: Eduard Zingerman <eddyz87@xxxxxxxxx> > > --- > > Documentation/bpf/verifier.rst | 280 +++++++++++++++++++++++++++++++++ > > 1 file changed, 280 insertions(+) > ... > > + Current +-------------------------------+ > > + state | r0 | r1-r5 | r6-r9 | fp-8 ... | > > + +-------------------------------+ > > + \ > > + r6 read mark is propagated via > > + these links all the way up to > > + checkpoint #1. Hi Edward, could you please review the updates below? > Perhaps explicitly mention here that the reason it doesn't > propagate to checkpoint #0 (despite the arrow) is that there's > a write mark on c1[r6]. I can update this remark as follows: ---- 8< --------------------------- Current +-------------------------------+ state | r0 | r1-r5 | r6-r9 | fp-8 ... | +-------------------------------+ \ r6 read mark is propagated via these links all the way up to checkpoint #1. The checkpoint #1 contains a write mark for r6 because of instruction (1), thus read propagation does not reach checkpoint #0 (see section below). --------------------------- >8 ---- > > Also worth mentioning somewhere that write marks are really a > property of the arrow, not the state — a write mark in c#1 tells > us that the straight-line code from c#0 to c#1 contains a write > (thus 'breaking' that arrow for read mark propagation); it lives > in c#1's data structures because it's c#1 that needs to 'know' > about it, whereas c#0 (and its parents) need to 'know' about any > *reads* in the straight-line code from c#0 to c#1 (but these are > of no interest to c#1). > I sometimes summarise this with the phrase "read up, write down", > though idk how useful that is to anyone outside of my head ;) TBH, I'm a bit hesitant to put such note on the diagram because liveness tracking algorithm is not yet discussed. I've updated the next section a bit to reflect this, please see below. > > > +Liveness marks tracking > > +======================= > > + > > +For each processed instruction, the verifier propagates information about reads > > +up the parentage chain and saves information about writes in the current state. > > +The information about reads is propagated by function ``mark_reg_read()`` which > > +could be summarized as follows:: > > + > > + mark_reg_read(struct bpf_reg_state *state): > > + parent = state->parent > > + while parent: > > + if parent->live & REG_LIVE_WRITTEN: > > + break > This isn't correct; we look at `state->live` here, because if in > the straight-line code since the last checkpoint (parent) > there's a write to this register, then reads should not > propagate to `parent`. You are correct, thank you for catching it (:big facepalm image:). > Then there's the complication of the `writes` variable in > mark_reg_read(); that's explained by the comment on > propagate_liveness(), which AFAICT you don't cover in your > doc section about that. (And note that `writes` is only ever > false for the first iteration through the mark_reg_read() loop). I intentionally avoided description of this mechanics to keep some balance between clarity and level of details. Added a note that there is some additional logic. > > > + if parent->live & REG_LIVE_READ64: > > + break > > + parent->live |= REG_LIVE_READ64 > > + state = parent > > + parent = state->parent > > + > > +Note: details about REG_LIVE_READ32 are omitted. > > + > > +Also note: the read marks are applied to the **parent** state while write marks > > +are applied to the **current** state. > May be worth stating that the principle of the algorithm is that > read marks propagate back along the chain until they hit a write > mark, which 'screens off' earlier states from the read. > Your doc implies this but afaict never states it explicitly, and > I think it makes the algo easier to understand for someone who > doesn't already know what it's all about. All in all here is updated start of the section: ---- 8< --------------------------- The principle of the algorithm is that read marks propagate back along the state parentage chain until they hit a write mark, which 'screens off' earlier states from the read. The information about reads is propagated by function ``mark_reg_read()`` which could be summarized as follows:: mark_reg_read(struct bpf_reg_state *state, ...): parent = state->parent while parent: if state->live & REG_LIVE_WRITTEN: break if parent->live & REG_LIVE_READ64: break parent->live |= REG_LIVE_READ64 state = parent parent = state->parent Notes: * The read marks are applied to the **parent** state while write marks are applied to the **current** state. The write mark on a register or stack slot means that it is updated by some instruction verified within current state. * Details about REG_LIVE_READ32 are omitted. * Function ``propagate_liveness()`` (see section :ref:`Read marks propagation for cache hits`) might override the first parent link, please refer to the comments in the source code for further details. --------------------------- >8 ---- > > Apart from that, this is great. I particularly like your diagram > of the parentage chains. Thanks a lot for commenting! wdyt about my updates?