On Thu, Apr 11, 2019 at 5:53 PM Jiong Wang <jiong.wang@xxxxxxxxxxxxx> wrote: > > > > On 11 Apr 2019, at 17:44, Jakub Kicinski <jakub.kicinski@xxxxxxxxxxxxx> wrote: > > > > On Thu, 11 Apr 2019 07:13:03 +0100, Jiong Wang wrote: > >>>> @@ -1150,17 +1150,17 @@ static int mark_reg_read(struct bpf_verifier_env *env, > >>>> parent->var_off.value, parent->off); > >>>> return -EFAULT; > >>>> } > >>>> - if (parent->live & REG_LIVE_READ) > >>>> + if ((parent->live & REG_LIVE_READ) == flags) > >>>> /* The parentage chain never changes and > >>>> - * this parent was already marked as LIVE_READ. > >>>> + * this parent was already marked with all read bits. > >>> > >>> Do we have to propagate all read bits? Read64 is strictly stronger > >>> than read32, as long as read64 is set on the parent we should be good? > >> > >> We should be good, but I doubt there is value to differentiate on this in this > >> kind of HOT function. > > > > The entire if clause is an optimization. I'm saying you can maintain it > > as more aggressive. > > What I mean is I suspect the read width could be quite consistent in real program, > the percentage for doing extra check on read64 could actually be mishit for > most of the time, if the propagation iterations is big the extra check done each > time may overcome the propagation pruned. > > I will do some benchmarking on this to see the real gain. Take bpf_lxc.o for example, it has ~3 million mark_reg_read propagation iteration. Adding extra (parent->live & REG_LIVE_READ64) check cuts off 18% ~ 25% propagation iterations in exchange of 75% iterations are doing one more check (parent->live & REG_LIVE_READ64). Given the propagation contains several statement, I think we are better off adding such check (also done some ktime_get_real_ts64 run timing measure, but the results is not very consistent). Will add it in v3. Regards, Jiong