On Fri, Jan 15, 2021 at 1:41 AM Gary Lin <glin@xxxxxxxx> wrote:
>
> On Thu, Jan 14, 2021 at 10:37:33PM -0800, Alexei Starovoitov wrote:
> > On Thu, Jan 14, 2021 at 1:54 AM Gary Lin <glin@xxxxxxxx> wrote:
> > >          * pass to emit the final image.
> > >          */
> > > -       for (pass = 0; pass < 20 || image; pass++) {
> > > -               proglen = do_jit(prog, addrs, image, oldproglen, &ctx);
> > > +       for (pass = 0; pass < MAX_PASSES || image; pass++) {
> > > +               if (!padding && pass >= PADDING_PASSES)
> > > +                       padding = true;
> > > +               proglen = do_jit(prog, addrs, image, oldproglen, &ctx, padding);
> >
> > I'm struggling to reconcile the discussion we had before holidays with
> > the discussion you guys had in v2:
> >
> > >> What is the rationale for the latter when JIT is called again for subprog to fill in relative
> > >> call locations?
> > >>
> > > Hmmmm, my thinking was that we only enable padding for those programs
> > > which are already padded before. But, you're right. For the programs
> > > converging without padding, enabling padding won't change the final
> > > image, so it's safe to always set "padding" to true for the extra pass.
> > >
> > > Will remove the "padded" flag in v3.
> >
> > I'm not following why "enabling padding won't change the final image"
> > is correct.
> > Say the subprog image converges without padding.
> > Then for subprog we call JIT again.
> > Now extra_pass==true and padding==true.
> > The JITed image will be different.
> Actually no.
>
> > The test in patch 3 should have caught it, but it didn't,
> > because it checks for a subprog that needed padding.
> > The extra_pass needs to emit insns exactly in the right spots.
> > Otherwise jump targets will be incorrect.
> > The saved addrs[] array is crucial.
> > If extra_pass emits different things the instruction starts won't align
> > to places where addrs[] expects them to be.
> >
> When calculating padding bytes, if the image already converges, the
> emitted instruction size just matches (addrs[i] - addrs[i-1]), so
> emit_nops() emits 0 byte, and the image doesn't change.

I see. You're right. That's very tricky.

The patch set doesn't apply cleanly.
Could you please rebase and add a detailed comment about this logic?

Also please add comments why we check:
  nops != 0 && nops != 4
  nops != 0 && nops != 2 && nops != 5
  nops != 0 && nops != 3
None of it is obvious.
Does your single test cover all combinations of numbers?
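
For reference, the convergence argument quoted above ("the emitted instruction size just matches (addrs[i] - addrs[i-1])") can be sketched in isolation. This is only an illustration under assumptions, not code from the patch: padding_for_insn(), total_padding() and the cur_size[] array are hypothetical names, and addrs[i] is assumed to hold the end offset of insn i saved by the previous pass, with addrs[0] being the end of the prologue.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the padding computation discussed above.
 * addrs[i]    : end offset of insn i as saved by the previous pass
 *               (addrs[0] is assumed to be the end of the prologue).
 * cur_size[i] : bytes the current pass would emit for insn i before
 *               any padding is added (index 0 is unused).
 * Returns the number of NOP bytes needed so that insn i still ends
 * at addrs[i].
 */
static int padding_for_insn(const int *addrs, const int *cur_size, size_t i)
{
	assert(i >= 1);
	return (addrs[i] - addrs[i - 1]) - cur_size[i];
}

/* Sum of the NOP bytes a padded pass would add over insns 1..n. */
static int total_padding(const int *addrs, const int *cur_size, size_t n)
{
	int total = 0;

	for (size_t i = 1; i <= n; i++)
		total += padding_for_insn(addrs, cur_size, i);
	return total;
}
```

With saved offsets {10, 12, 18, 23} and current sizes of 2, 6 and 5 bytes, every per-insn padding is 0, so a padded extra pass would reproduce the same layout. If the second insn instead shrinks from 6 to 2 bytes, its padding becomes 4 and the later offsets stay where addrs[] expects them.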