Re: [PATCH bpf-next] bpf, x64: Fix a jit convergence issue

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 8/27/24 7:24 PM, Alexei Starovoitov wrote:
On Sun, Aug 25, 2024 at 1:04 PM Yonghong Song <yonghong.song@xxxxxxxxx> wrote:
Daniel Hodges reported a jit error when playing with a sched-ext
program. The error message is:
   unexpected jmp_cond padding: -4 bytes

But further investigation shows the error is actual due to failed
convergence. The following are some analysis:

   ...
   pass4, final_proglen=4391:
     ...
     20e:    48 85 ff                test   rdi,rdi
     211:    74 7d                   je     0x290
     213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
     ...
     289:    48 85 ff                test   rdi,rdi
     28c:    74 17                   je     0x2a5
     28e:    e9 7f ff ff ff          jmp    0x212
     293:    bf 03 00 00 00          mov    edi,0x3

Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125)
and insn at 0x28e is 5-byte jmp insn with offset -129.

   pass5, final_proglen=4392:
     ...
     20e:    48 85 ff                test   rdi,rdi
     211:    0f 84 80 00 00 00       je     0x297
     217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
     ...
     28d:    48 85 ff                test   rdi,rdi
     290:    74 1a                   je     0x2ac
     292:    eb 84                   jmp    0x218
     294:    bf 03 00 00 00          mov    edi,0x3

Note that insn at 0x211 is 5-byte cond jump insn now since its offset
becomes 0x80 based on previous round (0x293 - 0x213 = 0x80).
At the same time, insn at 0x292 is a 2-byte insn since its offset is
-124.

pass6 will repeat the same code as in pass4. pass7 will repeat the same
code as in pass5, and so on. This will prevent eventual convergence.

Passes 1-14 are with padding = 0. At pass15, padding is 1 and related
insn looks like:

     211:    0f 84 80 00 00 00       je     0x297
     217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
     ...
     24d:    48 85 d2                test   rdx,rdx

The similar code in pass14:
     211:    74 7d                   je     0x290
     213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
     ...
     249:    48 85 d2                test   rdx,rdx
     24c:    74 21                   je     0x26f
     24e:    48 01 f7                add    rdi,rsi
     ...

Before generating the following insn,
   250:    74 21                   je     0x273
"padding = 1" enables some checking to ensure nops is either 0 or 4
where
   #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp)))
   nops = INSN_SZ_DIFF - 2

In this specific case,
   addrs[i] = 0x24e // from pass14
   addrs[i-1] = 0x24d // from pass15
   prog - temp = 3 // from 'test rdx,rdx' in pass15
so
   nops = -4
and this triggers the failure.
Making jit prog convergable can fix the above error.

Reported-by: Daniel Hodges <hodgesd@xxxxxxxx>
Signed-off-by: Yonghong Song <yonghong.song@xxxxxxxxx>
---
  arch/x86/net/bpf_jit_comp.c | 47 ++++++++++++++++++++++++++++++++++++-
  1 file changed, 46 insertions(+), 1 deletion(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 074b41fafbe3..ec541aae5d9b 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -64,6 +64,51 @@ static bool is_imm8(int value)
         return value <= 127 && value >= -128;
  }

+/*
+ * Let us limit the positive offset to be <= 124.
+ * This is to ensure eventual jit convergence For the following patterns:
+ * ...
+ * pass4, final_proglen=4391:
+ *   ...
+ *   20e:    48 85 ff                test   rdi,rdi
+ *   211:    74 7d                   je     0x290
+ *   213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
+ *   ...
+ *   289:    48 85 ff                test   rdi,rdi
+ *   28c:    74 17                   je     0x2a5
+ *   28e:    e9 7f ff ff ff          jmp    0x212
+ *   293:    bf 03 00 00 00          mov    edi,0x3
+ * Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125)
+ * and insn at 0x28e is 5-byte jmp insn with offset -129.
+ *
+ * pass5, final_proglen=4392:
+ *   ...
+ *   20e:    48 85 ff                test   rdi,rdi
+ *   211:    0f 84 80 00 00 00       je     0x297
+ *   217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
+ *   ...
+ *   28d:    48 85 ff                test   rdi,rdi
+ *   290:    74 1a                   je     0x2ac
+ *   292:    eb 84                   jmp    0x218
+ *   294:    bf 03 00 00 00          mov    edi,0x3
+ * Note that insn at 0x211 is 5-byte cond jump insn now since its offset
+ * becomes 0x80 based on previous round (0x293 - 0x213 = 0x80).
+ * At the same time, insn at 0x292 is a 2-byte insn since its offset is
+ * -124.
+ *
+ * pass6 will repeat the same code as in pass4 and this will prevent
+ * eventual convergence.
+ *
+ * To fix this issue, we need to break je (2->6 bytes) <-> jmp (5->2 bytes)
+ * cycle in the above. Let us limit the positive offset for 8bit cond jump
+ * insn to mamximum 124 (0x7c). This way, the jmp insn will be always 2-bytes,
+ * and the jit pass can eventually converge.
+ */
je<->jmp

It can be je/je too, no?

Yes. It is possible.


so 128 - 4 instead of 128 - 3 ?

You probably mean "127 - 4 instead of 127 - 3" since
the maximum value is 127.

I checked 127 - 4 = 0x7c and indeed we should. See below examples:

   20e:    48 85 ff                test   rdi,rdi
   211:    XX XX                   je     0x291
   213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
   ...
   28d:    XX XX XX XX XX XX       je     0x212
   293:    bf 03 00 00 00          mov    edi,0x3

=>

   20e:    48 85 ff                test   rdi,rdi
   211:    XX XX XX XX XX XX       je     0x297 (0x293 - 0x213)
   217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
   ...
   291:    XX XX                   je     0x217 (0x217 - 0x293)
   293:    bf 03 00 00 00          mov    edi,0x3

=>

   20e:    48 85 ff                test   rdi,rdi
   211:    XX XX                   je     0x28f (0x293 - 0x217)
   213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
   ...
   28d:    XX XX                   je     0x213 (0x213 - 0x293)  // -0x80 allowed
   293:    bf 03 00 00 00          mov    edi,0x3

=>

   20e:    48 85 ff                test   rdi,rdi
   211:    XX XX XX XX XX XX       je     0x28f (0x293 - 0x213)
   217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
   ...
   291:    XX XX                   je     0x217 (0x217 - 0x293)
   293:    bf 03 00 00 00          mov    edi,0x3

=>
   ...


Here 0x293 - 0x217 = 0x7c


+static bool is_imm8_cond_offset(int value)
+{
+       return value <= 124 && value >= -128;
the other side needs the same treatment, no ?

good question. From my understanding, the non-convergence in the
above needs both forward and backport conditions. The solution we
are using is based on putting a limitation on forward conditions
w.r.t. jit code gen.

Another solution is actually to put a limitation on backward
conditions. For example, let us say the above is_imm8_cond_offset()
has
	return value <= 127 && value > -124

See below example:

   20e:    48 85 ff                test   rdi,rdi
   211:    XX XX                   je     0x291
   213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
   ...
   28d:    XX XX XX XX XX XX       je     0x212
   293:    bf 03 00 00 00          mov    edi,0x3

=>

   20e:    48 85 ff                test   rdi,rdi
   211:    XX XX XX XX XX XX       je     0x297 (0x293 - 0x213)
   217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
   ...
   291:    XX XX XX XX XX XX       je     0x21b (0x217 - 0x293)
   297:    bf 03 00 00 00          mov    edi,0x3
=>

   20e:    48 85 ff                test   rdi,rdi
   211:    XX XX XX XX XX XX       je     0x297 (0x297 - 0x217)
   217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
   ...
   291:    XX XX XX XX XX XX       je     0x217 (0x217 - 0x297)
   297:    bf 03 00 00 00          mov    edi,0x3

converged here.

So I think we do not need to limit both sides. One side should be enough.


+}
+
  static bool is_simm32(s64 value)
  {
         return value == (s64)(s32)value;
@@ -2231,7 +2276,7 @@ st:                       if (is_imm8(insn->off))
                                 return -EFAULT;
                         }
                         jmp_offset = addrs[i + insn->off] - addrs[i];
-                       if (is_imm8(jmp_offset)) {
+                       if (is_imm8_cond_offset(jmp_offset)) {
                                 if (jmp_padding) {
                                         /* To keep the jmp_offset valid, the extra bytes are
                                          * padded before the jump insn, so we subtract the
--
2.43.5





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux