Patch "bpf, x64: Fix a jit convergence issue" has been added to the 6.6-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is a note to let you know that I've just added the patch titled

    bpf, x64: Fix a jit convergence issue

to the 6.6-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     bpf-x64-fix-a-jit-convergence-issue.patch
and it can be found in the queue-6.6 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 452900c50e312d0bb1d801518f916f57cb5eaadc
Author: Yonghong Song <yonghong.song@xxxxxxxxx>
Date:   Wed Sep 4 15:12:51 2024 -0700

    bpf, x64: Fix a jit convergence issue
    
    [ Upstream commit c8831bdbfbab672c006a18006d36932a494b2fd6 ]
    
    Daniel Hodges reported a jit error when playing with a sched-ext program.
    The error message is:
      unexpected jmp_cond padding: -4 bytes
    
    But further investigation shows the error is actual due to failed
    convergence. The following are some analysis:
    
      ...
      pass4, final_proglen=4391:
        ...
        20e:    48 85 ff                test   rdi,rdi
        211:    74 7d                   je     0x290
        213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
        ...
        289:    48 85 ff                test   rdi,rdi
        28c:    74 17                   je     0x2a5
        28e:    e9 7f ff ff ff          jmp    0x212
        293:    bf 03 00 00 00          mov    edi,0x3
    
    Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125)
    and insn at 0x28e is 5-byte jmp insn with offset -129.
    
      pass5, final_proglen=4392:
        ...
        20e:    48 85 ff                test   rdi,rdi
        211:    0f 84 80 00 00 00       je     0x297
        217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
        ...
        28d:    48 85 ff                test   rdi,rdi
        290:    74 1a                   je     0x2ac
        292:    eb 84                   jmp    0x218
        294:    bf 03 00 00 00          mov    edi,0x3
    
    Note that insn at 0x211 is 6-byte cond jump insn now since its offset
    becomes 0x80 based on previous round (0x293 - 0x213 = 0x80). At the same
    time, insn at 0x292 is a 2-byte insn since its offset is -124.
    
    pass6 will repeat the same code as in pass4. pass7 will repeat the same
    code as in pass5, and so on. This will prevent eventual convergence.
    
    Passes 1-14 are with padding = 0. At pass15, padding is 1 and related
    insn looks like:
    
        211:    0f 84 80 00 00 00       je     0x297
        217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
        ...
        24d:    48 85 d2                test   rdx,rdx
    
    The similar code in pass14:
        211:    74 7d                   je     0x290
        213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
        ...
        249:    48 85 d2                test   rdx,rdx
        24c:    74 21                   je     0x26f
        24e:    48 01 f7                add    rdi,rsi
        ...
    
    Before generating the following insn,
      250:    74 21                   je     0x273
    "padding = 1" enables some checking to ensure nops is either 0 or 4
    where
      #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp)))
      nops = INSN_SZ_DIFF - 2
    
    In this specific case,
      addrs[i] = 0x24e // from pass14
      addrs[i-1] = 0x24d // from pass15
      prog - temp = 3 // from 'test rdx,rdx' in pass15
    so
      nops = -4
    and this triggers the failure.
    
    To fix the issue, we need to break cycles of je <-> jmp. For example,
    in the above case, we have
      211:    74 7d                   je     0x290
    the offset is 0x7d. If 2-byte je insn is generated only if
    the offset is less than 0x7d (<= 0x7c), the cycle can be
    break and we can achieve the convergence.
    
    I did some study on other cases like je <-> je, jmp <-> je and
    jmp <-> jmp which may cause cycles. Those cases are not from actual
    reproducible cases since it is pretty hard to construct a test case
    for them. the results show that the offset <= 0x7b (0x7b = 123) should
    be enough to cover all cases. This patch added a new helper to generate 8-bit
    cond/uncond jmp insns only if the offset range is [-128, 123].
    
    Reported-by: Daniel Hodges <hodgesd@xxxxxxxx>
    Signed-off-by: Yonghong Song <yonghong.song@xxxxxxxxx>
    Link: https://lore.kernel.org/r/20240904221251.37109-1-yonghong.song@xxxxxxxxx
    Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 878a4c6dd7565..a50c99e9b5c01 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -58,6 +58,56 @@ static bool is_imm8(int value)
 	return value <= 127 && value >= -128;
 }
 
+/*
+ * Let us limit the positive offset to be <= 123.
+ * This is to ensure eventual jit convergence For the following patterns:
+ * ...
+ * pass4, final_proglen=4391:
+ *   ...
+ *   20e:    48 85 ff                test   rdi,rdi
+ *   211:    74 7d                   je     0x290
+ *   213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
+ *   ...
+ *   289:    48 85 ff                test   rdi,rdi
+ *   28c:    74 17                   je     0x2a5
+ *   28e:    e9 7f ff ff ff          jmp    0x212
+ *   293:    bf 03 00 00 00          mov    edi,0x3
+ * Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125)
+ * and insn at 0x28e is 5-byte jmp insn with offset -129.
+ *
+ * pass5, final_proglen=4392:
+ *   ...
+ *   20e:    48 85 ff                test   rdi,rdi
+ *   211:    0f 84 80 00 00 00       je     0x297
+ *   217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
+ *   ...
+ *   28d:    48 85 ff                test   rdi,rdi
+ *   290:    74 1a                   je     0x2ac
+ *   292:    eb 84                   jmp    0x218
+ *   294:    bf 03 00 00 00          mov    edi,0x3
+ * Note that insn at 0x211 is 6-byte cond jump insn now since its offset
+ * becomes 0x80 based on previous round (0x293 - 0x213 = 0x80).
+ * At the same time, insn at 0x292 is a 2-byte insn since its offset is
+ * -124.
+ *
+ * pass6 will repeat the same code as in pass4 and this will prevent
+ * eventual convergence.
+ *
+ * To fix this issue, we need to break je (2->6 bytes) <-> jmp (5->2 bytes)
+ * cycle in the above. In the above example je offset <= 0x7c should work.
+ *
+ * For other cases, je <-> je needs offset <= 0x7b to avoid no convergence
+ * issue. For jmp <-> je and jmp <-> jmp cases, jmp offset <= 0x7c should
+ * avoid no convergence issue.
+ *
+ * Overall, let us limit the positive offset for 8bit cond/uncond jmp insn
+ * to maximum 123 (0x7b). This way, the jit pass can eventually converge.
+ */
+static bool is_imm8_jmp_offset(int value)
+{
+	return value <= 123 && value >= -128;
+}
+
 static bool is_simm32(s64 value)
 {
 	return value == (s64)(s32)value;
@@ -1774,7 +1824,7 @@ st:			if (is_imm8(insn->off))
 				return -EFAULT;
 			}
 			jmp_offset = addrs[i + insn->off] - addrs[i];
-			if (is_imm8(jmp_offset)) {
+			if (is_imm8_jmp_offset(jmp_offset)) {
 				if (jmp_padding) {
 					/* To keep the jmp_offset valid, the extra bytes are
 					 * padded before the jump insn, so we subtract the
@@ -1856,7 +1906,7 @@ st:			if (is_imm8(insn->off))
 				break;
 			}
 emit_jmp:
-			if (is_imm8(jmp_offset)) {
+			if (is_imm8_jmp_offset(jmp_offset)) {
 				if (jmp_padding) {
 					/* To avoid breaking jmp_offset, the extra bytes
 					 * are padded before the actual jmp insn, so




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux