On 8/9/22 10:02 AM, Alexei Starovoitov wrote:
On Sun, Aug 7, 2022 at 10:51 AM Yonghong Song <yhs@xxxxxx> wrote:
Tejun reported a bpf program kfunc return value mis-handling which
may cause incorrect result. The following is an example to show
the problem.
$ cat t.c
unsigned char bar();
int foo() {
if (bar() != 10) return 0; else return 1;
}
$ clang -target bpf -O2 -c t.c
$ llvm-objdump -d t.o
...
0000000000000000 <foo>:
0: 85 10 00 00 ff ff ff ff call -1
1: bf 01 00 00 00 00 00 00 r1 = r0
2: b7 00 00 00 01 00 00 00 r0 = 1
3: 15 01 01 00 0a 00 00 00 if r1 == 10 goto +1 <LBB0_2>
4: b7 00 00 00 00 00 00 00 r0 = 0
0000000000000028 <LBB0_2>:
5: 95 00 00 00 00 00 00 00 exit
$
In the above example, the return type for bar() is 'unsigned char'.
But in the disassembly code, the whole register 'r1' is used to
compare to 10 without truncating upper 56 bits.
If function bar() is implemented as a bpf function, everything
should be okay since bpf ABI will make sure the caller do
proper truncation of upper 56 bits.
But if function bar() is implemented as a non-bpf kfunc,
there could a mismatch between bar() implementation and bpf program.
For example, if the host arch is x86_64, the bar() function
may just put the return value in lower 8-bit subregister and all
upper 56 bits could contain garbage. This is not a problem
if bar() is called in x86_64 context as the caller will use
%al to get the value.
But this could be a problem if bar() is called in bpf context
and there is a mismatch expectation between bpf and native architecture.
Currently, bpf programs use the default llvm ABI ([1], function
isPromotableIntegerTypeForABI()) such that if an integer type size
is less than int type size, it is assumed proper sign or zero
extension has been done to the return value. There will be a problem
if the kfunc return value type is u8/s8/u16/s16.
This patch intends to address this issue by doing proper sign or zero
extension for the kfunc return value before it is used later.
[1] https://github.com/llvm/llvm-project/blob/main/clang/lib/CodeGen/TargetInfo.cpp
Reported-by: Tejun Heo <tj@xxxxxxxxxx>
Signed-off-by: Yonghong Song <yhs@xxxxxx>
---
include/linux/bpf.h | 2 ++
kernel/bpf/btf.c | 9 +++++++++
kernel/bpf/verifier.c | 35 +++++++++++++++++++++++++++++++++--
3 files changed, 44 insertions(+), 2 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 20c26aed7896..b6f6bb1b707d 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -727,6 +727,8 @@ enum bpf_cgroup_storage_type {
#define MAX_BPF_FUNC_REG_ARGS 5
struct btf_func_model {
+ u8 ret_integer:1;
+ u8 ret_integer_signed:1;
u8 ret_size;
u8 nr_args;
u8 arg_size[MAX_BPF_FUNC_ARGS];
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 8119dc3994db..f30a02018701 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -5897,6 +5897,7 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
u32 i, nargs;
int ret;
+ m->ret_integer = false;
if (!func) {
/* BTF function prototype doesn't match the verifier types.
* Fall back to MAX_BPF_FUNC_REG_ARGS u64 args.
@@ -5923,6 +5924,14 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
return -EINVAL;
}
m->ret_size = ret;
+ if (btf_type_is_int(t)) {
+ m->ret_integer = true;
+ /* BTF_INT_BOOL is considered as unsigned */
+ if (BTF_INT_ENCODING(btf_type_int(t)) == BTF_INT_SIGNED)
+ m->ret_integer_signed = true;
+ else
+ m->ret_integer_signed = false;
+ }
for (i = 0; i < nargs; i++) {
if (i == nargs - 1 && args[i].type == 0) {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 096fdac70165..684f8606f341 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13834,8 +13834,9 @@ static int fixup_call_args(struct bpf_verifier_env *env)
}
static int fixup_kfunc_call(struct bpf_verifier_env *env,
- struct bpf_insn *insn)
+ struct bpf_insn *insn, struct bpf_insn *insn_buf, int *cnt)
{
+ u8 ret_size, shift_cnt, rshift_opcode;
const struct bpf_kfunc_desc *desc;
if (!insn->imm) {
@@ -13855,6 +13856,26 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env,
insn->imm = desc->imm;
+ *cnt = 0;
+ ret_size = desc->func_model.ret_size;
+
+ /* If the kfunc return type is an integer and the type size is one byte or two
+ * bytes, currently llvm/bpf assumes proper sign/zero extension has been done
+ * in the caller. But such an asumption may not hold for non-bpf architectures.
+ * For example, for x86_64, if the return type is 'u8', it is possible that only
+ * %al register is set properly and upper 56 bits of %rax register may contain
+ * garbage. To resolve this case, Let us do a necessary truncation to zero-out
+ * or properly sign-extend upper 56 bits.
+ */
+ if (desc->func_model.ret_integer && ret_size < sizeof(int)) {
Few questions...
Do we really need 'ret_integer' here?
and is it x86 specific?
afaik only x86 has 8 and 16-bit subregisters.
On all other archs the hw cannot write such quantities into
a register and don't touch the upper bits.
At the same time such return values from kfunc are rare.
I don't think we have such a case in the current set of kfuncs.
So being safe than sorry is a reasonable trade off and
gating by x86 only is unnecessary.
So how about just if (ret_size < sizeof(int)) here?
good questions. yes, we don't need ret_integer here with the
current code base.
I added ret_integer because my kfunc struct argument/return value
support work (in RFC stage). In that case, we could have
a 1-byte or 2-bytes structure as return value in which case,
we should not do sign/extension. With checking ret_integer,
we can avoid the churn later.
But it is not clear whether we will support kfunc return struct
as we don't have a request so far. So I will remove ret_integer
in the next revision.
+ shift_cnt = (sizeof(u64) - ret_size) * 8;
+ rshift_opcode = desc->func_model.ret_integer_signed ? BPF_ARSH : BPF_RSH;
+ insn_buf[0] = *insn;
+ insn_buf[1] = BPF_ALU64_IMM(BPF_LSH, BPF_REG_0, shift_cnt);
+ insn_buf[2] = BPF_ALU64_IMM(rshift_opcode, BPF_REG_0, shift_cnt);
+ *cnt = 3;
+ }
+
return 0;
}
@@ -13996,9 +14017,19 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
if (insn->src_reg == BPF_PSEUDO_CALL)
continue;
if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
- ret = fixup_kfunc_call(env, insn);
+ ret = fixup_kfunc_call(env, insn, insn_buf, &cnt);
if (ret)
return ret;
+ if (cnt == 0)
+ continue;
+
+ new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
+ if (!new_prog)
+ return -ENOMEM;
+
+ delta += cnt - 1;
+ env->prog = prog = new_prog;
+ insn = new_prog->insnsi + i + delta;
continue;
}
--
2.30.2