On 7/12/21 4:38 PM, Vincent Li wrote:
Hi Yonghong,
On Fri, Jun 18, 2021 at 12:58 PM Vincent Li <vincent.mc.li@xxxxxxxxx> wrote:
Hi Yonghong,
I attached the full verifier log and BPF bytecode in case the cause is
obvious to you; if it is not, that is OK. I tried to make sense of
it and failed due to my limited knowledge of BPF :)
I followed your clue and investigated how fp-200=pkt changed to
fp-200=inv in https://github.com/cilium/cilium/issues/16517#issuecomment-873522146
Using the previously attached complete BPF verifier log and BPF bytecode, it
eventually comes down to the following:
0000000000004948 :
2345: bf a3 00 00 00 00 00 00 r3 = r10
2346: 07 03 00 00 d0 ff ff ff r3 += -48
2347: b7 08 00 00 06 00 00 00 r8 = 6
; return ctx_store_bytes(ctx, off, mac, ETH_ALEN, 0);
2348: bf 61 00 00 00 00 00 00 r1 = r6
2349: b7 02 00 00 00 00 00 00 r2 = 0
2350: b7 04 00 00 06 00 00 00 r4 = 6
2351: b7 05 00 00 00 00 00 00 r5 = 0
2352: 85 00 00 00 09 00 00 00 call 9
2353: 67 00 00 00 20 00 00 00 r0 <<= 32
2354: c7 00 00 00 20 00 00 00 r0 s>>= 32
; if (eth_store_daddr(ctx, (__u8 *) &vtep_mac.addr, 0) < 0)
2355: c5 00 54 00 00 00 00 00 if r0 s< 0 goto +84
My new code is if (eth_store_daddr(ctx, (__u8 *) &vtep_mac.addr, 0) < 0),
which I copied from another part of the Cilium code. eth_store_daddr
is:
static __always_inline int eth_store_daddr(struct __ctx_buff *ctx,
                                           const __u8 *mac, int off)
{
#if !CTX_DIRECT_WRITE_OK
        return eth_store_daddr_aligned(ctx, mac, off);
#else
        ......
}
and eth_store_daddr_aligned is
static __always_inline int eth_store_daddr_aligned(struct __ctx_buff *ctx,
                                                   const __u8 *mac, int off)
{
        return ctx_store_bytes(ctx, off, mac, ETH_ALEN, 0);
}
Joe from Cilium raised an interesting question about why the compiler
puts ctx_store_bytes() before if (eth_store_daddr(ctx, (__u8 *)
&vtep_mac.addr, 0) < 0).
That call seems to be what changes fp-200=pkt to fp-200=inv, and indeed if I
skip the eth_store_daddr_aligned call, the issue is resolved. Do you
have a clue why the compiler does that?
This is expected. After inlining, you get
if (ctx_store_bytes(...) < 0) ...
So the generated code has to do
ctx_store_bytes(...)
first and then evaluate the if condition on its return value.
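
To illustrate the ordering, here is a minimal user-space sketch, not the
real Cilium code. struct mock_ctx, the body of ctx_store_bytes(), and the
two caller_* functions are hypothetical stand-ins I made up for the
example; the real ctx_store_bytes() wraps a BPF helper and the real
context type is struct __ctx_buff. It only shows that the source-level
"if (eth_store_daddr(...) < 0)" and the inlined "call, then compare"
form are the same thing.

```c
#include <assert.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical stand-in for struct __ctx_buff: a flat packet buffer. */
struct mock_ctx {
        unsigned char data[64];
        int store_calls;        /* how many times the mock helper ran */
};

/* Hypothetical mock; the real ctx_store_bytes() ends up in a BPF helper. */
static int ctx_store_bytes(struct mock_ctx *ctx, int off, const void *from,
                           int len, int flags)
{
        (void)flags;
        assert(ctx && from);
        ctx->store_calls++;
        if (off < 0 || len < 0 || off + len > (int)sizeof(ctx->data))
                return -1;
        memcpy(ctx->data + off, from, (size_t)len);
        return 0;
}

static inline int eth_store_daddr_aligned(struct mock_ctx *ctx,
                                          const unsigned char *mac, int off)
{
        return ctx_store_bytes(ctx, off, mac, ETH_ALEN, 0);
}

static inline int eth_store_daddr(struct mock_ctx *ctx,
                                  const unsigned char *mac, int off)
{
        return eth_store_daddr_aligned(ctx, mac, off);
}

/* What the caller writes in the source. */
static int caller_as_written(struct mock_ctx *ctx, const unsigned char *mac)
{
        if (eth_store_daddr(ctx, mac, 0) < 0)
                return -1;
        return 0;
}

/* What is left once both wrappers are inlined. */
static int caller_after_inlining(struct mock_ctx *ctx, const unsigned char *mac)
{
        int ret = ctx_store_bytes(ctx, 0, mac, ETH_ALEN, 0); /* the call */
        if (ret < 0)                                          /* then the test */
                return -1;
        return 0;
}
```

Both callers invoke the store exactly once and only then test the
result, which matches the "call 9" followed by "if r0 s< 0" sequence in
the dump above.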
Looking at the issue at https://github.com/cilium/cilium/issues/16517,
the cause seems to be xdp_store_bytes/skb_store_bytes.
When these helpers write some data into the stack-based buffer, they
invalidate some stack contents. I don't know whether it is a false
positive or not, i.e., whether the verifier conservatively invalidates
the wrong stack location. This needs further investigation.
I have more follow-up in https://github.com/cilium/cilium/issues/16517
if you are interested to know the full picture.
I would appreciate it very much if you have time to look at it :)
Vincent