On Wed, Jul 17, 2019 at 3:36 AM Ilya Leoshkevich <iii@xxxxxxxxxxxxx> wrote: > > > Am 17.07.2019 um 11:21 schrieb Ilya Leoshkevich <iii@xxxxxxxxxxxxx>: > > > >> Am 17.07.2019 um 07:11 schrieb Y Song <ys114321@xxxxxxxxx>: > >> > >> [sorry, resend again as previous one has come text messed out due to > >> networking issues] > >> > >> On Tue, Jul 16, 2019 at 10:08 PM Y Song <ys114321@xxxxxxxxx> wrote: > >>> > >>> On Tue, Jul 16, 2019 at 4:59 AM Ilya Leoshkevich <iii@xxxxxxxxxxxxx> wrote: > >>>> > >>>> test_pkt_md_access is failing on s390, since the associated eBPF prog > >>>> returns TC_ACT_SHOT, which in turn happens because loading a part of a > >>>> struct __sk_buff field produces an incorrect result. > >>>> > >>>> The problem is that when verifier emits the code to replace partial load > >>>> of a field with a full load, a shift and a bitwise AND, it assumes that > >>>> the machine is little endian. > >>>> > >>>> Adjust shift count calculation to account for endianness. > >>>> > >>>> Fixes: 31fd85816dbe ("bpf: permits narrower load from bpf program context fields") > >>>> Signed-off-by: Ilya Leoshkevich <iii@xxxxxxxxxxxxx> > >>>> --- > >>>> kernel/bpf/verifier.c | 8 ++++++-- > >>>> 1 file changed, 6 insertions(+), 2 deletions(-) > >>>> > >>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > >>>> index 5900cbb966b1..3f9353653558 100644 > >>>> --- a/kernel/bpf/verifier.c > >>>> +++ b/kernel/bpf/verifier.c > >>>> @@ -8616,8 +8616,12 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) > >>>> } > >>>> > >>>> if (is_narrower_load && size < target_size) { > >>>> - u8 shift = (off & (size_default - 1)) * 8; > >>>> - > >>>> + u8 load_off = off & (size_default - 1); > >>>> +#ifdef __LITTLE_ENDIAN > >>>> + u8 shift = load_off * 8; > >>>> +#else > >>>> + u8 shift = (size_default - (load_off + size)) * 8; > >>>> +#endif > >>> > >> All the values are in register. The shifting operations should be the > >> same for big endian and little endian, e.g., value 64 >> 2 = 16 when > >> value "64" is in register. So I did not see a problem here. > >> > >> Could you elaborate which field access in test_pkt_md_access > >> caused problem? > > > > The very first one: TEST_FIELD(__u8, len, 0xFF); > > > >> It would be good if you can give detailed memory layout and register values > >> to illustrate the problem. > > > > Suppose len = 0x11223344. On a big endian system, this would be > > > > 11 22 33 44 > > > > Now, we would like to do *(u8 *)&len, the desired result is 0x11. > > Verifier should emit the following: ((*(u32 *)&len) >> 24) & 0xff, but as > > of today it misses the shift. > > > > On a little endian system the layout is: > > > > 44 33 22 11 > > > > and the desired result is different - 0x44. Verifier correctly emits > > (*(u32 *)&len) & 0xff. > > I’ve just realized, that this example does not reflect what the test is > doing on big-endian systems (there is an #ifdef for those). > > Here is a better one: len=0x11223344 and we would like to do > ((u8 *)&len)[3]. > > len is represented as `11 22 33 44` in memory, so the desired result is > 0x44. It can be obtained by doing (*(u32 *)&len) & 0xff, but today the > verifier does ((*(u32 *)&len) >> 24) & 0xff instead. What you described above for the memory layout all makes sense. The root cause is for big endian, we should do *((u8 *)&len + 3). This is exactly what macros in test_pkt_md_access.c tries to do. if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ #define TEST_FIELD(TYPE, FIELD, MASK) \ { \ TYPE tmp = *(volatile TYPE *)&skb->FIELD; \ if (tmp != ((*(volatile __u32 *)&skb->FIELD) & MASK)) \ return TC_ACT_SHOT; \ } #else #define TEST_FIELD_OFFSET(a, b) ((sizeof(a) - sizeof(b)) / sizeof(b)) #define TEST_FIELD(TYPE, FIELD, MASK) \ { \ TYPE tmp = *((volatile TYPE *)&skb->FIELD + \ TEST_FIELD_OFFSET(skb->FIELD, TYPE)); \ if (tmp != ((*(volatile __u32 *)&skb->FIELD) & MASK)) \ return TC_ACT_SHOT; \ } #endif Could you check whether your __BYTE_ORDER__ is set correctly or not for this case? You may need to tweak Makefile if you are doing cross compilation, I am not sure how as I did not have environment.