On Wed, Jan 12, 2022 at 11:03 PM <menglong8.dong@xxxxxxxxx> wrote:
>
> From: Menglong Dong <imagedong@xxxxxxxxxxx>
>
> The description of 'dst_port' in 'struct bpf_sock' is not accurated.
> In fact, 'dst_port' is not in network byte order, it is 'partly' in
> network byte order.
>
> We can see it in bpf_sock_convert_ctx_access():
>
> > case offsetof(struct bpf_sock, dst_port):
> >         *insn++ = BPF_LDX_MEM(
> >                 BPF_FIELD_SIZEOF(struct sock_common, skc_dport),
> >                 si->dst_reg, si->src_reg,
> >                 bpf_target_off(struct sock_common, skc_dport,
> >                                sizeof_field(struct sock_common,
> >                                             skc_dport),
> >                                target_size));
>
> It simply passes 'sock_common->skc_dport' to 'bpf_sock->dst_port',
> which makes that the low 16-bits of 'dst_port' is equal to 'skc_port'
> and is in network byte order, but the high 16-bites of 'dst_port' is
> 0. And the actual port is 'bpf_ntohs((__u16)dst_port)', and
> 'bpf_ntohl(dst_port)' is totally not the right port.
>
> This is different form 'remote_port' in 'struct bpf_sock_ops' or
> 'struct __sk_buff':
>
> > case offsetof(struct __sk_buff, remote_port):
> >         BUILD_BUG_ON(sizeof_field(struct sock_common, skc_dport) != 2);
> >
> >         *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff, sk),
> >                               si->dst_reg, si->src_reg,
> >                               offsetof(struct sk_buff, sk));
> >         *insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->dst_reg,
> >                               bpf_target_off(struct sock_common,
> >                                              skc_dport,
> >                                              2, target_size));
> > #ifndef __BIG_ENDIAN_BITFIELD
> >         *insn++ = BPF_ALU32_IMM(BPF_LSH, si->dst_reg, 16);
> > #endif
>
> We can see that it will left move 16-bits in little endian, which makes
> the whole 'remote_port' is in network byte order, and the actual port
> is bpf_ntohl(remote_port).
>
> Note this in the document of 'dst_port'. ( Maybe this should be unified
> in the code? )

Looks like
  __sk_buff->remote_port
  bpf_sock_ops->remote_port
  sk_msg_md->remote_port
are doing the right thing,
but bpf_sock->dst_port is not correct?

I think it's better to fix it, but the fix probably needs to be consolidated with the narrow-access handling in convert_ctx_accesses(). I suspect that reading a u8 from any of the three flavors of 'remote_port' won't be correct, whereas 'dst_port' works with a narrow load but gets the endianness wrong.
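
To make the difference concrete, here is a minimal userspace sketch (not kernel code) of the two layouts described in the quoted patch. The helper names sim_dst_port(), sim_remote_port(), port_from_dst_port() and port_from_remote_port() are hypothetical and exist only for illustration. The sketch also assumes a GCC/Clang-style __BYTE_ORDER__ macro as a stand-in for the kernel's __BIG_ENDIAN_BITFIELD check.

```c
#include <stdint.h>
#include <arpa/inet.h>  /* htons(), ntohs(), ntohl() */

/*
 * Hypothetical model of the 'dst_port' load: skc_dport (16 bits, network
 * byte order) is read as-is and zero-extended, so the upper 16 bits are 0.
 */
static uint32_t sim_dst_port(uint16_t port_host)
{
	uint16_t skc_dport = htons(port_host);

	return (uint32_t)skc_dport;
}

/*
 * Hypothetical model of the 'remote_port' load: the same 16-bit read,
 * followed by a 16-bit left shift on little-endian hosts only.
 * Assumption: __BYTE_ORDER__ replaces the kernel's __BIG_ENDIAN_BITFIELD.
 */
static uint32_t sim_remote_port(uint16_t port_host)
{
	uint32_t v = htons(port_host);

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	v <<= 16;
#endif
	return v;
}

/* Conversion the patch description gives for 'dst_port'. */
static uint16_t port_from_dst_port(uint32_t dst_port)
{
	return ntohs((uint16_t)dst_port);
}

/* Conversion the patch description gives for 'remote_port'. */
static uint32_t port_from_remote_port(uint32_t remote_port)
{
	return ntohl(remote_port);
}
```

With port 8080 (0x1f90) on a little-endian host, sim_dst_port() yields 0x0000901f and sim_remote_port() yields 0x901f0000. Applying ntohl() to the former gives 0x1f900000 rather than 8080, which matches the mismatch described above.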