RX metadata kfuncs cause kernel panic with XDP generic mode

Hey there,

while taking a closer look at how the RX metadata kfuncs are implemented in the mlx5 and ice drivers,
I suspected a bug and, after testing, could in fact produce a NULL pointer dereference.

The mlx5 driver implements RX metadata kfuncs such as bpf_xdp_metadata_rx_vlan_tag by casting the
xdp_md pointer passed as the function argument to an mlx5e_xdp_buff pointer. The cast is needed to
reach the packet metadata; see mlx5e_xdp_rx_vlan_tag for an example. The ice driver works similarly.
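
To illustrate the pattern, here is a minimal userspace sketch of the wrapper-and-cast approach. It is not the driver code: all model_* names and the descriptor layout are hypothetical stand-ins I made up for illustration, and the real structs carry more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the generic buffer that the XDP core hands to programs. */
struct model_xdp_buff {
    void *data;
    void *data_end;
};

/* Stand-in for a hardware completion descriptor (hypothetical layout). */
struct model_cqe {
    int has_vlan;
    uint16_t vlan_tci;
};

/* Driver-private wrapper. The generic buffer is the first member, so a
 * pointer to it is also a valid pointer to the wrapper. */
struct model_drv_xdp_buff {
    struct model_xdp_buff xdp;
    const struct model_cqe *cqe;
};

/* Modeled kfunc. It assumes ctx is embedded in a model_drv_xdp_buff and
 * has no way to verify that assumption. */
static int model_rx_vlan_tag(const struct model_xdp_buff *ctx,
                             uint16_t *vlan_tci)
{
    const struct model_drv_xdp_buff *drv = (const void *)ctx;
    const struct model_cqe *cqe = drv->cqe;

    if (!cqe->has_vlan)
        return -1; /* no VLAN tag in this descriptor */

    *vlan_tci = cqe->vlan_tci;
    return 0;
}

/* Native-mode case: the driver built the full wrapper, so the cast is safe. */
static uint16_t model_native_demo(void)
{
    struct model_cqe cqe = { .has_vlan = 1, .vlan_tci = 100 };
    struct model_drv_xdp_buff buf = { .cqe = &cqe };
    uint16_t tci = 0;

    assert(offsetof(struct model_drv_xdp_buff, xdp) == 0);
    assert(model_rx_vlan_tag(&buf.xdp, &tci) == 0);
    return tci;
}
```

The cast is only valid while every caller passes a buffer that really sits inside the wrapper. If the function is handed a bare model_xdp_buff, the read of drv->cqe goes past the end of the object.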

This is normally fine, because the driver always creates a full mlx5e_xdp_buff struct when allocating
the xdp_buff struct. But when a device-bound XDP program is attached to the mlx5 netdevice in generic mode,
the xdp_buff is allocated not by the mlx5 driver but as part of the do_xdp_generic implementation.

Now, when a packet comes in and the XDP program calls one of these kfuncs, the kfunc implementation
dereferences pointers in the mlx5e_xdp_buff fields that lie beyond the xdp_buff. Those fields were never
allocated or initialized on this path, which leads to a NULL pointer dereference.

Is there a check missing somewhere that should prevent the use of these kfuncs within
do_xdp_generic? Or is there another way to implement the RX metadata kfuncs in the drivers that does not
involve casting the xdp_buff pointer?
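
To make the first option concrete, here is a rough userspace sketch of what an attach-time check could look like. It is only a model under assumptions: the model_* names are hypothetical and not existing kernel symbols, and it assumes the metadata kfuncs are only reachable from device-bound programs, so that refusing those programs in generic mode would be enough.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical attach modes for the model. */
enum model_xdp_mode {
    MODEL_XDP_MODE_GENERIC, /* xdp_buff built by the generic path */
    MODEL_XDP_MODE_NATIVE,  /* xdp_buff built by the driver */
};

/* Hypothetical program descriptor. */
struct model_prog {
    bool dev_bound; /* loaded bound to a specific netdevice */
};

/* Refuse the one combination in which a driver kfunc could be called on
 * a buffer that the driver did not allocate. */
static int model_xdp_attach(const struct model_prog *prog,
                            enum model_xdp_mode mode)
{
    if (prog->dev_bound && mode == MODEL_XDP_MODE_GENERIC)
        return -EINVAL;
    return 0;
}

/* Exercise all four combinations of program type and attach mode. */
static void model_attach_demo(void)
{
    struct model_prog bound = { .dev_bound = true };
    struct model_prog plain = { .dev_bound = false };

    assert(model_xdp_attach(&bound, MODEL_XDP_MODE_GENERIC) == -EINVAL);
    assert(model_xdp_attach(&bound, MODEL_XDP_MODE_NATIVE) == 0);
    assert(model_xdp_attach(&plain, MODEL_XDP_MODE_GENERIC) == 0);
    assert(model_xdp_attach(&plain, MODEL_XDP_MODE_NATIVE) == 0);
}
```

Rejecting at attach time would keep the per-packet path free of extra checks, but whether that is the right place for it is exactly the question above.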

Here is how this can be reproduced:


eBPF program:

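#include <linux/bpf.h>        /* struct xdp_md, XDP_* return codes, __be16/__u16 */
#include <bpf/bpf_helpers.h>  /* SEC(), __ksym */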

extern int bpf_xdp_metadata_rx_vlan_tag(
    const struct xdp_md *ctx, __be16 *vlan_proto, __u16 *vlan_tci) __ksym;

SEC("xdp")
int ingress(struct xdp_md *ctx) {
  __be16 vlan_proto;
  __u16 vlan_tci;
  if (bpf_xdp_metadata_rx_vlan_tag(ctx, &vlan_proto, &vlan_tci) != 0) {
    return XDP_ABORTED;
  }

  return XDP_DROP;
}

char _license[] SEC("license") = "GPL";


Load and attach it as a device-bound program to an mlx5 NIC in XDP-generic mode:

# bpftool prog load crash.o /sys/fs/bpf/crash xdpmeta_dev mlx5-conx5-1
# bpftool net attach xdpgeneric pinned /sys/fs/bpf/crash dev mlx5-conx5-1

Then make sure a packet is coming in on that NIC port so the XDP program gets called:

# ping -I mlx5-conx5-2 1.1.1.1

In my testing environment, mlx5-conx5-2 and mlx5-conx5-1 are directly connected.

Kernel output:

Unable to handle kernel NULL pointer dereference at virtual address 000000000000001d
Mem abort info:
  ESR = 0x0000000096000004
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x04: level 0 translation fault
Data abort info:
  ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
  CM = 0, WnR = 0, TnD = 0, TagAccess = 0
  GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=000008035e557000
[000000000000001d] pgd=0000000000000000, p4d=0000000000000000

This was reproduced with Linux 6.12 mainline (adc2186).

--
Best regards,
Marcus Wichelmann
Linux Networking Specialist
Team SDN

______________________________

Hetzner Cloud GmbH
Feringastraße 12A
85774 Unterföhring
Germany

Phone: +49 89 381690 150
E-Mail: marcus.wichelmann@xxxxxxxxxxxxxxxx

Handelsregister München HRB 226782
Geschäftsführer: Sebastian Färber, Markus Kalmuk

------------------
For information on the processing of your personal
data in the context of this contact, please see
https://hetzner-cloud.de/datenschutz
------------------




