Please let me know if there is a better forum for this type of question
or if there is someone in particular I should address this to.

If this is a known issue, I would not mind taking a stab at fixing it
and would appreciate pointers on what needs to be done.

Yours sincerely,
Josh

On Mon, Dec 4, 2023 at 3:41 PM Joshua Coutinho <souichirosano@xxxxxxxxx> wrote:
>
> Hi All,
>
> I'm trying to transmit udp packets via an xsk socket. (XDP receives
> work just fine). The xdp program is irrelevant/unused, I'm trying to
> simply leverage the xsk socket write.
>
> (Kernel: Linux fedora 6.5.12-300.fc39.x86_64)
> (OS: "Fedora Linux 39 (Workstation Edition)")
>
> I want a minimal working example of sending packets via an XSK socket
> over loopback in user space land. I want to be able to fill in the
> required memory regions and trigger the kernel to send the packet and
> capture the sent packets on the other side with 'nc -lu 127.0.0.1
> <port>'. This seems to happen partly successfully but on the ingress
> part of the loopback it is dropped somewhere after reaching the kernel
> function ip_rcv and then nf_hook_slow. Specifically, I simply want to
> write a packet into a UMEM region, fill in the TX descriptor and then
> submit that descriptor like so.
>
>     u32 txIdx = -1;
>     const u32 txSlotsRecvd = xsk_ring_prod__reserve(&qs.txQ, 1, &txIdx);
>     u32 addr = umem.txState.nextSlot();
>
>     xdp_desc* txDescr = xsk_ring_prod__tx_desc(&qs.txQ, txIdx);
>     txDescr->addr = addr;
>     txDescr->len = sizeof(OrderFrame);
>     txDescr->options = 0;
>
>     u8* outputBuf = umem.buffer + addr;
>
>     TimeNs submitTime = currentTimeNs();
>     OrderFrame& frame = *reinterpret_cast<OrderFrame *>(outputBuf);
>
>     std::array<u8, ETH_ALEN> sourceMac = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0};
>     std::array<u8, ETH_ALEN> destMac = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0};
>     std::copy(sourceMac.begin(), sourceMac.end(), frame.eth.h_source);
>     std::copy(destMac.begin(), destMac.end(), frame.eth.h_dest);
>
>     frame.eth.h_proto = htons(ETH_P_IP);
>     frame.ip.ihl = 5;
>     frame.ip.version = 4;
>     frame.ip.tos = 0;
>     frame.ip.tot_len = htons(sizeof(OrderFrame) - sizeof(ethhdr));
>     frame.ip.id = orderId;
>
>     frame.ip.frag_off = 0x0;
>     frame.ip.ttl = static_cast<u8>(255);
>     frame.ip.protocol = 17;
>     frame.ip.check = 0;
>     constexpr u8 sourceIPBytes[4] = {127, 0, 0, 1};
>     constexpr u8 destIPBytes[4] = {127, 0, 0, 1};
>     const u32 sourceIP = *reinterpret_cast<const u32*>(sourceIPBytes);
>     const u32 destIP = *reinterpret_cast<const u32*>(destIPBytes);
>     frame.ip.saddr = sourceIP;
>     frame.ip.daddr = destIP;
>     const u8* dataptr = reinterpret_cast<u8 *>(&frame.ip);
>     const u16 kernelcsum = ip_fast_csum(dataptr, frame.ip.ihl);
>     frame.ip.check = kernelcsum;
>     constexpr int udpPacketSz = sizeof(OrderFrame) -
>         sizeof(ethhdr) - sizeof(iphdr);
>     frame.udp.len = htons(udpPacketSz);
>     frame.udp.check = 0;
>     frame.udp.dest = htons(OE_PORT);
>     frame.udp.source = htons(1234);
>     ...
>     // application packet logic
>     frame.udp.check = 0;
>
>     xsk_ring_prod__submit(&qs.txQ, 1);
>     if (xsk_ring_prod__needs_wakeup(&qs.txQ)) {
>         assert((socket.cfg.bind_flags & XDP_COPY) != 0);
>         const ssize_t ret = sendto(socket.xskFD, nullptr, 0,
>                                    MSG_DONTWAIT, nullptr, 0);
>     }
>
> This is a relevant stacktrace from the kernel indicating the path of
> my packet after the above sendto is called.
>
>     __netif_receive_skb_one_core+0x3c/0xa0
>     process_backlog+0x85/0x120
>     __napi_poll+0x28/0x1b0
>     net_rx_action+0x2a4/0x380
>     __do_softirq+0xd1/0x2c8
>     do_softirq.part.0+0x3d/0x60
>     __local_bh_enable_ip+0x68/0x70
>     __dev_direct_xmit+0x152/0x210
>     __xsk_generic_xmit+0x3e4/0x710
>     xsk_sendmsg+0x12f/0x1f0
>     __sys_sendto+0x1d6/0x1e0
>     __x64_sys_sendto+0x24/0x30
>     do_syscall_64+0x5d/0x90
>     entry_SYSCALL_64_after_hwframe+0x6e/0xd8
>
> My socket is bound to localhost using xdpgeneric. I see the
> transmitted packets in tcpdump, and via bpftrace I see that ip_rcv is
> invoked for the packets. nf_hook_slow is also invoked with 1 active
> prerouting hook. On kfree_skb I see the reason for the drop is 'reason
> not specified'. Examining the packet in tcpdump I see no errors with
> the checksums, packet lengths or ports. Listeners for the
> corresponding udp ports never receive the packets.
>
> This is how I create my socket:
>
>     cfg.rx_size = XSKQueues::NUM_READ_DESC;
>     cfg.tx_size = XSKQueues::NUM_WRITE_DESC;
>     cfg.libxdp_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD;
>     cfg.xdp_flags = XDP_FLAGS_SKB_MODE;
>     cfg.bind_flags = XDP_USE_NEED_WAKEUP | XDP_COPY;
>
>     if (xsk_socket__create(&socket, iface.c_str(), QUEUE, umem.umem,
>                            &qs.rxQ, &qs.txQ, &cfg)) {
>         perror("XSK: ");
>         exit(EXIT_FAILURE);
>     }
>
> What could be the issue?
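
For anyone trying to reproduce this: the IPv4 header checksum can be ruled out independently of the kernel path. The quoted code calls ip_fast_csum, which is a kernel helper, and the post does not show how it is made available in user space. The sketch below is therefore not the code used above but a stand-alone RFC 1071 checksum that operates on the raw 20 header bytes. The names internet_checksum, fill_ipv4_checksum and ipv4_checksum_ok are hypothetical, and it assumes a header without options (ihl == 5), as in the quoted code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// RFC 1071 Internet checksum over `len` bytes, reading the data as
// big-endian 16-bit words. The result is returned in host order and
// must be stored into the packet high byte first.
// (Hypothetical helper name; not part of libxdp or the kernel headers.)
inline std::uint16_t internet_checksum(const std::uint8_t* data, std::size_t len) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i + 1 < len; i += 2) {
        sum += (static_cast<std::uint32_t>(data[i]) << 8) | data[i + 1];
    }
    if (len % 2 != 0) {
        // An odd trailing byte is padded with a zero low byte.
        sum += static_cast<std::uint32_t>(data[len - 1]) << 8;
    }
    // Fold the carries back into the low 16 bits.
    while (sum >> 16) {
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    return static_cast<std::uint16_t>(~sum & 0xFFFF);
}

// Zeroes and then fills the checksum field (bytes 10 and 11) of a
// 20-byte IPv4 header. Assumes ihl == 5, i.e. no IP options.
inline void fill_ipv4_checksum(std::array<std::uint8_t, 20>& hdr) {
    assert((hdr[0] & 0x0F) == 5);
    hdr[10] = 0;
    hdr[11] = 0;
    const std::uint16_t csum = internet_checksum(hdr.data(), hdr.size());
    hdr[10] = static_cast<std::uint8_t>(csum >> 8);
    hdr[11] = static_cast<std::uint8_t>(csum & 0xFF);
}

// A header whose checksum field is correct sums to zero.
inline bool ipv4_checksum_ok(const std::array<std::uint8_t, 20>& hdr) {
    return internet_checksum(hdr.data(), hdr.size()) == 0;
}
```

Copying the 20 bytes of frame.ip out of the UMEM slot just before xsk_ring_prod__submit and passing them to ipv4_checksum_ok gives a check that does not depend on tcpdump's interpretation. Working on bytes rather than on the iphdr bitfields keeps the helper independent of host endianness and alignment.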