Re: [BUG?] loxilb tc BPF program cause Loongarch kernel hard lockup

On Mon, Mar 10, 2025 at 2:43 PM Cong Wang <xiyou.wangcong@xxxxxxxxx> wrote:
>
> On Thu, Mar 06, 2025 at 08:04:14PM -0800, Vincent Li wrote:
> > Sorry, I had a typo in the loongarch mailing list address; corrected it.
> >
> > On Thu, Mar 6, 2025 at 1:44 PM Cong Wang <xiyou.wangcong@xxxxxxxxx> wrote:
> > >
> > > On Wed, Mar 05, 2025 at 04:51:15PM -0800, Vincent Li wrote:
> > > > Hi,
> > > >
> > > > I have an issue recorded here [0], with a kernel call trace. When I
> > > > start loxilb, the loxilb tc BPF program seems to be loaded and attached
> > > > to the network interface, but it immediately causes a LoongArch kernel
> > > > hard lockup with no keyboard response. Sometimes the panic call trace
> > > > shows up on the monitor after I disable the kernel panic reboot
> > > > (echo 0 > /proc/sys/kernel/panic) and start loxilb.
> > > >
> > > > Background: I ported the open source IPFire [1] to the LoongArch CPU
> > > > architecture, enabled the kernel BPF features, and added loxilb as an
> > > > LFS (Linux From Scratch) addon. loxilb 0.9.8.3 has libbpf 1.5.0,
> > > > which has LoongArch support [2]. The same loxilb addon runs fine on
> > > > the x86 architecture. Any clue on this?
> > > >
> > > > [0]: https://github.com/vincentmli/BPFire/issues/76
> > > > [1]: https://github.com/ipfire/ipfire-2.x
> > > > [2]: https://github.com/loxilb-io/loxilb/issues/972
> > > >
> > >
> > > Thanks for your report!
> > >
> > > I have extracted the kernel crash log from your photo with AI so that
> > > people can easily interpret it.
> > >
> >
> > Nice to know AI could do that :)
> >
> > > From a quick glance, it seems related to MIPS JIT. So it would be
> > > helpful if you could locate the eBPF program which triggered this and
> > > dump its JIT'ed BPF instructions.
> > >
> >
> > This is a call trace from a LoongArch CPU, so it is related to the
> > LoongArch BPF JIT. The kernel seems to lock up immediately after the
> > program is attached to the network interface. To dump the JIT'ed BPF
> > instructions, could I just load the BPF program without attaching it
> > and then dump the instructions?
>
> Yes!
>
> You can load the eBPF program which triggered the crash manually without
> attaching it, using commands similar to the following:
>
> # Load the program without attaching it
> sudo bpftool prog load hello.o /sys/fs/bpf/hello
>
> # List programs to find its ID
> sudo bpftool prog list
>
> # Dump JIT instructions (replace 123 with your actual program ID)
> sudo bpftool prog dump jited id 123
>
>
> Thanks.

Got it. With the help of the LoxiLB maintainer, I am now able to load the
tc BPF program without attaching it to the interface when the loxilb
process starts. The kernel call trace shows the BPF program
tc_packet_func, so I assume that is the program causing the lockup. The
JIT'ed dump of tc_packet_func is at [0]; it has 13064 lines, so I am not
sure how helpful it will be.

[0] https://github.com/user-attachments/files/19171378/tc_packet_func-jited.txt
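
A dump of this size is easier to read when narrowed to the instructions
around a single offset. The helper below is only a rough sketch and not
something taken from the thread: the function name jited_context is made up,
the offset in the usage comment is a placeholder, and it assumes that each
instruction line in the saved dump starts with a hex offset followed by a
colon (the way bpftool normally prints JIT'ed code) and that grep supports
the -A and -B options.

```shell
# jited_context FILE OFFSET [CONTEXT]
# Print the instruction at a given hex offset in a saved
# "bpftool prog dump jited" listing, plus CONTEXT lines on each side.
# Assumption: instruction lines look like "<hex offset>:<tab><instruction>".
jited_context() {
    dump_file=$1
    offset=${2#0x}      # accept the offset with or without a 0x prefix
    context=${3:-5}     # default to 5 lines before and after

    # Anchor at line start so that offset "8" does not also match "18:".
    # grep exits non-zero when the offset is absent, and so does the function.
    grep -i -E -B "$context" -A "$context" \
        "^[[:space:]]*${offset}:" "$dump_file"
}

# Hypothetical usage (0x1a2c is a placeholder, not a value from the trace):
#   jited_context tc_packet_func-jited.txt 0x1a2c 8
```

If the call trace reports an address as an offset into tc_packet_func, that
offset would be the natural value to pass as the second argument.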




