Re: [PATCH 0/2] bugfix for ipoib

On Mon, Nov 20, 2023 at 09:34:59PM +0100, Jack Wang wrote:
> We often run into queue timeouts with a call trace such as:
> NETDEV WATCHDOG: ib0.beef (): transmit queue 26 timed out
> Call Trace:
> call_timer_fn+0x27/0x100
> __run_timers.part.0+0x1be/0x230
> ? mlx5_cq_tasklet_cb+0x6d/0x140 [mlx5_core]
> run_timer_softirq+0x26/0x50
> __do_softirq+0xbc/0x26d
> asm_call_irq_on_stack+0xf/0x20
> ib0.beef: transmit timeout: latency 10 msecs
> ib0.beef: queue stopped 0, tx_head 0, tx_tail 0, global_tx_head 0, global_tx_tail 0
> 
> The last two messages repeated for days.

You shouldn't get tx timeouts and fully stuck queues like that; it
suggests something else is very wrong in that system.

> After cross-checking with Mellanox OFED, I noticed some bugfixes are missing
> upstream, hence I take the liberty of sending them out.

Recovery is recovery; it is just RAS.

Jason



