RE: [PATCH v2] ravb: On timeout disable IRQs to stop processing

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello Sergei-san,

> From: Sergei Shtylyov, Sent: Monday, May 18, 2020 5:45 PM
> 
> On 18.05.2020 7:54, Dirk Behme wrote:
> 
> > Analyzing [1] it seems there is a race condition where ravb_start_xmit()
>  > can be called from interrupt while tx skbuffs are being torn down in
>  > the scheduled timeout handling. The actual timeout work is done in
>  > ravb_tx_timeout_work() during which the tx skbuffs are torn down via
>  > invocations of ravb_ring_free(). But there seems to be no flag to tell
>  > the driver it is shutting down so it continues to use the ring buffer
>  > when it should not.
>  >
>  > Fix this by disabling the interrupts in the timeout handler.
> 
>     Hm, given that we stop all TX queues prior to tearing down the buffers,
> it is somewhat strange that you see the driver's send path called...
>     But disabling the interrupts seems the Right Thing anyways...

I didn't reproduce this issue and I didn't know the root cause yet.
But, I'm feeling to strange. Especially, "ravb_start_xmit() can be called from interrupt".
I didn't find where ravb_start_xmit() can be called from interrupt for now.

By the way, I'm thinking the following message is related to the issue.
> > ravb e6800000.ethernet ethernet: failed to switch device to config mode

The ravb_config() output the message if failed. And, ravb_tx_timeout_work()
calls ravb_config() via ravb_stop_dma() and ravb_dmac_init().
---
ravb_tx_timeout_work() {
	ravb_stop_dma()		// call ravb_config()

	ravb_ring_free()	// *1
<snip>
	ravb_dmac_init()	// call ravb_config()
<snip>
}

ravb_dmac_init()
{
<snip>
	error = ravb_config()	// "failed to switch device to config modes" here and -ETIMEDOUT
	if (error)
		return error	// *2
	ravb_ring_init()	// *3
<snip>
}
--

If ravb_config() failed at the *2, since ravb_ring_init() was not called,
any descriptors were not allocated and then the issue should happen.
Note that according to the datasheet, the controller cannot change the
mode from "Operation" to Configuration" when the controller is doing
TX or RX.

I don't know why the first ravb_config() doesn't fail for now.
But, my scenario is one of functions below enables the TX and RX
by calling ravb_rcv_snd_enable():
 - ravb_emac_interrupt_unlocked() ... if ESCR_LCHNG && !no_avb_link && PSR_LMON
 - ravb_adjust_link() ... if no_avb_link && phydev->link
 - ravb_set_rx_csum() ... if enabling NETIF_F_RXCSUM

ravb_tx_timeout_work() calls ravb_stop_dma(). And, ravb_stop_dma() calls
ravb_rcv_snd_disable(). So, we assumed ravb_tx_timeout_work() didn't transfer
anymore. However, if one of the above functions was called, this is possible
to enable TX and RX.

If this scenario is true, I'm thinking we can fix the issue by using
spin_{un}lock_irq{restore,save}() between ravb_stop_dma() to ravb_dmac_init().
# ravb_ptp_init() calls spin_lock API so that we should not spin lock ravb_ptp_{init,stop}().

What do you think?

I'll try to reproduce this issue on my environment again...
# I'm thinking ravb_adjust_link() is possible to cause this issue.
# But, Salvator-X doesn't have renesas,no-ether-link so that I'll try to add it.

Best regards,
Yoshihiro Shimoda





[Index of Archives]     [Linux Samsung SOC]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Device Mapper]

  Powered by Linux