Hi Jon,

On Wed, 29 Jan 2025 14:51:35 +0000, Jon Hunter <jonathanh@xxxxxxxxxx> wrote:
> Hi Furong,
>
> On 27/01/2025 13:28, Thierry Reding wrote:
> > On Sat, Jan 25, 2025 at 11:03:47PM +0800, Furong Xu wrote:
> >> Hi Thierry
> >>
> >> On Sat, 25 Jan 2025 12:20:38 +0200, Ido Schimmel wrote:
> >>
> >>> On Fri, Jan 24, 2025 at 10:42:56AM +0800, Furong Xu wrote:
> >>>> On Thu, 23 Jan 2025 22:48:42 +0100, Andrew Lunn <andrew@xxxxxxx>
> >>>> wrote:
> >>>>>> Just to clarify, the patch that you had us try was not intended
> >>>>>> as an actual fix, correct? It was only for diagnostic purposes,
> >>>>>> i.e. to see if there is some kind of cache coherence issue,
> >>>>>> which seems to be the case? So perhaps the only fix needed is
> >>>>>> to add dma-coherent to our device tree?
> >>>>>
> >>>>> That sounds quite error prone. How many other DT blobs are
> >>>>> missing the property? If the memory should be coherent, I would
> >>>>> expect the driver to allocate coherent memory. Or the driver
> >>>>> needs to handle non-coherent memory and add the necessary
> >>>>> flushes/invalidates etc.
> >>>>
> >>>> The stmmac driver does the necessary cache flushes/invalidates to
> >>>> maintain cache lines explicitly.
> >>>
> >>> Given that the problem happens when the kernel performs syncing, is
> >>> it possible that there is a problem with how the syncing is
> >>> performed?
> >>>
> >>> I am not familiar with this driver, but it seems to allocate multiple
> >>> buffers per packet when split header is enabled, and these buffers
> >>> are allocated from the same page pool (see stmmac_init_rx_buffers()).
> >>> Despite that, the driver creates the page pool with a non-zero
> >>> offset (see __alloc_dma_rx_desc_resources()) to avoid syncing the
> >>> headroom, which is only present in the head buffer.
> >>>
> >>> I asked Thierry to test the following patch [1] and initial testing
> >>> seems OK. He also confirmed that "SPH feature enabled" shows up in
> >>> the kernel log.
> >>
> >> It is recommended to disable the SPH feature by default unless
> >> certain use cases depend on it. As Ido said, two large buffers are
> >> allocated from the same page pool for each packet, which wastes a
> >> lot of memory and hurts performance in most common cases.
> >>
> >> Our downstream driver and two mainline drivers disable SPH by default:
> >> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/drivers/net/ethernet/stmicro/stmmac/dwmac-dwc-qos-eth.c#n357
> >> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c#n471
> >
> > Okay, that's something we can look into changing. What would be an
> > example of a use case that depends on SPH? Also, isn't this something
> > that should be a policy that users can configure?
> >
> > Irrespective of that, we should fix the problems we are seeing with
> > SPH enabled.
>
> Any update on this?

Sorry for the late response, I was away for the Chinese New Year holiday.

The fix has been sent, and it would be great to have your Tested-by: tag
on it:
https://lore.kernel.org/all/20250207085639.13580-1-0x1207@xxxxxxxxx/
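
For reference, a rough sketch (not the actual patch, untested) of how a
glue driver can opt out of SPH by default, in the style of the
dwmac-dwc-qos-eth and dwmac-intel examples linked above. The probe
function name here is made up, and the STMMAC_FLAG_SPH_DISABLE flag
assumes a recent net-next tree:

    #include <linux/platform_device.h>
    #include <linux/stmmac.h>

    /* Hypothetical glue-driver probe callback for illustration only. */
    static int my_eqos_probe(struct platform_device *pdev,
                             struct plat_stmmacenet_data *plat_dat,
                             struct stmmac_resources *res)
    {
            /*
             * Keep SPH off unless a use case explicitly needs it: with
             * SPH enabled, every received packet consumes two buffers
             * from the same page pool, which wastes memory and hurts
             * performance in the common case.
             */
            plat_dat->flags |= STMMAC_FLAG_SPH_DISABLE;

            return 0;
    }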