Re: This is the fourth time I've tried to find what led to the regression of outgoing network speed and each time I find the merge commit 8c94ccc7cd691472461448f98e2372c75849406c

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 26.02.24 10:24, Mathias Nyman wrote:
> On 26.2.2024 7.45, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 21.02.24 14:44, Mathias Nyman wrote:
>>> On 21.2.2024 1.43, Randy Dunlap wrote:
>>>> On 2/20/24 15:41, Randy Dunlap wrote:
>>>>> {+ tglx]
>>>>> On 2/20/24 15:19, Mikhail Gavrilov wrote:
>>>>>> On Mon, Feb 19, 2024 at 2:41 PM Mikhail Gavrilov
>>>>>> <mikhail.v.gavrilov@xxxxxxxxx> wrote:
>>>>>> I spotted network performance regression and it turned out, this was
>>>>>> due to the network card getting other interrupt. It is a side effect
>>>>>> of commit 57e153dfd0e7a080373fe5853c5609443d97fa5a.
>>>>> That's a merge commit (AFAIK, maybe not so much). The commit in
>>>>> mainline is:
>>>>>
>>>>> commit f977f4c9301c
>>>>> Author: Niklas Neronin <niklas.neronin@xxxxxxxxxxxxxxx>
>>>>> Date:   Fri Dec 1 17:06:40 2023 +0200
>>>>>
>>>>>       xhci: add handler for only one interrupt line
>>>>>
>>>>>> Installing irqbalance daemon did not help. Maybe someone experienced
>>>>>> such a problem?
>>>>>
>>>>> Thomas, would you look at this, please?
>>>>>
>>>>> A network device and xhci (USB) driver are now sharing interrupts.
>>>>> This causes a large performance decrease for the networking device.
>>>
>>> Short recap:
>>
>> Thx for that. As the 6.8 release is merely two or three weeks away while
>> a fix is nowhere near in sight yet (afaics!) I start to wonder if we
>> should consider a revert here and try reapplying the culprit in a later
>> cycle when this problem is fixed.

Thx for the reply.

> I don't think reverting this series is a solution.
> 
> This isn't really about those usb xhci patches.
> This is about which interrupt gets assigned to which CPU.

I know, but from my understanding of Linus expectations wrt to handling
regressions it does not matter much if a bug existed earlier or
somewhere else: what counts is the commit that exposed the problem.

But I might be wrong here. Anyway, not CCing Linus for this; but I'll
likely point him to this direction on Sunday in my next weekly report,
unless some fix comes into sight.

> Mikhail got unlucky when the network adapter interrupts on that system was
> assigned to CPU0, clearly a more "clogged" CPU, thus causing a drop in max
> bandwidth.

But maybe others will be just as "unlucky". Or is there anything to
believe otherwise? Maybe some aspect of the .config or local setup that
is most likely unique to Mikhail's setup?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.




[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux