On Fri, Jun 30, 2023 at 3:02 PM Thorsten Leemhuis
<regressions@xxxxxxxxxxxxx> wrote:
>
> On 27.06.23 00:34, Nick Hastings wrote:
> > * Linux regression tracking (Thorsten Leemhuis) <regressions@xxxxxxxxxxxxx> [230626 21:09]:
> >> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> >> for once, to make this easily accessible to everyone.
> >>
> >> Nick, what's the status/was there any progress? Did you do what Mario
> >> suggested and file a nouveau bug?
> >
> > It was not apparent that the suggestion to open "a Nouveau drm bug" was
> > addressed to me.
>
> I wish things were easier for reporters, but from what I can see this
> is the only way forward if you or some silent bystander cares.
>
> >> I ask, as I still have this on my list of regressions and it seems there
> >> was no progress in three+ weeks now.
> >
> > I have not pursued this further since as far as I could tell I already
> > provided all requested information and I don't actually use nouveau, so
> > I blacklisted it.
>
> I doubt any developer cares enough to take a closer look[1] without a
> proper nouveau bug and some help & prodding from someone affected. And
> looks to me like reverting the culprit now might create even bigger
> problems for users.
>
> Hence I guess then this won't be fixed in the end. In an ideal world this
> would not happen, but we don't live in one and all have just 24 hours in
> a day. :-/
>

We recently merged this commit:
https://gitlab.freedesktop.org/drm/nouveau/-/commit/11d24327c2d7ad7f24fcc44fb00e1fa91ebf6525

It might resolve the problem. It's worth testing at least. I can't
remember whether this was a hybrid AMD/Nvidia system, but I think it was?

> Nevertheless: thx for your report and your help through this thread.
>
> [1] some points on the following page kinda explain this
> https://linux-regtracking.leemhuis.info/post/frequent-reasons-why-linux-kernel-bug-reports-are-ignored/
>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.
>
> #regzbot inconclusive: reporting deadlock (see thread for details)
>
> >
> >> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> >> --
> >> Everything you wanna know about Linux kernel regression tracking:
> >> https://linux-regtracking.leemhuis.info/about/#tldr
> >> If I did something stupid, please tell me, as explained on that page.
> >>
> >> #regzbot backburner: slow progress, likely just affects one machine
> >> #regzbot poke
> >>
> >>
> >> On 02.06.23 02:57, Limonciello, Mario wrote:
> >>> [AMD Official Use Only - General]
> >>>
> >>>> -----Original Message-----
> >>>> From: Nick Hastings <nicholaschastings@xxxxxxxxx>
> >>>> Sent: Thursday, June 1, 2023 7:02 PM
> >>>> To: Karol Herbst <kherbst@xxxxxxxxxx>
> >>>> Cc: Limonciello, Mario <Mario.Limonciello@xxxxxxx>; Lyude Paul
> >>>> <lyude@xxxxxxxxxx>; Lukas Wunner <lukas@xxxxxxxxx>; Salvatore
> >>>> Bonaccorso <carnil@xxxxxxxxxx>; 1036530@xxxxxxxxxxxxxxx; Rafael J.
> >>>> Wysocki <rafael@xxxxxxxxxx>; Len Brown <lenb@xxxxxxxxxx>;
> >>>> linux-acpi@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> >>>> regressions@xxxxxxxxxxxxxxx
> >>>> Subject: Re: Regression from "ACPI: OSI: Remove Linux-Dell-Video _OSI
> >>>> string"? (was: Re: Bug#1036530: linux-signed-amd64: Hard lock up of system)
> >>>>
> >>>> Hi,
> >>>>
> >>>> * Karol Herbst <kherbst@xxxxxxxxxx> [230602 03:10]:
> >>>>> On Thu, Jun 1, 2023 at 7:21 PM Limonciello, Mario
> >>>>> <Mario.Limonciello@xxxxxxx> wrote:
> >>>>>>> -----Original Message-----
> >>>>>>> From: Karol Herbst <kherbst@xxxxxxxxxx>
> >>>>>>> Sent: Thursday, June 1, 2023 12:19 PM
> >>>>>>> To: Limonciello, Mario <Mario.Limonciello@xxxxxxx>
> >>>>>>> Cc: Nick Hastings <nicholaschastings@xxxxxxxxx>; Lyude Paul
> >>>>>>> <lyude@xxxxxxxxxx>; Lukas Wunner <lukas@xxxxxxxxx>; Salvatore
> >>>>>>> Bonaccorso <carnil@xxxxxxxxxx>; 1036530@xxxxxxxxxxxxxxx; Rafael J.
> >>>>>>> Wysocki <rafael@xxxxxxxxxx>; Len Brown <lenb@xxxxxxxxxx>;
> >>>>>>> linux-acpi@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> >>>>>>> regressions@xxxxxxxxxxxxxxx
> >>>>>>> Subject: Re: Regression from "ACPI: OSI: Remove Linux-Dell-Video _OSI
> >>>>>>> string"? (was: Re: Bug#1036530: linux-signed-amd64: Hard lock up of system)
> >>>>>>>
> >>>>>>> On Thu, Jun 1, 2023 at 6:54 PM Limonciello, Mario
> >>>>>>> <Mario.Limonciello@xxxxxxx> wrote:
> >>>>>>>>
> >>>>>>>> [AMD Official Use Only - General]
> >>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: Karol Herbst <kherbst@xxxxxxxxxx>
> >>>>>>>>> Sent: Thursday, June 1, 2023 11:33 AM
> >>>>>>>>> To: Limonciello, Mario <Mario.Limonciello@xxxxxxx>
> >>>>>>>>> Cc: Nick Hastings <nicholaschastings@xxxxxxxxx>; Lyude Paul
> >>>>>>>>> <lyude@xxxxxxxxxx>; Lukas Wunner <lukas@xxxxxxxxx>; Salvatore
> >>>>>>>>> Bonaccorso <carnil@xxxxxxxxxx>; 1036530@xxxxxxxxxxxxxxx; Rafael J.
> >>>>>>>>> Wysocki <rafael@xxxxxxxxxx>; Len Brown <lenb@xxxxxxxxxx>;
> >>>>>>>>> linux-acpi@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> >>>>>>>>> regressions@xxxxxxxxxxxxxxx
> >>>>>>>>> Subject: Re: Regression from "ACPI: OSI: Remove Linux-Dell-Video _OSI
> >>>>>>>>> string"? (was: Re: Bug#1036530: linux-signed-amd64: Hard lock up of system)
> >>>>>>>>>
> >>>>>>>>> On Thu, Jun 1, 2023 at 6:18 PM Limonciello, Mario
> >>>>>>>>>>
> >>>>>>>>>> Lyude, Lukas, Karol
> >>>>>>>>>>
> >>>>>>>>>> This thread is in relation to this commit:
> >>>>>>>>>>
> >>>>>>>>>> 24867516f06d ("ACPI: OSI: Remove Linux-Dell-Video _OSI string")
> >>>>>>>>>>
> >>>>>>>>>> Nick has found that runtime PM is *not* working for nouveau.
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> keep in mind we have a list of PCIe controllers where we apply a
> >>>>>>>>> workaround:
> >>>>>>>>>
> >>>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/nouveau/nouveau_drm.c?h=v6.4-rc4#n682
> >>>>>>>>>
> >>>>>>>>> And I suspect there might be one or two more IDs we'll have to add
> >>>>>>>>> there. Do we have any logs?
> >>>>>>>>
> >>>>>>>> There's some archived onto the distro bug. Search this page for
> >>>>>>>> "journalctl.log.gz"
> >>>>>>>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1036530
> >>>>>>>>
> >>>>>>>
> >>>>>>> interesting.. It seems to be the same controller used here. I wonder
> >>>>>>> if the pci topology is different or if the workaround is applied at
> >>>>>>> all.
> >>>>>>
> >>>>>> I didn't see the message in the log about the workaround being applied
> >>>>>> in that log, so I guess PCI topology difference is a likely suspect.
> >>>>>>
> >>>>>
> >>>>> yeah, but I also couldn't see a log with the usual nouveau messages,
> >>>>> so it's kinda weird.
> >>>>>
> >>>>> Anyway, the output of `lspci -tvnn` would help
> >>>>
> >>>> % lspci -tvnn
> >>>> -[0000:00]-+-00.0 Intel Corporation Device [8086:3e20]
> >>>>            +-01.0-[01]----00.0 NVIDIA Corporation TU117M [GeForce GTX 1650 Mobile / Max-Q] [10de:1f91]
> >>>
> >>> So the bridge it's connected to is the same that the quirk *should have been* triggering.
> >>>
> >>> May 29 15:02:42 xps kernel: pci 0000:00:01.0: [8086:1901] type 01 class 0x060400
> >>>
> >>> Since the quirk isn't working and this is still a problem in 6.4-rc4 I suggest opening a
> >>> Nouveau drm bug to figure out why.
> >>>
> >>>>            +-02.0 Intel Corporation CoffeeLake-H GT2 [UHD Graphics 630] [8086:3e9b]
> >>>>            +-04.0 Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem [8086:1903]
> >>>>            +-08.0 Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model [8086:1911]
> >>>>            +-12.0 Intel Corporation Cannon Lake PCH Thermal Controller [8086:a379]
> >>>>            +-14.0 Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller [8086:a36d]
> >>>>            +-14.2 Intel Corporation Cannon Lake PCH Shared SRAM [8086:a36f]
> >>>>            +-15.0 Intel Corporation Cannon Lake PCH Serial IO I2C Controller #0 [8086:a368]
> >>>>            +-15.1 Intel Corporation Cannon Lake PCH Serial IO I2C Controller #1 [8086:a369]
> >>>>            +-16.0 Intel Corporation Cannon Lake PCH HECI Controller [8086:a360]
> >>>>            +-17.0 Intel Corporation Cannon Lake Mobile PCH SATA AHCI Controller [8086:a353]
> >>>>            +-1b.0-[02-3a]----00.0-[03-3a]--+-00.0-[04]----00.0 Intel Corporation JHL6340 Thunderbolt 3 NHI (C step) [Alpine Ridge 2C 2016] [8086:15d9]
> >>>>            |                               +-01.0-[05-39]--
> >>>>            |                               \-02.0-[3a]----00.0 Intel Corporation JHL6340 Thunderbolt 3 USB 3.1 Controller (C step) [Alpine Ridge 2C 2016] [8086:15db]
> >>>>            +-1c.0-[3b]----00.0 Intel Corporation Wi-Fi 6 AX200 [8086:2723]
> >>>>            +-1c.4-[3c]----00.0 Realtek Semiconductor Co., Ltd. RTS525A PCI Express Card Reader [10ec:525a]
> >>>>            +-1d.0-[3d]----00.0 Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808]
> >>>>            +-1f.0 Intel Corporation Cannon Lake LPC Controller [8086:a30e]
> >>>>            +-1f.3 Intel Corporation Cannon Lake PCH cAVS [8086:a348]
> >>>>            +-1f.4 Intel Corporation Cannon Lake PCH SMBus Controller [8086:a323]
> >>>>            \-1f.5 Intel Corporation Cannon Lake PCH SPI Controller [8086:a324]
> >>>>
> >>>>
> >>>> Regards,
> >>>>
> >>>> Nick.
> >>>
> >
> >
> >
>