On Mon, Jan 02, 2023 at 12:09:19PM +0200, Heikki Krogerus wrote: > Hi Sam, > > Sorry to keep you waiting. > > On Thu, Dec 22, 2022 at 09:18:27PM +0100, Samuel Čavoj wrote: > > Hi Heikki, > > > > I gave this a hard look and figured out the issue. Long story short, the > > firmware is clearing the CCI on EC RAM after copying from EC RAM to > > system memory. This happens both when notifications are delivered and > > when a read operation is explicitly performed via _DSM(read). What the > > driver currently does after receiving a notification is performing an > > explicit read. However by this time the CCI in EC RAM has been cleared > > by the AML and the information is lost. > > > > Details: > > > > 1. _DSM(read) of the UCSI device: > > > > Copies all relevant fields of the mailbox data structure from EC RAM > > to a SystemMemory OperationRegion. The last field to be copied is the > > CCI: > > > > [...] > > CCI0 = \_SB.PCI0.SBRG.EC0.ECRD (RefOf (\_SB.PCI0.SBRG.EC0.CCI0)) > > CCI1 = \_SB.PCI0.SBRG.EC0.ECRD (RefOf (\_SB.PCI0.SBRG.EC0.CCI1)) > > CCI2 = \_SB.PCI0.SBRG.EC0.ECRD (RefOf (\_SB.PCI0.SBRG.EC0.CCI2)) > > CCI3 = \_SB.PCI0.SBRG.EC0.ECRD (RefOf (\_SB.PCI0.SBRG.EC0.CCI3)) > > [...] > > > > However, for some reason, this is followed by another two operations, > > which zero-out half of the CCI: > > > > [...] > > \_SB.PCI0.SBRG.EC0.ECWT (Zero, RefOf (\_SB.PCI0.SBRG.EC0.CCI0)) > > \_SB.PCI0.SBRG.EC0.ECWT (Zero, RefOf (\_SB.PCI0.SBRG.EC0.CCI3)) > > > > I don't know why this is present. This does not cause the problem, > > however, only leads to issues if two explicit reads are performed > > consecutively. What does cause problems with the current driver > > implementation is: > > > > 2. The event handler (_Q79) > > > > The _Q79 event handler on my machine is responsible for the UCSI > > notifications. It performs a copy from EC RAM to system memory and > > triggers the 0x80 notification on the UCSI device (called UBTC). > > The final instructions of this handler are: > > > > [...] > > ^^^^UBTC.CCI0 = CCI0 /* \_SB_.PCI0.SBRG.EC0_.CCI0 */ > > ^^^^UBTC.CCI1 = CCI1 /* \_SB_.PCI0.SBRG.EC0_.CCI1 */ > > ^^^^UBTC.CCI2 = CCI2 /* \_SB_.PCI0.SBRG.EC0_.CCI2 */ > > ^^^^UBTC.CCI3 = CCI3 /* \_SB_.PCI0.SBRG.EC0_.CCI3 */ > > USGC = 0xF1 > > CCI0 = Zero // These two lines are the culprit > > CCI3 = Zero > > Local1 = ((Timer - Local0) / 0x2710) > > Notify (UBTC, 0x80) // Status Change > > Release (ECMT) > > > > > > A side note: > > I figured this out by booting up a Windows installation and convincing > > the local kernel debugger to dump ACPI trace information to a file. The file > > contained an outrageous amount of information with inconsistent formatting > > (missing commas and stuff) for which I wrote an extremely janky parser in > > Python. This let me examine the exact steps performed by the Windows driver. > > And the difference I noticed is that the Linux driver was issuing a > > _DSM(read) > > after every notification, reading the already-zeroed-out CCI. > > > > Patching the AML to remove the zeroing-out instructions seemed to work as > > well, but I suppose this is not a good general solution. > > > > Following is a prototype-grade patch, in essence performing soft-reads most > > of > > the time (i.e. just reading from the OpRegion and not calling _DSM) and > > explicit > > reads when necessary. I am unfortunately not familiar with the spec and the > > hardware in the wild and I understand that it is possible that some devices > > on > > the other hand do not synchronize the mailbox when notifying and it needs to > > be > > done explicitly. I suppose we'd need a parameter to configure this behaviour > > in > > that case with a quirk system. The patch works on my system. Some other > > issues > > surface later, but I think they are related to a particular cheap dongle I > > have > > because they don't seem to occur without it. > > > > What do you think about this situation? Is the patch reasonable, or does it > > need > > a significant re-think? > > This is great! Thank you so much for figuring this one out! > > I think your patch looks totally reasonable, but let me test it with > a couple of different systems that I have. I'll check the event > handlers from the ACPI tables. If that is how Windows works in > general, then this is what has to be done. I did not manage to test all the boards that I was hoping to test, but I think we could move forward with this. Can you format the patch properly and send it to this mailing list? thanks, -- heikki