Hi, > From: Benjamin Tissoires [mailto:benjamin.tissoires@xxxxxxxxx] > Subject: Re: [PATCH v2 3/3] ACPI / button: Send "open" state after > boot/resume > > Hi Lv, > > On Tue, May 31, 2016 at 4:55 AM, Zheng, Lv <lv.zheng@xxxxxxxxx> wrote: > > Hi, > > > >> From: Benjamin Tissoires [mailto:benjamin.tissoires@xxxxxxxxx] > >> Subject: Re: [PATCH v2 3/3] ACPI / button: Send "open" state after > >> boot/resume > >> > >> On Fri, May 27, 2016 at 9:16 AM, Lv Zheng <lv.zheng@xxxxxxxxx> > wrote: > [snipped > ]>> As Valdis replied on 0/3, I don't think this is a good solution (even > >> temporary). Linux should not assume the current state of a input > >> device, and sending unconditionally 1 here is wrong. If the device is > >> on a docking station, you will wake up the wrong monitor and screw > the > >> user session (and this will be a regression). > > [Lv Zheng] > > We are doing the test to see how this behaves on several different > platforms. > > > >> > >> How about we simply send the current LID state stored in the ACPI? > >> something like calling acpi_lid_send_state() directly? > > [Lv Zheng] > > This is what we are going to eliminate in [PATCH 01]. > > We have several real bugs related to sending a wrong state to the > userspace. > > Userspace will suspend right after resume because of the 'close' state. > > On the other hand, you are trying to remove 23de5d9ef2a4bbc4f733f, a > patch that has been around for 9 years and we only start seeing > devices where this logic is not working... > > I am not saying your approach is wrong, I am just saying that instead > of a plain revert, we should probably be more conservative and add a > quirk for those buggy machines. Ideally, we should try to understand > why there is such an issue that Windows doesn't have (the solution > might just be that given Windows doesn't care, we are screwed). > > BTW, on the Surface 3, there is a WMI > (f7cc25ec-d20b-404c-8903-0ed4359c18ae -> WQHE) which returns the > actual value of the LID, without using PNP0C0D at all. I have a > feeling that Windows might use it when it is in trouble or in an > unsure state. I couldn't find this WMI on the 2 other systems so that > may also be just a one shot for the Surface 3. [Lv Zheng] Thanks for the information. But it seems Surface 3 is not a good example for this. It is a runtime idle platform. And the root cause of the Surface 3 issue should be in the freeze code. After waking the system up via LID irq, the irq is dropped. But I guess it is risky to invoke irq handler right there, doing so could break may existing drivers. > > [snipped] > > [Lv Zheng] > > The understanding here is incorrect. > > We have 3 bogus devices. > > 1 of them is surface 3 which is a hardware reduced platform. > > The others are all traditional platforms. > > > > ===== > > The facts are: > > > > Both the platforms return cached lid state from _LID. > > The cached value will be updated by lid irq (via GPIO IRQ, GPE, or EC > event). > > AML tables will send lid notification in the irq handler. > > > > Some AML tables will update the cached value in _WAK (I'll describe why > it is necessary below). > > But updating the cached value in _WAK is not guaranteed by all AML > tables. > > > > For the 'close' state irq, all tables will send lid close notification. > > For the 'open' state irq, it seems there are tables never sending lid open > notification (sounds like Windows do not care about lid open). > > ===== > > > > Surface 3 is entirely a different case. > > It is a runtime idle system and hardware reduced. > > On that kind of system, lid open is handled by OS not by BIOS. > > Surface 3 is exactly the platform that doesn't send lid open notification. > > I guess the AML is intentionally written in this way to be compliant to > the traditional platforms. > > > > While on the traditional platforms: > > When lid is opened, BIOS handles the lid irq and wakes the system from > the FACS waking vector. > > So it is likely that there is no lid open irq after the system is resumed. > > BIOS may forget to update the cached lid value in the _WAK or some > other control methods that could be executed after resuming. > > Then if we send _LID result to the user space, the cached value could > apparently be 'close'. > > > > That explains why there is no "lid open" configuration in the "Windows > Device Manager". > > > >> > >> I propose as a workaround to enable a kthread that will monitor the > >> lid state and update the correct value to userspace (5 sec of polling > >> time should be enough given that systemd checks every 20 sec). > >> We should probably have this workaround only for a set of known > >> devices, as it might just be temporary for those until the actual > >> underlying problem is fixed (wrong DSDT in the Surface 3 case that > >> doesn't notify at all, issue in the EC for the Surface Pro 1 and the > >> Samsung N210). > > [Lv Zheng] > > That cannot help to solve the issue/gap. > > > > The problem is Linux userspace has a facility re-checking lid state when > lid state is "close". > > If it failed to update the lid state to "open" for a timeout period, it would > suspend the platform. > > > > We actually are recommending Linus userspace to introduce a new lid > handling mode to only handle "lid close" event and ignore "lid open" event. > > During the period this is not changed from the userspace, we have to > send "open" after resume/boot in order not to trigger the timeout > mechanism. > > > > I hear your points and I couldn't agree more that the only solution > for us is (sadly) to mimic Windows. > > I am just disagreeing on the method used here: I'd rather have a > fallback mechanism where the lid is not sent to "one" at each boot > resume because this will mess us people using docking stations. The > problem is that Red Hat has a lot of them, and I foresee a lot of > complains to me if this is merged in its current state. > > We could have a parameter (say "send_lid_open_on_resume" or > "use_bios_lid_value") in acpi/button.c which would allow people to > choose if they want the "new" behavior or the old one. And we could > also add some DMI matching for the messed up laptops/tablets where we > force one or the other quirk. When we agree that user space now > behaves gently with us, we could then remove entirely the quirk and > the dmi matching. > > How does that sound? [Lv Zheng] The choices are in my first revision. I could restore it back and re-send this series. Also I need to update PATCH 02 to eliminate wrong IS_ERR_VALUE(). > > Also, on the topic of the .notify not being sent by some BIOS, I have > seen something similar on the surface 3. > > The ASL for the LID status change is the following: > --- > Method (_E4C, 0, Serialized) // _Exx: Edge-Triggered GPE > { > If ((HELD == One)) > { > ^^LID.LIDB = One > } > Else > { > ^^LID.LIDB = Zero > Notify (LID, 0x80) // Status Change > } > > Notify (^^PCI0.SPI1.NTRG, One) // Device Check > } > --- > > I noticed I received the hotplug event on the touchscreen when the LID > state changed, and tried to initiate the notify on the LID acpi device > from within the touchscreen driver. > This works well, except that from time to time, on resume, I don't > even receive the hotplug notification. It is as if the notification is > lost while I receive it 80% of the time. [Lv Zheng] Probably this is because of the same reason as mentioned before. I still think Surface 3 is a different example. On those traditional platforms, we can see that: 1. There are platforms returning hardware state directly from _LID 2. There are platforms returning cached state from _LID and update it via lid irq, EC._REG, _WAK 3. There are platforms returning cached state from _LID and update it via lid irq The fact is there are simply so many type 3 platforms. You might be thinking there are only several such kind of bogus devices. But actually our team has been working on customizing those platforms, enhancing their EC._REG/_WAK for years in order to make the AML tables working on Linux. This looks like an endless work to me... > > This might be completely unrelated, but I think it is worth mentioning > because there might be a common root at the state not being updated > correctly. > > Cheers, and sorry for being annoying :) [Lv Zheng] I didn't see anything annoying. :) Cheers -Lv > Benjamin ��.n��������+%������w��{.n�����{�����ܨ}���Ơz�j:+v�����w����ޙ��&�)ߡ�a����z�ޗ���ݢj��w�f