On Fri, Feb 22, 2019 at 02:08:40PM +0100, Hans de Goede wrote: > On most Intel Bay- and Cherry-Trail systems the PMIC is connected over I2C > and the PMIC is accessed through various means by the _PS0 and _PS3 ACPI > methods (power on / off methods) of various devices. > > This leads to suspend/resume ordering problems where a device may be > resumed and get its _PS0 method executed before the I2C controller is > resumed. On Cherry Trail this leads to errors like these: > > i2c_designware 808622C1:06: controller timed out > ACPI Error: AE_ERROR, Returned by Handler for [UserDefinedRegion] > ACPI Error: Method parse/execution failed \_SB.P18W._ON, AE_ERROR > video LNXVIDEO:00: Failed to change power state to D0 > > But on Bay Trail this caused I2C reads to seem to succeed, but they end > up returning wrong data, which ends up getting written back by the typical > read-modify-write cycle done to turn on various power-resources. > > Debugging the problems caused by this silent data corruption is quite > nasty. This commit adds a check which disallows i2c_dw_xfer() calls to > happen until the controller's resume method has completed. > > Which turns the silent data corruption into getting these errors in > dmesg instead: > > i2c_designware 80860F41:04: Error i2c_dw_xfer call while suspended > ACPI Error: AE_ERROR, Returned by Handler for [UserDefinedRegion] > ACPI Error: Method parse/execution failed \_SB.PCI0.GFX0._PS0, AE_ERROR > > Which is much better. > > Note the above errors are an example of issues which this patch will > help to debug, the actual fix requires fixing the suspend order and > this has been fixed by a different commit. > > Note the setting / clearing of the suspended flag in the suspend / resume > methods is NOT protected by i2c_lock_bus(). This is intentional as these > methods get called from i2c_dw_xfer() (through pm_runtime_get/put) a nd > i2c_dw_xfer() is called with the i2c_bus_lock held, so otherwise we would > deadlock. This means that there is a theoretical race between a non runtime > suspend and the suspended check in i2c_dw_xfer(), this is not a problem > since normally we should not hit the race and this check is primarily a > debugging tool so hitting the check if there are suspend/resume ordering > problems does not need to be 100% reliable. > > Signed-off-by: Hans de Goede <hdegoede@xxxxxxxxxx> Applied to for-next, thanks! For the record, a checkpatch check which I think is not much of a problem here, but still wanted to let you know: CHECKPATCH CHECK: Avoid using bool structure members because of possible alignment issues - see: https://lkml.org/lkml/2017/11/21/384 #78: FILE: drivers/i2c/busses/i2c-designware-core.h:274: + bool suspended; ^
Attachment:
signature.asc
Description: PGP signature