On Fri, Feb 22, 2019 at 02:08:40PM +0100, Hans de Goede wrote: > On most Intel Bay- and Cherry-Trail systems the PMIC is connected over I2C > and the PMIC is accessed through various means by the _PS0 and _PS3 ACPI > methods (power on / off methods) of various devices. > > This leads to suspend/resume ordering problems where a device may be > resumed and get its _PS0 method executed before the I2C controller is > resumed. On Cherry Trail this leads to errors like these: > > i2c_designware 808622C1:06: controller timed out > ACPI Error: AE_ERROR, Returned by Handler for [UserDefinedRegion] > ACPI Error: Method parse/execution failed \_SB.P18W._ON, AE_ERROR > video LNXVIDEO:00: Failed to change power state to D0 > > But on Bay Trail this caused I2C reads to seem to succeed, but they end > up returning wrong data, which ends up getting written back by the typical > read-modify-write cycle done to turn on various power-resources. > > Debugging the problems caused by this silent data corruption is quite > nasty. This commit adds a check which disallows i2c_dw_xfer() calls to > happen until the controller's resume method has completed. > > Which turns the silent data corruption into getting these errors in > dmesg instead: > > i2c_designware 80860F41:04: Error i2c_dw_xfer call while suspended > ACPI Error: AE_ERROR, Returned by Handler for [UserDefinedRegion] > ACPI Error: Method parse/execution failed \_SB.PCI0.GFX0._PS0, AE_ERROR > > Which is much better. > > Note the above errors are an example of issues which this patch will > help to debug, the actual fix requires fixing the suspend order and > this has been fixed by a different commit. > > Note the setting / clearing of the suspended flag in the suspend / resume > methods is NOT protected by i2c_lock_bus(). This is intentional as these > methods get called from i2c_dw_xfer() (through pm_runtime_get/put) a nd > i2c_dw_xfer() is called with the i2c_bus_lock held, so otherwise we would > deadlock. This means that there is a theoretical race between a non runtime > suspend and the suspended check in i2c_dw_xfer(), this is not a problem > since normally we should not hit the race and this check is primarily a > debugging tool so hitting the check if there are suspend/resume ordering > problems does not need to be 100% reliable. Patch itself looks good to me, Reviewed-by: Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx> though I'm wondering if this suspended flag can be tracked by PM runtime core since we have a lot of drivers doing this and more coming (at least I have to do the same in 8250). > Signed-off-by: Hans de Goede <hdegoede@xxxxxxxxxx> > --- > Changes in v2: > -Change error return to -ESHUTDOWN > -Add a blurb to the commit message to mention that the errors in the commitmsg > are examples of errors which will be easier to debug with this change and > that this change does not actually fixes these errors > -Add a blurb to the commit message about why i2c_lock_bus() is not called > from the suspend/resume methods > --- > drivers/i2c/busses/i2c-designware-core.h | 2 ++ > drivers/i2c/busses/i2c-designware-master.c | 6 ++++++ > drivers/i2c/busses/i2c-designware-pcidrv.c | 7 ++++++- > drivers/i2c/busses/i2c-designware-platdrv.c | 3 +++ > 4 files changed, 17 insertions(+), 1 deletion(-) > > diff --git a/drivers/i2c/busses/i2c-designware-core.h b/drivers/i2c/busses/i2c-designware-core.h > index b4a0b2b99a78..6b4ef1d38fb2 100644 > --- a/drivers/i2c/busses/i2c-designware-core.h > +++ b/drivers/i2c/busses/i2c-designware-core.h > @@ -215,6 +215,7 @@ > * @disable_int: function to disable all interrupts > * @init: function to initialize the I2C hardware > * @mode: operation mode - DW_IC_MASTER or DW_IC_SLAVE > + * @suspended: set to true if the controller is suspended > * > * HCNT and LCNT parameters can be used if the platform knows more accurate > * values than the one computed based only on the input clock frequency. > @@ -270,6 +271,7 @@ struct dw_i2c_dev { > int (*set_sda_hold_time)(struct dw_i2c_dev *dev); > int mode; > struct i2c_bus_recovery_info rinfo; > + bool suspended; > }; > > #define ACCESS_SWAP 0x00000001 > diff --git a/drivers/i2c/busses/i2c-designware-master.c b/drivers/i2c/busses/i2c-designware-master.c > index 8d1bc44d2530..bb8e3f149979 100644 > --- a/drivers/i2c/busses/i2c-designware-master.c > +++ b/drivers/i2c/busses/i2c-designware-master.c > @@ -426,6 +426,12 @@ i2c_dw_xfer(struct i2c_adapter *adap, struct i2c_msg msgs[], int num) > > pm_runtime_get_sync(dev->dev); > > + if (dev->suspended) { > + dev_err(dev->dev, "Error %s call while suspended\n", __func__); > + ret = -ESHUTDOWN; > + goto done_nolock; > + } > + > reinit_completion(&dev->cmd_complete); > dev->msgs = msgs; > dev->msgs_num = num; > diff --git a/drivers/i2c/busses/i2c-designware-pcidrv.c b/drivers/i2c/busses/i2c-designware-pcidrv.c > index d50f80487214..76810deb2de6 100644 > --- a/drivers/i2c/busses/i2c-designware-pcidrv.c > +++ b/drivers/i2c/busses/i2c-designware-pcidrv.c > @@ -176,6 +176,7 @@ static int i2c_dw_pci_suspend(struct device *dev) > struct pci_dev *pdev = to_pci_dev(dev); > struct dw_i2c_dev *i_dev = pci_get_drvdata(pdev); > > + i_dev->suspended = true; > i_dev->disable(i_dev); > > return 0; > @@ -185,8 +186,12 @@ static int i2c_dw_pci_resume(struct device *dev) > { > struct pci_dev *pdev = to_pci_dev(dev); > struct dw_i2c_dev *i_dev = pci_get_drvdata(pdev); > + int ret; > > - return i_dev->init(i_dev); > + ret = i_dev->init(i_dev); > + i_dev->suspended = false; > + > + return ret; > } > #endif > > diff --git a/drivers/i2c/busses/i2c-designware-platdrv.c b/drivers/i2c/busses/i2c-designware-platdrv.c > index 9eaac3be1f63..ead5e7de3e4d 100644 > --- a/drivers/i2c/busses/i2c-designware-platdrv.c > +++ b/drivers/i2c/busses/i2c-designware-platdrv.c > @@ -454,6 +454,8 @@ static int dw_i2c_plat_suspend(struct device *dev) > { > struct dw_i2c_dev *i_dev = dev_get_drvdata(dev); > > + i_dev->suspended = true; > + > if (i_dev->shared_with_punit) > return 0; > > @@ -471,6 +473,7 @@ static int dw_i2c_plat_resume(struct device *dev) > i2c_dw_prepare_clk(i_dev, true); > > i_dev->init(i_dev); > + i_dev->suspended = false; > > return 0; > } > -- > 2.20.1 > -- With Best Regards, Andy Shevchenko