On Mon, Nov 18, 2019 at 03:20:19PM +0100, Hans de Goede wrote: > Hi All, > > First let me copy-paste a bit of the commit msg for background: > > ---begin commit msg--- > Commit 39ce8150a079 ("pinctrl: baytrail: Serialize all register access") > added a spinlock around all register accesses because: > > "There is a hardware issue in Intel Baytrail where concurrent GPIO register > access might result reads of 0xffffffff and writes might get dropped > completely." > > Testing has shown that this does not catch all cases, there are still > 2 problems remaining > > 1) The original fix uses a spinlock per byt_gpio device / struct, > additional testing has shown that this is not sufficient concurent > accesses to 2 different GPIO banks also suffer from the same problem. > > This commit fixes this by moving to a single global lock. > > 2) The original fix did not add a lock around the register accesses in > the suspend/resume handling. > > Since pinctrl-baytrail.c is using normal suspend/resume handlers, > interrupts are still enabled during suspend/resume handling. Nothing > should be using the GPIOs when they are being taken down, _but_ the > GPIOs themselves may still cause interrupts, which are likely to > use (read) the triggering GPIO. So we need to protect against > concurrent GPIO register accesses in the suspend/resume handlers too. > > This commit fixes this by adding the missing spin_lock / unlock calls. > ---end commit msg--- > > I was discussing the problem at my local hackerspace yesterday with > someone who knows quite a bit low-level details about Intel CPUs. > He said that on "big" core's all the GPIO islands sit behind the IOSF > and that there is a bridge which make their registers look like regular > mmio registers. > > I wonder if the same is happening on BYT, that would point to an issue > with concurent accesses being done through the IOSF bridge, which would > explain why I've found that we need a single global lock for all GPIO > islands (*). > But that in turn would mean that we really need global lock > for _all_ accesses done through the IOSF bridge, not just GPIO register > accesses... Which would explain why on some machines BYT has never been > really stable with Linux. I don't think so. The Cherriview seems inherited the same silicon issue (see comment near to chv_lock), though it seems to be fixed for all new hardware (Skylake+). > Can someone at Intel with access to internal docs about BYT dig a bit into > the docs and see if 1. this theory makes sense, 2. if it does if there us > anything else behind the IOSF for which we also need to serialize accesses? Unfortunately I have no time to dig into this right now. > > Regards, > > Hans > > > > > *) It seems that the GPIO interrupt storm on the test device I hit > this is really good in triggering this > -- With Best Regards, Andy Shevchenko