Hi,
On 18-11-2019 18:23, Andy Shevchenko wrote:
On Mon, Nov 18, 2019 at 03:20:19PM +0100, Hans de Goede wrote:
Hi All,
First let me copy-paste a bit of the commit msg for background:
---begin commit msg---
Commit 39ce8150a079 ("pinctrl: baytrail: Serialize all register access")
added a spinlock around all register accesses because:
"There is a hardware issue in Intel Baytrail where concurrent GPIO register
access might result reads of 0xffffffff and writes might get dropped
completely."
Testing has shown that this does not catch all cases, there are still
2 problems remaining
1) The original fix uses a spinlock per byt_gpio device / struct,
additional testing has shown that this is not sufficient concurent
accesses to 2 different GPIO banks also suffer from the same problem.
This commit fixes this by moving to a single global lock.
2) The original fix did not add a lock around the register accesses in
the suspend/resume handling.
Since pinctrl-baytrail.c is using normal suspend/resume handlers,
interrupts are still enabled during suspend/resume handling. Nothing
should be using the GPIOs when they are being taken down, _but_ the
GPIOs themselves may still cause interrupts, which are likely to
use (read) the triggering GPIO. So we need to protect against
concurrent GPIO register accesses in the suspend/resume handlers too.
This commit fixes this by adding the missing spin_lock / unlock calls.
---end commit msg---
I was discussing the problem at my local hackerspace yesterday with
someone who knows quite a bit low-level details about Intel CPUs.
He said that on "big" core's all the GPIO islands sit behind the IOSF
and that there is a bridge which make their registers look like regular
mmio registers.
I wonder if the same is happening on BYT, that would point to an issue
with concurent accesses being done through the IOSF bridge, which would
explain why I've found that we need a single global lock for all GPIO
islands (*).
But that in turn would mean that we really need global lock
for _all_ accesses done through the IOSF bridge, not just GPIO register
accesses... Which would explain why on some machines BYT has never been
really stable with Linux.
I don't think so. The Cherriview seems inherited the same silicon issue (see
comment near to chv_lock), though it seems to be fixed for all new hardware
(Skylake+).
Ah, interestingly enough pinctrl-cherryview.c already uses a global
lock and it also already protects suspend and resume, well at least
that shows that my baytrail patch seems sound / the right thing to do.
Can someone at Intel with access to internal docs about BYT dig a bit into
the docs and see if 1. this theory makes sense, 2. if it does if there us
anything else behind the IOSF for which we also need to serialize accesses?
Unfortunately I have no time to dig into this right now.
Ok.
Regards,
Hans