On Wed, Jan 23, 2019 at 05:36:38PM +0200, Jarkko Sakkinen wrote:
> On Wed, Jan 23, 2019 at 07:26:42AM +1300, Linus Torvalds wrote:
> > On Wed, Jan 23, 2019 at 2:29 AM Jarkko Sakkinen
> > <jarkko.sakkinen@xxxxxxxxxxxxxxx> wrote:
> > >
> > > > >
> > > > > Fails on commit 170d13ca3a2fdaaa0283399247631b76b441cca2. Still works on
> > > > > preceding commit a959dc88f9c8900296ccf13e2f3e1cbc555a8917.
> > > >
> > > > This changes the IO access pattern in memcpy_to/fromio.. Presumably
> > > > CRB HW doesn't like the new 4 byte move? Swap each one in crb to
> > > > memcpy to confirm..
> > > >
> > > > If the HW requires particular access patterns you can't use
> > > > memcpy_to/fromio
> > >
> > > Did not have time to look at the commit at all but your deduction
> > > is correct. I know it without testing.
> > >
> > > Memory controller will feed 1's on unaligned read from IO memory,
> > > and as we can see from the TPM header, this change causes two of
> > > those:
> >
> > Funky. But how did it work before then?
> >
> > The new memcpy_fromio() is designed to have _predictable_ access
> > patterns. Not necessarily the best, but at least consistent.
> >
> > Previously, we used whatever random "memcpy()" implementation we
> > happened to pick, which *could* be aligned (particularly "rep movsb" -
> > absolutely horrible performance for MMIO, but by doing IO one byte at
> > a time it was certainly aligned ;), but most of our x86 memcpy
> > implementations don't actually try all that hard to align the source.
> > And the manual version will actually copy things *backwards* for some
> > cases.
> >
> > Is it just that this particular hardware always happened to trigger
> > the ERMS case (ie "rep movsb")?
>
> This is the particular snippet in question:
>
> 	memcpy_fromio(buf, priv->rsp, 6);
> 	expected = be32_to_cpup((__be32 *) &buf[2]);
> 	if (expected > count || expected < 6)
> 		return -EIO;
>
> 	memcpy_fromio(&buf[6], &priv->rsp[6], expected - 6);
>
> I guess it did in the first memcpy_fromio operation since it is less
> than a quad word, right? Not sure why the 2nd memcpy_fromio() operation
> has worked, though.

And I wonder why 32-bit has worked before.

Tomas, you've been more involved with ME and fTPM runs there. Do you
have any clues where this could be rooted?

/Jarkko
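
For reference, a minimal user-space sketch of the byte-wise alternative
hinted at earlier in the thread ("If the HW requires particular access
patterns you can't use memcpy_to/fromio"). It is not the driver code and
not a proposed patch: copy_bytewise(), header_length() and recv_bytewise()
are hypothetical names, a plain volatile byte pointer stands in for the
__iomem response buffer, and -1 stands in for -EIO. It only illustrates
that reading the 6-byte header and the remainder one byte at a time never
issues a wider, possibly unaligned, access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the driver): copy n bytes from a
 * device-like buffer using byte-wide loads only. The volatile qualifier
 * should keep the compiler from merging the loads into wider ones, so
 * every access is trivially aligned. In the kernel this role would be
 * played by readb()/ioread8() on an __iomem pointer.
 */
static void copy_bytewise(uint8_t *dst, const volatile uint8_t *src, size_t n)
{
	assert(dst && src);
	for (size_t i = 0; i < n; i++)
		dst[i] = src[i];
}

/* Assemble the big-endian 32-bit length stored at bytes 2..5 of the header. */
static uint32_t header_length(const uint8_t *buf)
{
	return ((uint32_t)buf[2] << 24) | ((uint32_t)buf[3] << 16) |
	       ((uint32_t)buf[4] << 8) | (uint32_t)buf[5];
}

/*
 * Mirrors the quoted snippet: read the 6-byte header, validate the length,
 * then read the rest. Returns the response length, or -1 where the quoted
 * code returns -EIO. The count < 6 guard is an addition for this sketch.
 */
static long recv_bytewise(uint8_t *buf, size_t count, const volatile uint8_t *rsp)
{
	uint32_t expected;

	if (count < 6)
		return -1;

	copy_bytewise(buf, rsp, 6);
	expected = header_length(buf);
	if (expected > count || expected < 6)
		return -1;

	copy_bytewise(&buf[6], &rsp[6], expected - 6);
	return (long)expected;
}
```

Byte-wide reads are the slowest option for MMIO, as noted above for
"rep movsb", so this trades throughput for a fully predictable access
pattern.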