On Thu, Jan 31, 2019 at 10:51:03AM -0800, Linus Torvalds wrote:
> On Thu, Jan 31, 2019 at 10:35 AM Jarkko Sakkinen
> <jarkko.sakkinen@xxxxxxxxxxxxxxx> wrote:
> >
> > OK, so the length of the response is not trashed, but only the error
> > code. The attached patch fully fixes the issue.
> >
> > Here's the header again:
> >
> > struct tpm_output_header {
> >         __be16  tag;
> >         __be32  length;
> >         __be32  return_code;
> > } __packed;
> >
> > The first two fields *are* read correctly and the last field gets 1's
> > (thus TPM error -1).
>
> Ok, so this makes sense, even though that patch is (I think)
> completely wrong.

I don't disagree on that :-) Just pinpointed the location where it fails.

> What happens is that the 32-bit fields are mis-aligned: the "tag" is
> obviously properly 16-bit aligned, but then both "length" and
> "return_code" are 32-bit fields that are only aligned on a 16-bit
> alignment.
>
> What happens is that first you copy the two first fields:
>
>     memcpy_fromio(buf, priv->rsp, 6);
>
> which copies "tag" and "length", but it copies them by reading them as
> a 4-byte and then 2-byte value (in that order). So it actually reads
> 'tag' and the first two bytes of 'length', and then the second access
> reads the last two bytes of 'length'.
>
> And it all works, because the accesses are aligned by address of
> access, even though they are *not* aligned in the 'struct
> tpm_output_header' fields.

Right, they are still naturally aligned accesses.

> But then later on, when you read 'return_code', and do
>
>     memcpy_fromio(&buf[6], &priv->rsp[6], expected - 6);
>
> you now do a 4-byte memcpy at offset 6. So it does a 4-byte access,
> but it's not 4-byte aligned.
>
> Honestly, memcpy() itself shouldn't have worked *either*, but you
> lucked out. Gcc doesn't know that it's a 4-byte access, so it actually
> calls out to the memcpy() routine. And that one happened to be "rep
> movsb" on your machine. And that happened to work.

I understand what you mean.
I'm just surprised that this hasn't failed for anyone before (the same
driver has even been used successfully on ARM64 with a TrustZone-based
fTPM implementation). It has been in for three years now.

> But it's really not supposed to work, and it really *wouldn't* have
> worked if somebody disabled the rep-string functions.
>
> In fact, we have another patch (that isn't applied) that makes even
> the memcpy_erms() just call the sw version of memcpy() for short
> copies (because "rep movsb" is slow for those cases). That would also
> have broken your driver.
>
>              Linus

/Jarkko
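
For reference, the layout argument in the thread above can be checked in userspace. This is only a minimal sketch under stated assumptions: uint16_t/uint32_t stand in for __be16/__be32, the GCC/Clang packed attribute stands in for the kernel's __packed, and is_naturally_aligned() is a hypothetical helper, not a kernel API. It does not model MMIO or memcpy_fromio() itself; it only shows where the fields land and which access sizes are naturally aligned at those offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace stand-in for the header quoted above (assumption: plain
 * uint16_t/uint32_t replace __be16/__be32, and the GCC/Clang attribute
 * replaces the kernel's __packed macro).
 */
struct tpm_output_header {
	uint16_t tag;
	uint32_t length;
	uint32_t return_code;
} __attribute__((packed));

/* Packing removes all padding, so the 32-bit fields sit at offsets 2 and 6. */
static_assert(offsetof(struct tpm_output_header, tag) == 0, "tag at 0");
static_assert(offsetof(struct tpm_output_header, length) == 2, "length at 2");
static_assert(offsetof(struct tpm_output_header, return_code) == 6,
	      "return_code at 6");
static_assert(sizeof(struct tpm_output_header) == 10, "10-byte header");

/*
 * Hypothetical helper, not a kernel API: returns 1 when an access of
 * `size` bytes (a power of two) at byte `offset` is naturally aligned.
 */
static int is_naturally_aligned(size_t offset, size_t size)
{
	return (offset & (size - 1)) == 0;
}
```

With this helper, the first 6-byte copy splits into a 4-byte access at offset 0 and a 2-byte access at offset 4, and both checks pass. The 4-byte access at offset 6 that reads return_code fails the check, which matches the failure described in the thread.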