On Thu Nov 09 17, Jerry Snitselaar wrote:
On Thu Nov 09 17, Laurent Bigonville wrote:
Le 09/11/17 à 01:04, Laurent Bigonville a écrit :
Le 24/10/17 à 18:07, Jarkko Sakkinen a écrit :
On Tue, Oct 24, 2017 at 07:57:06AM -0700, Jerry Snitselaar wrote:
On Tue Oct 24 17, Jarkko Sakkinen wrote:
On Mon, Oct 23, 2017 at 06:45:15AM -0700, Jerry Snitselaar wrote:
On Mon Oct 23 17, Jarkko Sakkinen wrote:
On Sat, Oct 21, 2017 at 10:53:55AM +0200, Laurent Bigonville wrote:
Le 14/10/17 à 10:13, Jerry Snitselaar a écrit :
On Wed Sep 06 17, Jarkko Sakkinen wrote:
On Fri, Sep 01, 2017 at 02:10:18PM +0200, Laurent
Bigonville wrote:
Le 31/08/17 à 18:40, Jerry Snitselaar a écrit :
On Thu Aug 31 17, Alexander.Steffen@xxxxxxxxxxxx wrote:
Le 29/08/17 à 18:35, Laurent Bigonville a écrit :
Le 29/08/17 à 18:00,
Alexander.Steffen@xxxxxxxxxxxx a écrit :
An idea how to troubleshoot this?
Can you run git bisect on the changes between 4.11 and
4.12, so that
we find the offending commit? It is probably sufficient
to limit the
search to commits that touch something in drivers/char/tpm.
I'll try and keep you posted.
OK I've been able to bisect the problem
and the bad commit is:
e6aef069b6e97790cb127d5eeb86ae9ff0b7b0e3
is the first bad commit
commit e6aef069b6e97790cb127d5eeb86ae9ff0b7b0e3
Author: Jerry Snitselaar <jsnitsel@xxxxxxxxxx>
Date: Mon Mar 27 08:46:04 2017 -0700
tpm_tis: convert to using locality callbacks
This patch converts tpm_tis to use
of the new tpm class ops
request_locality, and relinquish_locality.
With the move to using the
callbacks, release_locality is
changed so
that we now release the locality even if there is no
request pending.
This required some changes to the tpm_tis_core_init
code path to
make sure locality is requested when needed:
- tpm2_probe code path will end up calling
request/release through
callbacks, so request_locality prior to
tpm2_probe not needed.
- probe_itpm makes calls to
tpm_tis_send_data which no
longer calls
request_locality, so add request_locality prior to
tpm_tis_send_data
calls. Also drop release_locality call in middleof
probe_itpm, and
keep locality until
release_locality called at end of
probe_itpm.
Cc: Peter Huewe <peterhuewe@xxxxxx>
Cc: Jarkko Sakkinen <jarkko.sakkinen@xxxxxxxxxxxxxxx>
Cc: Jason Gunthorpe <jgunthorpe@xxxxxxxxxxxxxxxxxxxx>
Cc: Marcel Selhorst <tpmdd@xxxxxxxxxxxx>
Signed-off-by: Jerry Snitselaar <jsnitsel@xxxxxxxxxx>
Reviewed-by: Jarkko Sakkinen
<jarkko.sakkinen@xxxxxxxxxxxxxxx>
Tested-by: Jarkko Sakkinen
<jarkko.sakkinen@xxxxxxxxxxxxxxx>
Signed-off-by: Jarkko Sakkinen
<jarkko.sakkinen@xxxxxxxxxxxxxxx>
:040000 040000 70234365da69959d47076ebb40c8d17f520c3e44
72f21b446e45ea1003de75902b0553deb99157fd M drivers
I've looked again at the code in question, but could not find
anything that is obviously wrong there. Locality is now
requested/released at slightly different
points in the process than
before, but that's it. It does not seem to
cause problems with the
majority of TPMs, since you are the first to report any, so
maybe it
is a quirk that only affects this device.
Perhaps Jerry can help, since this is his change?
Alexander
Getting some caffeine in me, and starting to
take a look. Adding
Jarkko as well since this might involve the
general locality changes.
Laurent, if I send you a patch with some
debugging code added, would
you be able to run it on that system? I wasn't
running into issues
on the system I had with a 1.2 device, but I
no longer have access
to it. I'll see if I can find one in our labs
and reproduce it there.
Yes I should be able to do that
Any findings?
/Jarkko
Okay, finally getting back to this. Looking at the
code it isn't clear
to me
why the change is causing this. So while I stare at this some more
Laurent
could you reproduce it with this patch so I can see
what the status and
access registers look like? Does anyone else on here
happen to have a
Sinosun
tpm device? The systems I have access to with TPM1.2
devices don't have
this
issue.
--8<--
diff --git a/drivers/char/tpm/tpm_tis_core.c
b/drivers/char/tpm/tpm_tis_core.c
index fdde971bc810..7d60a7e4b50a 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -258,6 +258,7 @@ static int
tpm_tis_send_data(struct tpm_chip *chip,
const u8 *buf, size_t len)
int rc, status, burstcnt;
size_t count = 0;
bool itpm = priv->flags & TPM_TIS_ITPM_WORKAROUND;
+ u8 access;
status = tpm_tis_status(chip);
if ((status & TPM_STS_COMMAND_READY) == 0) {
@@ -292,6 +293,11 @@ static int
tpm_tis_send_data(struct tpm_chip *chip,
const u8 *buf, size_t len)
}
status = tpm_tis_status(chip);
if (!itpm && (status & TPM_STS_DATA_EXPECT) == 0) {
+ rc = tpm_tis_read8(priv, TPM_ACCESS(priv->locality),
&access);
+ if (rc < 0)
+ dev_info(&chip->dev,
"TPM_STS_DATA_EXPECT == 0: read
failure TPM_ACCESS(%d)\n", priv->locality);
+ else
+ dev_info(&chip->dev, "TPM_STS_DATA_EXPECT == 0:
locality: %d status: %x access: %x\n",
priv->locality, status, access);
rc = -EIO;
goto out_err;
}
@@ -309,6 +315,11 @@ static int
tpm_tis_send_data(struct tpm_chip *chip,
const u8 *buf, size_t len)
}
status = tpm_tis_status(chip);
if (!itpm && (status & TPM_STS_DATA_EXPECT) != 0) {
+ rc = tpm_tis_read8(priv,
TPM_ACCESS(priv->locality), &access);
+ if (rc < 0)
+ dev_info(&chip->dev, "TPM_STS_DATA_EXPECT != 0: read
failure TPM_ACCESS(%d)\n", priv->locality);
+ else
+ dev_info(&chip->dev,
"TPM_STS_DATA_EXPECT != 0: locality:
%d status: %x access: %x\n", priv->locality, status, access);
rc = -EIO;
goto out_err;
}
Please find here the dmesg output of the patched kernel
At least 0xff is corrupted value in senseful way. CPU fills the read
with ones for example for unaligned bus read. See table
19 in PC client
spec. This can happen when you do unaligned read for example.
Maybe TPM is unreachable i.e. powered off. Bit busy with
stuff ATM but
would probably make sense to compare that 0x81 to table
18 in the same
spec.
/Jarkko
0x81 is saying the access register status is valid, and the locality
is not active. That first bit means A Dynamic OS has not
been previously
established on the platform. Normally we would see 0xa1, which would
mean valid register status, and the locality is active.
I think the important thing to note here is that STS has bits set that
should never be set. So we can conclude that TPM might be either
1. Powered off
2. In some transition state?
/Jarkko
If it was powered off would we be getting a valid read from the access
register?
I think there is no universal answer to that :-)
Maybe adding a extra delay would be next test to make? If for random
reason it is in-between states...
Any more ideas?
The chip is definitely in a weird state :/ I tried several ways to
reset the chip (windows, tpm-tools,...).
I've been able to reset the chip via the bios (which now shows
unowned) but chip is still locked apparently.
But still with < 4.12 I'm able to get /some/ information Public
EK, PCR,... out of the chip so it's not completely broken...
OK correction the TPM is now unlocked (I let the computer running
for more than 24h with nothing accessing the TPM) and with 4.9 I've
been able to take the ownership again.
Under 4.12 I still have the same errors as mentioned originally
Still thinking about how to attack this. I don't think extending the timeout
would change anything because it is reading the register, and thinking the
status is valid because it reads 0xff. So it would still have that problem,
right?
Maybe a sanity check could be added to wait_for_tpm_stat() to
first check that it didn't read 0xff before checking for the value it
is looking for. Would that make sense Jarkko?
diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index 1d6729be4cd6..8b77fa2003ce 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -1087,7 +1087,7 @@ int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask, unsigned long timeout,
do {
tpm_msleep(TPM_TIMEOUT);
status = chip->ops->status(chip);
- if ((status & mask) == mask)
+ if ((status != 0xff) && (status & mask) == mask)
return 0;
} while (time_before(jiffies, stop));
}
Actually I'm reading through the tpm profile spec again, and 6.1 FIFO
Interface Locality Usage per Register has a table that I think
explains what is going on.
Since the access register is showing 0x81, which means access register
bits are valid and the locality is not active. Table 39 in 6.1 says in
the case where active locality is not set, or set for another
locality, TPM_STS_x register reads return FFh.