On Wed, May 27, 2020 at 08:18:56PM +0000, Mario.Limonciello@xxxxxxxx wrote:
> > -----Original Message-----
> > From: Jarkko Sakkinen <jarkko.sakkinen@xxxxxxxxxxxxxxx>
> > Sent: Wednesday, May 27, 2020 3:09 PM
> > To: James Bottomley; Limonciello, Mario; peterhuewe@xxxxxx; jgg@xxxxxxxx
> > Cc: arnd@xxxxxxxx; gregkh@xxxxxxxxxxxxxxxxxxx; linux-integrity@xxxxxxxxxxxxxxx;
> > linux-kernel@xxxxxxxxxxxxxxx; jeffrin@xxxxxxxxxxxxxxxxxxx; alex@xxxxxxxxx
> > Subject: Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
> >
> >
> > [EXTERNAL EMAIL]

What is this?

> > On Tue, 2020-05-26 at 12:38 -0700, James Bottomley wrote:
> > > On Tue, 2020-05-26 at 19:23 +0000, Mario.Limonciello@xxxxxxxx wrote:
> > > > > On Tue, 2020-05-26 at 13:32 -0500, Mario Limonciello wrote:
> > > > > > This reverts commit d23d12484307b40eea549b8a858f5fffad913897.
> > > > > >
> > > > > > This commit has caused regressions for the XPS 9560 containing
> > > > > > a Nuvoton TPM.
> > > > >
> > > > > Presumably this is using the tis driver?
> > > >
> > > > Correct.
> > > >
> > > > > > As mentioned by the reporter all TPM2 commands are failing with:
> > > > > > ERROR:tcti:src/tss2-tcti/tcti-
> > > > > > device.c:290:tcti_device_receive()
> > > > > > Failed to read response from fd 3, got errno 1: Operation not
> > > > > > permitted
> > > > > >
> > > > > > The reporter bisected this issue back to this commit which was
> > > > > > backported to stable as commit 4d6ebc4.
> > > > >
> > > > > I think the problem is request_locality ... for some inexplicable
> > > > > reason a failure there returns -1, which is EPERM to user space.
> > > > >
> > > > > That seems to be a bug in the async code since everything else
> > > > > gives a ESPIPE error if tpm_try_get_ops fails ... at least no-one
> > > > > assumes it gives back a sensible return code.
> > > > >
> > > > > What I think is happening is that with the patch the TPM goes
> > > > > through a quick sequence of request, relinquish, request,
> > > > > relinquish and it's the third request which is failing (likely
> > > > > timing out). Without the patch, the patch there's only one
> > > > > request,relinquish cycle because the ops are held while the async
> > > > > work is executed. I have a vague recollection that there is a
> > > > > problem with too many locality request in quick succession, but
> > > > > I'll defer to Jason, who I think understands the intricacies of
> > > > > localities better than I do.
> > > >
> > > > Thanks, I don't pretend to understand the nuances of this particular
> > > > code, but I was hoping that the request to revert got some attention
> > > > since Alex's kernel Bugzilla and message a few months ago to linux
> > > > integrity weren't.
> > > >
> > > > > If that's the problem, the solution looks simple enough: just move
> > > > > the ops get down because the priv state is already protected by the
> > > > > buffer mutex
> > > >
> > > > Yeah, if that works for Alex's situation it certainly sounds like a
> > > > better solution than reverting this patch as this patch actually does
> > > > fix a problem reported by Jeffrin originally.
> > > >
> > > > Could you propose a specific patch that Alex and Jeffrin can perhaps
> > > > both try?
> > >
> > > Um, what's wrong with the one I originally attached and which you quote
> > > below? It's only compile tested, but I think it will work, if the
> > > theory is correct.
> >
> > Please send a legit patch, thanks.
> >
> > /Jarkko
>
> Jarkko,
>
> After the confirmation from Alex that this patch attached to the end of the thread
> worked, James did send a proper patch that can be accessed here:
> https://lore.kernel.org/linux-integrity/20200527155800.ya43xm2ltuwduwjg@cantor/T/#t
>
> Thanks,

Hi thanks a lot!
I read the full discussion and agree with the conclusions, since I now have a
patch in proper form.

Please ping a bit earlier next time. It's not that I don't want to deal with
issues as quickly as possible. It's probably just that I forgot or missed
something.

/Jarkko