Re: Kernel 5.0 regression in /dev/tpm0 access


 



On Sat, 2019-03-09 at 14:01 -0800, James Bottomley wrote:
> On Sat, 2019-03-09 at 22:48 +0200, Mantas Mikulėnas wrote:
[...]
> openat(AT_FDCWD, "/dev/tpmrm0", O_RDWR) = 3
> write(3, "\200\1\0\0\0\26\0\0\1z\0\0\0\0\0\0\0\0\0\0\0@", 22) = 22
> read(3, "\200\1\0\0\0\235\0\0\0\0\0\0\0\0\0\0\0\0\27\0\1\0\0\0\t\0\4\0\0\0\4\0"..., 4096) = 157
> close(3)
> 
> So we do a simple write command and read the return (which simply
> hangs until the TPM is ready with the data).  We don't poll like your
> application does above, so it seems obvious that the break must be in
> the polling code.

OK, so the polled sequence should be:

write()
poll()
read()

So I think this condition in tpm_common_poll is the problem:

	if (!priv->response_read || priv->response_length)
		mask = EPOLLIN | EPOLLRDNORM;

If something wakes the waiter registered via poll_wait() before the
command returns, that condition is true because we set response_read to
false in write().  So I think poll() is returning prematurely.

The reason you don't often see the problem under tracing is that if the
queued work has time to execute *before* poll returns, it has already
taken the mutex, and read() will block trying to acquire that mutex
until the command completes.  If you're fast enough, the queued work
hasn't run yet, the mutex isn't taken, and read() acquires it and
returns with no data.

I think the fix may be to make poll only return POLLIN if we have a
response_length, so

	if (priv->response_length)
		mask = EPOLLIN | EPOLLRDNORM;

That way the calling program will get POLLOUT and go back to re-polling 
until we have data.
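
To make the difference concrete, here is a small userspace model of
the two conditions.  It is only a sketch: struct poll_model and both
function names are hypothetical, it uses POLLIN/POLLOUT from <poll.h>
rather than the kernel's EPOLL* masks, and the "otherwise POLLOUT"
branch is an assumption based on the description above, not a copy of
the driver code.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two fields the condition reads; this is
 * not the driver's real structure. */
struct poll_model {
	bool response_read;     /* set to false by write(), as described above */
	size_t response_length; /* non-zero only once a response is buffered */
};

/* Condition as it stands: also reports readable while a command is pending. */
short mask_current(const struct poll_model *p)
{
	assert(p != NULL);
	if (!p->response_read || p->response_length)
		return POLLIN;
	return POLLOUT;	/* assumed fallback */
}

/* Proposed condition: readable only when response bytes are available. */
short mask_proposed(const struct poll_model *p)
{
	assert(p != NULL);
	if (p->response_length)
		return POLLIN;
	return POLLOUT;	/* assumed fallback */
}
```

For the state right after write() (response_read false,
response_length zero) mask_current() reports POLLIN even though there
is nothing to read, while mask_proposed() reports POLLOUT.  Once
response_length is non-zero, both report POLLIN.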

James



