On Fri, Jul 10, 2020 at 4:40 AM Jarkko Sakkinen <jarkko.sakkinen@xxxxxxxxxxxxxxx> wrote: > > On Thu, Jul 09, 2020 at 05:22:09PM -0700, Andrey Pronin wrote: > > This patch prevents NULL dereferencing when using chip->ops while > > sending TPM2_Shutdown command if both tpm_class_shutdown handler and > > tpm_del_char_device are called during system shutdown. > > > > Both these handlers set chip->ops to NULL but don't check if it's > > already NULL when they are called before using it. > > > > This issue was revealed in Chrome OS after a recent set of changes > > to the unregister order for spi controllers, such as: > > b4c6230bb0ba spi: Fix controller unregister order > > f40913d2dca1 spi: pxa2xx: Fix controller unregister order > > and similar for other controllers. > > I'm not sure I fully understand the scenario. When does thi happen? This happens during system shutdown. Here a sample stack trace from the panic: BUG: unable to handle kernel NULL pointer dereference at 0000000000000058 ... Call Trace: tpm_transmit_cmd+0x21/0x7f tpm2_shutdown+0x84/0xc6 tpm_chip_unregister+0xa2/0xb9 cr50_spi_remove+0x19/0x26 spi_drv_remove+0x2a/0x42 device_release_driver_internal+0x123/0x1ec bus_remove_device+0xe8/0x111 device_del+0x1bf/0x319 ? spi_unregister_controller+0xfc/0xfc device_unregister+0x12/0x28 __unregister+0xe/0x12 device_for_each_child+0x79/0xb8 spi_unregister_controller+0x27/0xfc pxa2xx_spi_remove+0x45/0xb4 device_shutdown+0x181/0x1d3 kernel_restart+0x12/0x56 SyS_reboot+0x16a/0x1e7 do_syscall_64+0x6b/0xf7 entry_SYSCALL_64_after_hwframe+0x41/0xa6 > Why does not tpm_del_char_device need this? "Not" is a typo in the sentence above, right? tpm_del_char_device *does* need the fix. When tpm_class_shutdown is called it sets chip->ops to NULL. If tpm_del_char_device is called after that, it doesn't check if chip->ops is NULL (normal kernel API and char device API calls go through tpm_try_get_ops, but tpm_del_char_device doesn't) and proceeds to call tpm2_shutdown(), which tries sending the command and dereferences chip->ops. > The changes listed tell > me nothing. Why they have this effect? "spi: pxa2xx: Fix controller unregister order" adds a spi_unregister_controller call to pxa2xx_spi_remove, which internally calls tpm_del_char_device, which results in the stack trace leading to the panic above. > > I'm just trying to understand whether this could be a regression or > not. > > I neither understand what you mean by "and similar for other > controllers." There was a series of spi unregister order changes for various spi controllers, including the following: 1c6221b430a0 spi: bcm2835: Fix controller unregister order f40913d2dca1 spi: pxa2xx: Fix controller unregister order b4c6230bb0ba spi: Fix controller unregister order 54000d2e15e9 spi: dw: Fix controller unregister order c8f309db490e spi: bcm2835aux: Fix controller unregister order > > NAK for the reason that I don't understand what I'm merging. > > /Jarkko > > > > > Signed-off-by: Andrey Pronin <apronin@xxxxxxxxxxxx> > > --- > > drivers/char/tpm/tpm-chip.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c > > index 8c77e88012e9..a410ca40a3c5 100644 > > --- a/drivers/char/tpm/tpm-chip.c > > +++ b/drivers/char/tpm/tpm-chip.c > > @@ -296,7 +296,7 @@ static int tpm_class_shutdown(struct device *dev) > > struct tpm_chip *chip = container_of(dev, struct tpm_chip, dev); > > > > down_write(&chip->ops_sem); > > - if (chip->flags & TPM_CHIP_FLAG_TPM2) { > > + if (chip->ops && (chip->flags & TPM_CHIP_FLAG_TPM2)) { > > if (!tpm_chip_start(chip)) { > > tpm2_shutdown(chip, TPM2_SU_CLEAR); > > tpm_chip_stop(chip); > > @@ -479,7 +479,7 @@ static void tpm_del_char_device(struct tpm_chip *chip) > > > > /* Make the driver uncallable. */ > > down_write(&chip->ops_sem); > > - if (chip->flags & TPM_CHIP_FLAG_TPM2) { > > + if (chip->ops && (chip->flags & TPM_CHIP_FLAG_TPM2)) { > > if (!tpm_chip_start(chip)) { > > tpm2_shutdown(chip, TPM2_SU_CLEAR); > > tpm_chip_stop(chip); > > -- > > 2.25.1 > > -- Andrey