On Fri, Jul 10, 2020 at 12:08 PM James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote: > > On Thu, 2020-07-09 at 17:22 -0700, Andrey Pronin wrote: > > This patch prevents NULL dereferencing when using chip->ops while > > sending TPM2_Shutdown command if both tpm_class_shutdown handler and > > tpm_del_char_device are called during system shutdown. > > > > Both these handlers set chip->ops to NULL but don't check if it's > > already NULL when they are called before using it. > > > > This issue was revealed in Chrome OS after a recent set of changes > > to the unregister order for spi controllers, such as: > > b4c6230bb0ba spi: Fix controller unregister order > > f40913d2dca1 spi: pxa2xx: Fix controller unregister order > > and similar for other controllers. > > > > Signed-off-by: Andrey Pronin <apronin@xxxxxxxxxxxx> > > --- > > drivers/char/tpm/tpm-chip.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm- > > chip.c > > index 8c77e88012e9..a410ca40a3c5 100644 > > --- a/drivers/char/tpm/tpm-chip.c > > +++ b/drivers/char/tpm/tpm-chip.c > > @@ -296,7 +296,7 @@ static int tpm_class_shutdown(struct device *dev) > > struct tpm_chip *chip = container_of(dev, struct tpm_chip, > > dev); > > > > down_write(&chip->ops_sem); > > - if (chip->flags & TPM_CHIP_FLAG_TPM2) { > > + if (chip->ops && (chip->flags & TPM_CHIP_FLAG_TPM2)) { > > if (!tpm_chip_start(chip)) { > > tpm2_shutdown(chip, TPM2_SU_CLEAR); > > tpm_chip_stop(chip); > > @@ -479,7 +479,7 @@ static void tpm_del_char_device(struct tpm_chip > > *chip) > > > > /* Make the driver uncallable. */ > > down_write(&chip->ops_sem); > > - if (chip->flags & TPM_CHIP_FLAG_TPM2) { > > + if (chip->ops && (chip->flags & TPM_CHIP_FLAG_TPM2)) { > > if (!tpm_chip_start(chip)) { > > tpm2_shutdown(chip, TPM2_SU_CLEAR); > > tpm_chip_stop(chip); > > I really don't think this is the right fix. The problem is that these > two functions are trying to open code tpm_try_get_ops/tpm_put_ops (only > really for the tpm2 shutdown) because they want to NULL out the ops > before final mutex unlock. The problem with the current open coding is > it doesn't shut down the clock if required (not really a problem for > shutdown, but might cause issues for simple rmmod). I think this is > because no-one noticed the open coding when get/put was updated. > > This code should all be abstracted into a single function and shared > with tpm_try_get_ops/tpm_put_ops so we can't have this happen in > future. Possibly there should be a chip shutdown function which is > only active for TPM2 which does the correct try_get/shutdown/put > operation and then a separate simple get mutex, null ops, put mutex one > that's guaranteed to be called last. Yes, went for a minimal patch here to stop kernel panics, didn't try to refactor. Note that we do hold chip->ops_sem in both cases, and it's a write-lock, not a read-lock (as tpm_try_get_ops uses) since we are changing chip->ops. Thanks to this write-lock there, shouldn't be parallel operations that use chip->ops (so not locking chip->tpm_mutex shouldn't affect it). So, if I understand the idea right, can refactor to something like: 1) extract common code between tpm_del_char_device and tpm_class_shutdown into a shared method; 2) further extract the part between up/down(chip->ops_sem) to be re-used between tpm_try_get_ops/tpm_put_ops and this flow; 3) still have down_write/up_write in this flow vs get/put_device + down_read/up_read in tpm_try_get_ops case. Please let me know if that's a bad idea. Will be unavailable next week, but will continue after that. > > James > -- Andrey