Hi Jarkko, On 17.01.21 at 19:13, Jarkko Sakkinen wrote: > > I have hard time to believe that any of these patches are based on > actual regressions. > > /Jarko > patch 1 is indeed wrong (I oversaw the action call in case of error), so please ignore it. However patches 2 and 3 are based on bugs I encountered while working with TPM. I am sorry if I did not make the issues clear enough in the patches commit messages. Let me try to explain it in more detail: The bugs showed up after unloading the TPM chip driver module while one process still had the /dev/tpmrm device open. It is easy to reproduce: 1. open /dev/tpmrm* and keep it open 2. remove the tpm chip driver (in my case this was tpm_tis_spi) 3. try to write() to the still opened device /dev/tpmrm* This results in warnings like the following: Jan 24 14:01:20 raspberrypi kernel: ------------[ cut here ]------------ Jan 24 14:01:20 raspberrypi kernel: WARNING: CPU: 3 PID: 1161 at lib/refcount.c:25 kobject_get+0xa0/0xa4 Jan 24 14:01:20 raspberrypi kernel: refcount_t: addition on 0; use-after-free. Jan 24 14:01:20 raspberrypi kernel: Modules linked in: tpm_tis_spi tpm_tis_core tpm mdio_bcm_unimac brcmfmac sha256_generic libsha256 sha256_arm hci_uart btbcm bluetooth cfg80211 vc4 brcmutil ecdh_generic ecc snd_soc_core crc32_arm_ce libaes raspberrypi_hwmon ac97_bus snd_pcm_dmaengine bcm2711_thermal snd_pcm snd_timer genet snd phy_generic soundcore [last unloaded: spi_bcm2835] Jan 24 14:01:20 raspberrypi kernel: CPU: 3 PID: 1161 Comm: hold_open Not tainted 5.10.0ls-main-dirty #2 Jan 24 14:01:20 raspberrypi kernel: Hardware name: BCM2711 Jan 24 14:01:20 raspberrypi kernel: [<c0410c3c>] (unwind_backtrace) from [<c040b580>] (show_stack+0x10/0x14) Jan 24 14:01:20 raspberrypi kernel: [<c040b580>] (show_stack) from [<c1092174>] (dump_stack+0xc4/0xd8) Jan 24 14:01:20 raspberrypi kernel: [<c1092174>] (dump_stack) from [<c0445a30>] (__warn+0x104/0x108) Jan 24 14:01:20 raspberrypi kernel: [<c0445a30>] (__warn) from [<c0445aa8>] (warn_slowpath_fmt+0x74/0xb8) Jan 24 14:01:20 raspberrypi kernel: [<c0445aa8>] (warn_slowpath_fmt) from [<c08435d0>] (kobject_get+0xa0/0xa4) Jan 24 14:01:20 raspberrypi kernel: [<c08435d0>] (kobject_get) from [<bf0a715c>] (tpm_try_get_ops+0x14/0x54 [tpm]) Jan 24 14:01:20 raspberrypi kernel: [<bf0a715c>] (tpm_try_get_ops [tpm]) from [<bf0a7d6c>] (tpm_common_write+0x38/0x60 [tpm]) Jan 24 14:01:20 raspberrypi kernel: [<bf0a7d6c>] (tpm_common_write [tpm]) from [<c05a7ac0>] (vfs_write+0xc4/0x3c0) Jan 24 14:01:20 raspberrypi kernel: [<c05a7ac0>] (vfs_write) from [<c05a7ee4>] (ksys_write+0x58/0xcc) Jan 24 14:01:20 raspberrypi kernel: [<c05a7ee4>] (ksys_write) from [<c04001a0>] (ret_fast_syscall+0x0/0x4c) Jan 24 14:01:20 raspberrypi kernel: Exception stack(0xc226bfa8 to 0xc226bff0) Jan 24 14:01:20 raspberrypi kernel: bfa0: 00000000 000105b4 00000003 beafe664 00000014 00000000 Jan 24 14:01:20 raspberrypi kernel: bfc0: 00000000 000105b4 000103f8 00000004 00000000 00000000 b6f9c000 beafe684 Jan 24 14:01:20 raspberrypi kernel: bfe0: 0000006c beafe648 0001056c b6eb6944 Jan 24 14:01:20 raspberrypi kernel: ---[ end trace d4b8409def9b8b1f ]--- Jan 24 14:01:20 raspberrypi kernel: ------------[ cut here ]------------ Jan 24 14:01:20 raspberrypi kernel: WARNING: CPU: 3 PID: 1161 at lib/refcount.c:28 tpm_try_get_ops+0x4c/0x54 [tpm] Jan 24 14:01:20 raspberrypi kernel: refcount_t: underflow; use-after-free. Jan 24 14:01:20 raspberrypi kernel: Modules linked in: tpm_tis_spi tpm_tis_core tpm mdio_bcm_unimac brcmfmac sha256_generic libsha256 sha256_arm hci_uart btbcm bluetooth cfg80211 vc4 brcmutil ecdh_generic ecc snd_soc_core crc32_arm_ce libaes raspberrypi_hwmon ac97_bus snd_pcm_dmaengine bcm2711_thermal snd_pcm snd_timer genet snd phy_generic soundcore [last unloaded: spi_bcm2835] Jan 24 14:01:20 raspberrypi kernel: CPU: 3 PID: 1161 Comm: hold_open Tainted: G W 5.10.0ls-main-dirty #2 Jan 24 14:01:20 raspberrypi kernel: Hardware name: BCM2711 Jan 24 14:01:20 raspberrypi kernel: [<c0410c3c>] (unwind_backtrace) from [<c040b580>] (show_stack+0x10/0x14) Jan 24 14:01:20 raspberrypi kernel: [<c040b580>] (show_stack) from [<c1092174>] (dump_stack+0xc4/0xd8) Jan 24 14:01:20 raspberrypi kernel: [<c1092174>] (dump_stack) from [<c0445a30>] (__warn+0x104/0x108) Jan 24 14:01:20 raspberrypi kernel: [<c0445a30>] (__warn) from [<c0445aa8>] (warn_slowpath_fmt+0x74/0xb8) Jan 24 14:01:20 raspberrypi kernel: [<c0445aa8>] (warn_slowpath_fmt) from [<bf0a7194>] (tpm_try_get_ops+0x4c/0x54 [tpm]) Jan 24 14:01:20 raspberrypi kernel: [<bf0a7194>] (tpm_try_get_ops [tpm]) from [<bf0a7d6c>] (tpm_common_write+0x38/0x60 [tpm]) Jan 24 14:01:20 raspberrypi kernel: [<bf0a7d6c>] (tpm_common_write [tpm]) from [<c05a7ac0>] (vfs_write+0xc4/0x3c0) Jan 24 14:01:20 raspberrypi kernel: [<c05a7ac0>] (vfs_write) from [<c05a7ee4>] (ksys_write+0x58/0xcc) Jan 24 14:01:20 raspberrypi kernel: [<c05a7ee4>] (ksys_write) from [<c04001a0>] (ret_fast_syscall+0x0/0x4c) Jan 24 14:01:20 raspberrypi kernel: Exception stack(0xc226bfa8 to 0xc226bff0) Jan 24 14:01:20 raspberrypi kernel: bfa0: 00000000 000105b4 00000003 beafe664 00000014 00000000 Jan 24 14:01:20 raspberrypi kernel: bfc0: 00000000 000105b4 000103f8 00000004 00000000 00000000 b6f9c000 beafe684 Jan 24 14:01:20 raspberrypi kernel: bfe0: 0000006c beafe648 0001056c b6eb6944 Jan 24 14:01:20 raspberrypi kernel: ---[ end trace d4b8409def9b8b20 ]--- After some investigation I found out that the two warnings concerning a refcount underflow pop up due to an invalid chip->dev. Writing to /dev/tpmrm* eventually calls tpm_common_write() which makes use of tpm_try_get_ops(). The latter function tries to get a ref for chip->dev which at this time already has hit 0. As it turned out, the reference we lack here is exactly the one that is supposed to be taken in tpm_chip_alloc(): if (chip->flags & TPM_CHIP_FLAG_TPM2) get_device(&chip->dev); If you take a look at these lines and the code above you will figure that there is no program path in which TPM_CHIP_FLAG_TPM2 could ever be set at this time. So the reference is never taken, even if we are using TPM2 with /dev/tpmrm*. I tried to follow the history of this bug and it seems to start with commit fdc915f7f719 which introduced the support for /dev/tpmrm. This commit seemed to have already in mind that /dev/tpmrm could still be opened after the tpm chip driver has been unregistered. For this reason an extra reference to chip->dev was taken. See this excerpt of commit fdc915f7f719: <SNIP> + chip->devs.release = tpm_devs_release; + /* get extra reference on main device to hold on + * behalf of devs. This holds the chip structure + * while cdevs is in use. The corresponding put + * is in the tpm_devs_release + */ <SNAP> This reference was supposed to be freed in tpm_devs_release() as the comment says: <SNIP> +static void tpm_devs_release(struct device *dev) +{ + struct tpm_chip *chip = container_of(dev, struct tpm_chip, devs); + + /* release the master device reference */ + put_device(&chip->dev); +} <SNAP> However the code did not work as expected at this time, because the reference to chip->devs that we get with device_initalize() is never release. This prevents tpm_devs_release() from being called which as a consequnce misses the put on chip->dev we do there. This led to behaviour which commit 8979b02aaf1d tried to fix. See the message of this commit: " The main device is currently not properly released due to one additional reference to the 'devs' device which is only released in case of a TPM 2. So, also get the additional reference only in case of a TPM2." Actually the additional reference was _never_ released even not in TPM2 case (as described above). The right fix at this point would IMHO have been to make sure that chip->devs reference is put one more time to make sure that tpm_devs_release() is called and also chip->dev is put one more time. Instead 8979b02aaf1d chose another approach: + if (chip->flags & TPM_CHIP_FLAG_TPM2) + get_device(&chip->dev); was introduced to avoid getting the reference in the first place for the non-TPM2 case. This fixed the non-TPM2 case but now left the TPM2 case without an additional reference, too (since as stated above there is nothing that sets the needed flag to grap the ref for TPM2). So while before we had taken the ref in both cases TPM2 and non-TPM2 and never released it in either case (which was wrong), we now do not take it in neither case (which is also wrong). And this is what we have today. If we open /dev/tpmrm* and then unregister the chip all references to chip->dev are already gone. An attempt to write() to /dev/tpmrm* then results in the reference underflow warning, because we try to grab a ref to chip->dev in tpm_try_get_ops() while the chip-dev refcount already has hit 0. So what patch 2 is supposed to do is 1st. revert 8979b02aaf1d: - if (chip->flags & TPM_CHIP_FLAG_TPM2) - get_device(&chip->dev); + get_device(&chip->dev); to get the extra ref to chip->dev unconditionally as originally proposed by commit fdc915f7f7193. And 2nd. fix the actual issue of fdc915f7f7193 namely a missing put of chip->devs to eventually release the extra reference to the main device (aka. chip->dev). + rc = devm_add_action_or_reset(pdev, + (void (*)(void *)) put_device, + &chip->devs); I hope this makes patch 2 a bit clearer. Beside an older kernel I tested it with the mainline kernel and the patch fixes the above issues. However anyone is welcome to test it themselfes. If you are fine with this solution I am going to prepare a v2 which fixes the action handler part. Otherwise please let me know what your concerns are and I will try to address it. Now concerning patch 3: If after steps 1 - 3 above you do 4. close /dev/tpmrm* you will run into the next issue, which is unrelated to the reference count bug, namely a NULL pointer dereference: ---------------- tpmrm_release Unable to handle kernel NULL pointer dereference at virtual address 00000034 pgd = 22aa2cf3 [00000034] *pgd=00000000 Internal error: Oops: 5 [#1] PREEMPT SMP ARM Modules linked in: tpm_tis_spi tpm_tis_core tpm ipt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat_ipv4 nf_nat br_netfilter bridge stp llc overlay cmac bnep hci_uart btbcm serdev bluetooth ecdh_generic ftdi_sio brcmfmac usbserial brcmutil sha256_generic cfg80211 rfkill raspberrypi_hwmon bcm2835_codec(C) hwmon bcm2835_v4l2(C) v4l2_mem2mem bcm2835_mmal_vchiq(C) v4l2_common videobuf2_dma_contig videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev vc_sm_cma(C) media evdev ipt_REJECT nf_reject_ipv4 iptable_filter gpio_wdt gpio_keys fixed mcp320x ad5446 industrialio uio_pdrv_genirq spidev uio ip6t_REJECT nf_reject_ipv6 xt_recent xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_limit ip6t_rt ip6table_filter ip6_tables i2c_dev ip_tables x_tables ipv6 [last unloaded: spi_bcm2835] CPU: 1 PID: 1953 Comm: iotedged Tainted: G WC 4.19.95-rt38-MAIN-v7+ #6 Hardware name: BCM2835 PC is at tpm_chip_start+0x1c/0xc0 [tpm] LR is at tpm2_del_space+0x34/0x88 [tpm] pc : [<7f98407c>] lr : [<7f988808>] psr: 60070013 sp : a5a53e88 ip : a5a53ea0 fp : a5a53e9c r10: aab63308 r9 : 00000008 r8 : b8fa60a0 r7 : bb40e210 r6 : b379f4d4 r5 : b379f000 r4 : b379f000 r3 : 00000000 r2 : b4c51000 r1 : 00000002 r0 : b379f000 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c5383d Table: 259bc06a DAC: 00000055 Process iotedged (pid: 1953, stack limit = 0x128dc27e) Stack: (0xa5a53e88 to 0xa5a54000) 3e80: bbf67078 b379f000 a5a53ec4 a5a53ea0 7f988808 7f98406c 3ea0: bbf66000 aab63300 b7fef578 bb40e210 b8fa60a0 00000008 a5a53edc a5a53ec8 3ec0: 7f98a7e8 7f9887e0 aab63300 00000000 a5a53f14 a5a53ee0 802f1890 7f98a7a8 3ee0: 00000000 00000000 8021add0 aab63300 b4c515e4 b4c51000 80ea6888 b4c51614 3f00: ba383200 000000f8 a5a53f24 a5a53f18 802f1a4c 802f1804 a5a53f4c a5a53f28 3f20: 8014444c 802f1a40 b4c51000 badfa880 a5a52000 a5a53f54 00000000 000005fc 3f40: a5a53f74 a5a53f50 80127794 801443b4 a5a53f94 a5a53f60 802f096c 80e07588 3f60: 00000000 000000f8 a5a53f94 a5a53f78 80128030 80127380 76c156d8 76c156d8 3f80: 00000000 000000f8 a5a53fa4 a5a53f98 801280d0 80127fec 00000000 a5a53fa8 3fa0: 80101000 801280bc 76c156d8 76c156d8 00000001 00000000 c6c82300 00000001 3fc0: 76c156d8 76c156d8 00000000 000000f8 00000002 00000000 76c1a140 76c17000 3fe0: 000000f8 7e960a64 76b96389 76b38746 60070030 00000001 00000000 00000000 [<7f98407c>] (tpm_chip_start [tpm]) from [<7f988808>] (tpm2_del_space+0x34/0x88 [tpm]) [<7f988808>] (tpm2_del_space [tpm]) from [<7f98a7e8>] (tpmrm_release+0x4c/0x5c [tpm]) [<7f98a7e8>] (tpmrm_release [tpm]) from [<802f1890>] (__fput+0x98/0x1e0) [<802f1890>] (__fput) from [<802f1a4c>] (____fput+0x18/0x1c) [<802f1a4c>] (____fput) from [<8014444c>] (task_work_run+0xa4/0xc0) [<8014444c>] (task_work_run) from [<80127794>] (do_exit+0x420/0xc20) [<80127794>] (do_exit) from [<80128030>] (do_group_exit+0x50/0xd0) [<80128030>] (do_group_exit) from [<801280d0>] (__wake_up_parent+0x0/0x30) [<801280d0>] (__wake_up_parent) from [<80101000>] (ret_fast_syscall+0x0/0x28) Exception stack(0xa5a53fa8 to 0xa5a53ff0) 3fa0: 76c156d8 76c156d8 00000001 00000000 c6c82300 00000001 3fc0: 76c156d8 76c156d8 00000000 000000f8 00000002 00000000 76c1a140 76c17000 3fe0: 000000f8 7e960a64 76b96389 76b38746 Code: e52de004 e8bd4000 e5903430 e1a04000 (e5932034) ---[ end trace 0000000000000006 ]--- AFAIU this is since at the time the file is closed and the files release() function is called the chips ops pointer is already NULL. It has been set to NULL at the time when the chip was unregistered (see tpm_del_char_device()). The fix here is to first check if the ops pointer is still valid. That should be true as long as the chip has not been unregistered. If the chip has been unregistered already the ops pointer wont be accessable any more. In this case skip flushing the sessions. However please ignore this patch for now, too, since while it works with an older kernel (4.19 is the one I am working with) it turned out not to be the correct solution for the mainline kernel. I would have to take a closer look for the mainline case and then send a patch as soon as I have one. Regards, Lino