From: Vladis Dronov <vdronov@xxxxxxxxxx>
Date: Fri, 27 Dec 2019 03:26:27 +0100

> In a case when a ptp chardev (like /dev/ptp0) is open but an underlying
> device is removed, closing this file leads to a race. This reproduces
> easily in a kvm virtual machine:
 ...
> This happens in:
>
> static void __fput(struct file *file)
> {   ...
>     if (file->f_op->release)
>         file->f_op->release(inode, file); <<< cdev is kfree'd here
>     if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL &&
>                  !(mode & FMODE_PATH))) {
>         cdev_put(inode->i_cdev); <<< cdev fields are accessed here
>
> Namely:
>
> __fput()
>   posix_clock_release()
>     kref_put(&clk->kref, delete_clock) <<< the last reference
>       delete_clock()
>         delete_ptp_clock()
>           kfree(ptp) <<< cdev is embedded in ptp
>   cdev_put
>     module_put(p->owner) <<< *p is kfree'd, bang!
>
> Here cdev is embedded in posix_clock which is embedded in ptp_clock.
> The race happens because ptp_clock's lifetime is controlled by two
> refcounts: kref and cdev.kobj in posix_clock. This is wrong.
>
> Make ptp_clock's sysfs device a parent of cdev with cdev_device_add()
> created especially for such cases. This way the parent device with its
> ptp_clock is not released until all references to the cdev are released.
> This adds a requirement that an initialized but not exposed struct
> device should be provided to posix_clock_register() by a caller instead
> of a simple dev_t.
>
> This approach was adopted from the commit 72139dfa2464 ("watchdog: Fix
> the race between the release of watchdog_core_data and cdev"). See
> details of the implementation in the commit 233ed09d7fda ("chardev: add
> helper function to register char devs with a struct device").
>
> Link: https://lore.kernel.org/linux-fsdevel/20191125125342.6189-1-vdronov@xxxxxxxxxx/T/#u
> Analyzed-by: Stephen Johnston <sjohnsto@xxxxxxxxxx>
> Analyzed-by: Vern Lovejoy <vlovejoy@xxxxxxxxxx>
> Signed-off-by: Vladis Dronov <vdronov@xxxxxxxxxx>

Applied, thanks.
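
For anyone following the lifetime argument in the quoted description, the
idea behind parenting the cdev to the device can be sketched as a small
userspace C model. This is not kernel code and not the patch itself: struct
kobj, struct model_ptp_clock and every model_* function are hypothetical
stand-ins, and the "freed" flag marks the point where the real code would
kfree(ptp). The only property it tries to show is that the child holds a
reference on its parent, so the parent's release cannot run while a
reference to the child is still held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; these are NOT the kernel's kobject, cdev or device. */
struct kobj {
    int refcount;
    struct kobj *parent;             /* a child pins its parent while it lives */
    void (*release)(struct kobj *k); /* runs when refcount drops to zero */
};

struct model_ptp_clock {
    struct kobj dev;   /* stands in for the ptp_clock's struct device */
    struct kobj cdev;  /* stands in for the embedded struct cdev */
    bool freed;        /* set where the real code would kfree(ptp) */
};

static void kobj_get(struct kobj *k) { k->refcount++; }

static void kobj_put(struct kobj *k)
{
    struct kobj *parent;

    if (--k->refcount > 0)
        return;
    parent = k->parent;   /* save first: release may invalidate k */
    if (k->release)
        k->release(k);
    if (parent)
        kobj_put(parent); /* the child's last put unpins the parent */
}

/* Release callback of the model device: the whole object goes away here. */
static void model_ptp_release(struct kobj *dev)
{
    struct model_ptp_clock *ptp = (struct model_ptp_clock *)
        ((char *)dev - offsetof(struct model_ptp_clock, dev));

    ptp->freed = true;
}

/* Models initializing the device and adding the cdev with it as parent. */
static void model_register(struct model_ptp_clock *ptp)
{
    ptp->freed = false;
    ptp->dev = (struct kobj){ .refcount = 1, .release = model_ptp_release };
    ptp->cdev = (struct kobj){ .refcount = 1, .parent = &ptp->dev };
    kobj_get(&ptp->dev); /* reference held on behalf of the cdev */
}

/* Models open() on the chardev: the open file pins the cdev. */
static void model_open(struct model_ptp_clock *ptp)
{
    kobj_get(&ptp->cdev);
}

/* Models device removal: registration and driver references are dropped. */
static void model_unregister(struct model_ptp_clock *ptp)
{
    kobj_put(&ptp->cdev);
    kobj_put(&ptp->dev);
}

/* Models the final cdev_put() done by __fput() when the file is closed. */
static void model_close(struct model_ptp_clock *ptp)
{
    kobj_put(&ptp->cdev);
}
```

Calling model_register(), model_open() and model_unregister() in that order
leaves "freed" false, because the open file still pins the cdev and the cdev
still pins the device. Only the following model_close() drops the last cdev
reference and, through it, the last device reference. In the model the two
counts are chained rather than independent, which is the property the
quoted description asks for.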