From: Long Li <longli@xxxxxxxxxxxxx> Sent: Friday, July 2, 2021 4:59 PM
> > > >
> > > > [snip]
> > > >
> > > > > +static void az_blob_remove_device(struct az_blob_device *dev)
> > > > > +{
> > > > > +	wait_event(dev->file_wait, list_empty(&dev->file_list));
> > > > > +	misc_deregister(&az_blob_misc_device);
> > > > > +#ifdef CONFIG_DEBUG_FS
> > > > > +	debugfs_remove_recursive(az_blob_debugfs_root);
> > > > > +#endif
> > > > > +	/* At this point, we won't get any requests from user-mode */
> > > > > +}
> > > > > +
> > > > > +static int az_blob_create_device(struct az_blob_device *dev)
> > > > > +{
> > > > > +	int rc;
> > > > > +	struct dentry *d;
> > > > > +
> > > > > +	rc = misc_register(&az_blob_misc_device);
> > > > > +	if (rc) {
> > > > > +		az_blob_err("misc_register failed rc %d\n", rc);
> > > > > +		return rc;
> > > > > +	}
> > > > > +
> > > > > +#ifdef CONFIG_DEBUG_FS
> > > > > +	az_blob_debugfs_root = debugfs_create_dir("az_blob", NULL);
> > > > > +	if (!IS_ERR_OR_NULL(az_blob_debugfs_root)) {
> > > > > +		d = debugfs_create_file("pending_requests", 0400,
> > > > > +					az_blob_debugfs_root, NULL,
> > > > > +					&az_blob_debugfs_fops);
> > > > > +		if (IS_ERR_OR_NULL(d)) {
> > > > > +			az_blob_warn("failed to create debugfs file\n");
> > > > > +			debugfs_remove_recursive(az_blob_debugfs_root);
> > > > > +			az_blob_debugfs_root = NULL;
> > > > > +		}
> > > > > +	} else
> > > > > +		az_blob_warn("failed to create debugfs root\n");
> > > > > +#endif
> > > > > +
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +static int az_blob_connect_to_vsp(struct hv_device *device,
> > > > > +				  u32 ring_size)
> > > > > +{
> > > > > +	int ret;
> > > > > +
> > > > > +	spin_lock_init(&az_blob_dev.file_lock);
> > > >
> > > > I'd argue that the spin lock should not be re-initialized here.
> > > > Here's the sequence where things go wrong:
> > > >
> > > > 1) The driver is unbound, so az_blob_remove() is called.
> > > > 2) az_blob_remove() sets the "removing" flag to true, and calls
> > > > az_blob_remove_device().
> > > > 3) az_blob_remove_device() waits for the file_list to become empty.
> > > > 4) After the file_list becomes empty, but before misc_deregister()
> > > > is called, a separate thread opens the device again.
> > > > 5) In the separate thread, az_blob_fop_open() obtains the file_lock
> > > > spin lock.
> > > > 6) Before az_blob_fop_open() releases the spin lock,
> > > > az_blob_remove_device() completes, along with az_blob_remove().
> > > > 7) Then the device gets rebound, and az_blob_connect_to_vsp() gets
> > > > called, all while az_blob_fop_open() still holds the spin lock. So
> > > > the spin lock gets re-initialized while it is held.
> > > >
> > > > This is admittedly a far-fetched scenario, but stranger things have
> > > > happened. :-) The issue is that you are counting on the az_blob_dev
> > > > structure to persist and have a valid file_lock, even while the
> > > > device is unbound. So any initialization should only happen in
> > > > az_blob_drv_init().
> > >
> > > I'm not sure if az_blob_probe() and az_blob_remove() can be called at
> > > the same time, as az_blob_remove_vmbus() is called last in
> > > az_blob_remove().
> > > Is it possible for vmbus to ask the driver to probe a new channel
> > > before the old channel is closed? I expect vmbus to guarantee
> > > that those calls are made in sequence.
> >
> > In my scenario above, az_blob_remove_vmbus() and az_blob_remove() run
> > to completion in Step #6, all while some other thread is still in the
> > middle of an open() call and holding the file_lock spin lock. Then in
> > Step #7 az_blob_probe() runs. So az_blob_remove() and az_blob_probe()
> > execute sequentially, not at the same time.
> >
> > Michael
>
> I think it's a valid scenario. The return value of devtmpfs_delete_node() is
> not checked in misc_deregister(). It decreases the refcount on inodes but it's
> not guaranteed that no one else is still using it (in the middle of opening
> a file).
>
> However, this works fine for "rmmod", which causes the device to be removed.
> Before a file is opened, the refcount on the module is increased, so the
> module can't be removed while a file is being opened. The scenario you
> described can't happen.
>
> But during VMBUS rescind, it can happen. It's possible that the driver is using
> the spinlock that has been re-initialized, when the next VMBUS offer on the
> same channel comes before all the in-progress open file calls exit.

In my scenario, Step #1 is an unbind operation, not a module removal. But
you make a valid point about VMbus rescind, which has the same effect as
unbind.

>
> This is very rare. I agree that such things happen and that we should make
> sure the driver can handle this. I'll update the driver.

Sounds good.

Michael
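
For reference, the fix agreed on above (initialize file_lock once at driver init, never again on probe) can be sketched in user space. This is only an illustrative analog under stated assumptions, not the driver's actual code: a pthread mutex stands in for the spin lock, and struct sketch_dev, sketch_drv_init(), sketch_connect() and sketch_remove() are hypothetical names that loosely mirror az_blob_dev, az_blob_drv_init(), az_blob_connect_to_vsp() and az_blob_remove(). Whether the real driver updates the "removing" flag under file_lock is also an assumption.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the driver's az_blob_dev. */
struct sketch_dev {
	pthread_mutex_t file_lock;	/* plays the role of the file_lock spin lock */
	int lock_init_count;		/* how many times file_lock was initialized */
	bool removing;			/* per-binding state, reset on every bind */
};

static struct sketch_dev sketch;

/* Analog of az_blob_drv_init(): runs once, before any probe or open. */
static int sketch_drv_init(void)
{
	int rc = pthread_mutex_init(&sketch.file_lock, NULL);

	if (rc)
		return rc;
	sketch.lock_init_count++;
	return 0;
}

/* Analog of az_blob_connect_to_vsp(): runs on every (re)bind. */
static void sketch_connect(void)
{
	/* The lock already exists; it is never re-initialized here. */
	assert(sketch.lock_init_count == 1);

	/* A late open() still holding the lock simply makes us wait. */
	pthread_mutex_lock(&sketch.file_lock);
	sketch.removing = false;
	pthread_mutex_unlock(&sketch.file_lock);
}

/* Analog of az_blob_remove(): marks the device as going away. */
static void sketch_remove(void)
{
	pthread_mutex_lock(&sketch.file_lock);
	sketch.removing = true;
	pthread_mutex_unlock(&sketch.file_lock);
}
```

With this split, a bind/unbind/rebind cycle only takes and releases the lock. A thread still inside open() during Step #7 delays the rebind path instead of having the lock reset underneath it.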