RE: [Patch v2 2/3] Drivers: hv: add Azure Blob driver

Michael Kelley <mikelley@xxxxxxxxxxxxx> · Sat, 3 Jul 2021 14:37:06 +0000

From: Long Li <longli@xxxxxxxxxxxxx> Sent: Friday, July 2, 2021 4:59 PM
> >
> > [snip]
> >
> > > > > +static void az_blob_remove_device(struct az_blob_device *dev)
> > > > > +{
> > > > > +	wait_event(dev->file_wait, list_empty(&dev->file_list));
> > > > > +	misc_deregister(&az_blob_misc_device);
> > > > > +#ifdef CONFIG_DEBUG_FS
> > > > > +	debugfs_remove_recursive(az_blob_debugfs_root);
> > > > > +#endif
> > > > > +	/* At this point, we won't get any requests from user-mode */
> > > > > +}
> > > > > +
> > > > > +static int az_blob_create_device(struct az_blob_device *dev)
> > > > > +{
> > > > > +	int rc;
> > > > > +	struct dentry *d;
> > > > > +
> > > > > +	rc = misc_register(&az_blob_misc_device);
> > > > > +	if (rc) {
> > > > > +		az_blob_err("misc_register failed rc %d\n", rc);
> > > > > +		return rc;
> > > > > +	}
> > > > > +
> > > > > +#ifdef CONFIG_DEBUG_FS
> > > > > +	az_blob_debugfs_root = debugfs_create_dir("az_blob", NULL);
> > > > > +	if (!IS_ERR_OR_NULL(az_blob_debugfs_root)) {
> > > > > +		d = debugfs_create_file("pending_requests", 0400,
> > > > > +			az_blob_debugfs_root, NULL,
> > > > > +			&az_blob_debugfs_fops);
> > > > > +		if (IS_ERR_OR_NULL(d)) {
> > > > > +			az_blob_warn("failed to create debugfs file\n");
> > > > > +			debugfs_remove_recursive(az_blob_debugfs_root);
> > > > > +			az_blob_debugfs_root = NULL;
> > > > > +		}
> > > > > +	} else
> > > > > +		az_blob_warn("failed to create debugfs root\n");
> > > > > +#endif
> > > > > +
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +static int az_blob_connect_to_vsp(struct hv_device *device,
> > > > > +				  u32 ring_size)
> > > > > +{
> > > > > +	int ret;
> > > > > +
> > > > > +	spin_lock_init(&az_blob_dev.file_lock);
> > > >
> > > > I'd argue that the spin lock should not be re-initialized here.
> > > > Here's the sequence where things go wrong:
> > > >
> > > > 1) The driver is unbound, so az_blob_remove() is called.
> > > > 2) az_blob_remove() sets the "removing" flag to true, and calls
> > > > az_blob_remove_device().
> > > > 3) az_blob_remove_device() waits for the file_list to become empty.
> > > > 4) After the file_list becomes empty, but before misc_deregister()
> > > > is called, a separate thread opens the device again.
> > > > 5) In the separate thread, az_blob_fop_open() obtains the file_lock spin
> > > > lock.
> > > > 6) Before az_blob_fop_open() releases the spin lock,
> > > > az_blob_remove_device() completes, along with az_blob_remove().
> > > > 7) Then the device gets rebound, and az_blob_connect_to_vsp() gets
> > > > called, all while az_blob_fop_open() still holds the spin lock.  So
> > > > the spin lock gets re-initialized while it is held.
> > > >
> > > > This is admittedly a far-fetched scenario, but stranger things have
> > > > happened. :-)  The issue is that you are counting on the az_blob_dev
> > > > structure to persist and have a valid file_lock, even while the
> > > > device is unbound.  So any initialization should only happen in
> > > > az_blob_drv_init().
> > >
> > > I'm not sure if az_blob_probe() and az_blob_remove() can be called at
> > > the same time, as az_blob_remove_vmbus() is called last in
> > > az_blob_remove().
> > > Is it possible for vmbus to ask the driver to probe a new channel
> > > before the old channel is closed? I expect vmbus to guarantee
> > > that those calls are made in sequence.
> >
> > In my scenario above, az_blob_remove_vmbus() and az_blob_remove() run
> > to completion in Step #6, all while some other thread is still in the
> > middle of an open() call and holding the file_lock spin lock.  Then in
> > Step #7 az_blob_probe() runs.  So az_blob_remove() and az_blob_probe()
> > execute sequentially, not at the same time.
> >
> > Michael
> 
> I think it's a valid scenario.  The return value of devtmpfs_delete_node() is
> not checked in misc_deregister(). It decreases the refcount on inodes, but
> that does not guarantee that no one else is still using the device (for
> example, in the middle of opening a file).
> 
> However, this works fine for an "rmmod" that causes the device to be removed.
> Before the file is opened, the refcount on the module is increased, so the
> module can't be removed while the file is being opened. The scenario you
> described can't happen in that case.
> 
> But during a VMBUS rescind, it can happen. It's possible that the driver is
> using a spinlock that has been re-initialized, when the next VMBUS offer on
> the same channel arrives before all the in-progress open calls exit.

In my scenario, Step #1 is an unbind operation, not a module removal.  But
you make a valid point about VMbus rescind, which has the same effect as
unbind.
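
To make the "initialize once" point concrete, here is a rough userspace
sketch, not driver code.  It assumes a C11 atomic_flag as a stand-in
for the spin lock, and struct blob_dev plus all of the blob_*() helpers
are hypothetical names of mine, not anything in the patch.  The only
thing it tries to show is that the lock is set up in the drv_init
analogue and never touched by the probe analogue, so a holder that
straddles a remove/probe cycle still owns a valid lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Userspace stand-in for the driver-global az_blob_dev.  The struct and
 * every blob_*() helper below are hypothetical names for illustration.
 */
struct blob_dev {
	atomic_flag file_lock;	/* plays the role of spinlock_t file_lock */
	bool removing;
};

static struct blob_dev blob_dev;

/* Analogue of az_blob_drv_init(): the only place the lock is initialized. */
static void blob_drv_init(void)
{
	atomic_flag_clear(&blob_dev.file_lock);
	blob_dev.removing = false;
}

/* Analogue of probe/connect_to_vsp: resets per-bind state, not the lock. */
static void blob_probe(void)
{
	blob_dev.removing = false;
}

/* Analogue of az_blob_remove(): marks the device as going away. */
static void blob_remove(void)
{
	blob_dev.removing = true;
}

/* Returns true if the lock was free and is now held by the caller. */
static bool blob_trylock(void)
{
	return !atomic_flag_test_and_set_explicit(&blob_dev.file_lock,
						  memory_order_acquire);
}

/* Releases the lock taken by a successful blob_trylock(). */
static void blob_unlock(void)
{
	atomic_flag_clear_explicit(&blob_dev.file_lock, memory_order_release);
}
```

If a caller runs blob_drv_init(), takes the lock with blob_trylock(),
and then blob_remove() and blob_probe() run back to back, a second
blob_trylock() still fails until blob_unlock() is called.  Had
blob_probe() cleared the flag, that second attempt would succeed while
the first holder was still inside its critical section, which is the
situation in Step #7.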

> 
> This is very rare, but I agree that such things happen and we should make
> sure the driver can handle them. I'll update the driver.

Sounds good.

Michael