Re: [Patch v3 2/3] Drivers: hv: add Azure Blob driver

Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> · Fri, 16 Jul 2021 19:28:06 +0200

On Fri, Jul 16, 2021 at 03:56:14PM +0000, Michael Kelley wrote:
> > +static int az_blob_remove(struct hv_device *dev)
> > +{
> > +	az_blob_dev.removing = true;
> > +	/*
> > +	 * RCU lock guarantees that any future calls to az_blob_fop_open()
> > +	 * can not use device resources while the inode reference of
> > +	 * /dev/azure_blob is being held for that open, and device file is
> > +	 * being removed from /dev.
> > +	 */
> > +	synchronize_rcu();
> 
> I don't think this works as you have described.  While it will ensure that
> any in-progress RCU read-side critical sections have completed (i.e.,
> in az_blob_fop_open() ), it does not prevent new read-side critical sections
> from being started.  So as soon as synchronize_rcu() is finished, another
> thread could find and open the device, and be executing in
> az_blob_fop_open().
> 
> But it's not clear to me that this (and the rcu_read_locks in az_blob_fop_open)
> are really needed anyway.  If az_blob_remove_device() is called while one or more
> threads have it open, I think that's OK.  Or is there a scenario that I'm missing?

This should not be different from any other tiny character device, why
the mess with RCU at all?

> > +	az_blob_info("removing device\n");
> > +	az_blob_remove_device();
> > +
> > +	/*
> > +	 * At this point, it's not possible to open more files.
> > +	 * Wait for all the opened files to be released.
> > +	 */
> > +	wait_event(az_blob_dev.file_wait, list_empty(&az_blob_dev.file_list));
> 
> As mentioned in my most recent comments on the previous version of the
> code, I'm concerned about waiting for all open files to be released in the case
> of a VMbus rescind.  We definitely have to wait for all VSP operations to
> complete, but that's different from waiting for the files to be closed.  The former
> depends on Hyper-V being well-behaved and will presumably happen quickly
> in the case of a rescind.  But the latter depends entirely on user space code
> that is out of the kernel's control.  The user space process could hang around
> for hours or days with the file still open, which would block this function
> from completing, and hence block the global work queue.
> 
> Thinking about this, the core issue may be that having a single static
> instance of az_blob_device is problematic if we allow the device to be
> removed (either explicitly with an unbind, or implicitly with a VMbus
> rescind) and then re-added.  Perhaps what needs to happen is that
> the removed device is allowed to continue to exist as long as user
> space processes have an open file handle, but they can't perform
> any operations because the "removing" flag is set (and stays set).
> If the device is re-added, then a new instance of az_blob_device
> should be created, and whether or not the old instance is still
> hanging around is irrelevant.

You should never have a single static copy of the device, that was going
to be my first review comment once this all actually got to a place that
made sense to review (which it is not even there yet.)  When you do
that, then you have these crazy race issues you speak of.  Use the misc
api correctly and you will not have any of these problems, why people
try to make it harder is beyond me...

thanks,

greg k-h