Re: [PATCH] nbd: add a flush_workqueue in nbd_start_device

Jens Axboe <axboe@xxxxxxxxx> · Tue, 21 Jan 2020 14:25:17 -0700

On 1/21/20 7:00 AM, Josef Bacik wrote:
> On 1/21/20 7:48 AM, Sun Ke wrote:
>> When a kzalloc() in nbd_start_device() fails, a recv thread may
>> end up trying to destroy the workqueue from inside the workqueue.
>>
>> Suppose num_connections is m (2 < m), and allocations NO.1 ~ NO.n
>> (1 < n < m) succeed but allocation NO.(n + 1) fails. Then
>> nbd_start_device() returns -ENOMEM to nbd_start_device_ioctl(),
>> and nbd_start_device_ioctl() returns immediately without running
>> flush_workqueue(). However, n recv threads are still running. If
>> nbd_release() runs first, a recv thread may end up dropping the
>> last config_refs and trying to destroy the workqueue from inside
>> the workqueue.
>>
>> To fix it, add a flush_workqueue() call in nbd_start_device().
>>
>> Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")
>> Signed-off-by: Sun Ke <sunke32@xxxxxxxxxx>
>> ---
>>   drivers/block/nbd.c | 7 ++++++-
>>   1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>> index b4607dd96185..dd1f8c2c6169 100644
>> --- a/drivers/block/nbd.c
>> +++ b/drivers/block/nbd.c
>> @@ -1264,7 +1264,12 @@ static int nbd_start_device(struct nbd_device *nbd)
>>   
>>   		args = kzalloc(sizeof(*args), GFP_KERNEL);
>>   		if (!args) {
>> -			sock_shutdown(nbd);
>> +			if (i == 0)
>> +				sock_shutdown(nbd);
>> +			else {
>> +				sock_shutdown(nbd);
>> +				flush_workqueue(nbd->recv_workq);
>> +			}
> 
> Just for readability's sake, why don't we just flush_workqueue()
> unconditionally, and add a comment so we know why in the future?

Or maybe just make it:

	sock_shutdown(nbd);
	if (i)
		flush_workqueue(nbd->recv_workq);

which does the same thing, but is still readable. The current code with
the shutdown duplication is just a bit odd. Needs a comment either way.
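
For illustration, here is a minimal userspace sketch of that error path
with a comment attached. It is not the kernel code: struct fake_nbd and
the fake_* helpers are hypothetical stand-ins that only count calls, and
the fail_at parameter is an assumption used to simulate the failing
kzalloc().

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for struct nbd_device; it only counts calls. */
struct fake_nbd {
	int shutdown_calls;	/* times the sock_shutdown() stand-in ran */
	int flush_calls;	/* times the flush_workqueue() stand-in ran */
	int queued;		/* recv work items queued so far */
};

static void fake_sock_shutdown(struct fake_nbd *nbd)
{
	nbd->shutdown_calls++;
}

static void fake_flush_workqueue(struct fake_nbd *nbd)
{
	nbd->flush_calls++;
}

/*
 * Simulates the per-connection loop. The allocation for connection
 * index fail_at fails; pass a negative fail_at for "never fails".
 * Returns 0 on success or -ENOMEM on the simulated failure.
 */
static int fake_start_device(struct fake_nbd *nbd, int num_connections,
			     int fail_at)
{
	int i;

	for (i = 0; i < num_connections; i++) {
		if (i == fail_at) {
			fake_sock_shutdown(nbd);
			/*
			 * If i > 0, recv work was already queued for
			 * connections 0..i-1. Wait for it to finish here
			 * so that a worker cannot drop the last config
			 * reference and destroy the workqueue from
			 * inside the workqueue.
			 */
			if (i)
				fake_flush_workqueue(nbd);
			return -ENOMEM;
		}
		nbd->queued++;	/* stands in for queueing the recv work */
	}
	return 0;
}
```

A failure on the first connection shuts the sockets down without
flushing, since no work has been queued yet; a failure on any later
connection does both.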

-- 
Jens Axboe