Re: [BUG report] WARNING of sysfs in __blk_mq_update_nr_hw_queues()

Nilay Shroff <nilay@xxxxxxxxxxxxx> · Fri, 28 Feb 2025 12:43:08 +0530



On 2/28/25 7:52 AM, Li Nan wrote:
> Hi,
> 
> In __blk_mq_update_nr_hw_queues(), we don't check the return value of
> blk_mq_sysfs_register_hctxs(). When sysfs creation fails, there's no
> proper error handling. This leads to a kernel warning during subsequent
> __blk_mq_update_nr_hw_queues() calls or disk removal:
> 
> ```
> kernfs: can not remove 'nr_tags', no directory
> WARNING: CPU: 2 PID: 805 at fs/kernfs/dir.c:1703 kernfs_remove_by_name_ns+0x12e/0x140
> Call Trace:
>  <TASK>
>  remove_files+0x39/0xb0
>  sysfs_remove_group+0x48/0xf0
>  sysfs_remove_groups+0x31/0x60
>  __kobject_del+0x23/0xf0
>  kobject_del+0x17/0x40
>  blk_mq_unregister_hctx+0x5d/0x80
>  blk_mq_sysfs_unregister_hctxs+0x89/0xd0
>  blk_mq_update_nr_hw_queues+0x31c/0x820
>  nullb_update_nr_hw_queues+0x71/0xe0 [null_blk]
>  nullb_device_submit_queues_store+0xa4/0x130 [null_blk]
> ```
> 
> Should we add error checking for blk_mq_sysfs_register_hctxs() and
> propagate the error to abort the update operation when it fails? This
> would prevent subsequent operations from hitting invalid sysfs entries.
> 
IMO, yes, error checking should be added here. However, it will be tricky
to undo everything, as the error might happen deep inside the loop. We
need to carefully delete all sysfs objects already added under each
hctx->kobj. BTW, typically we don't abort the nr_hw_queues update operation
but instead fall back to the previously configured number of hw queues.
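
To illustrate the unwind-and-fall-back pattern, here is a minimal userspace
sketch. It is not kernel code and not a proposed patch: every mock_* name,
the fail_at fault-injection field and MAX_QUEUES are hypothetical, and a
bool flag stands in for the per-hctx sysfs objects.

```c
#include <stdbool.h>

#define MAX_QUEUES 8	/* hypothetical upper bound for this sketch */

/* Hypothetical stand-in for a hw queue context and its sysfs entry. */
struct mock_hctx {
	bool registered;
};

struct mock_queue {
	struct mock_hctx hctx[MAX_QUEUES];
	int nr_hw_queues;
	int fail_at;	/* fault injection: index that fails, -1 for none */
};

static int mock_register_hctx(struct mock_queue *q, int i)
{
	if (i == q->fail_at)
		return -1;	/* simulated sysfs creation failure */
	q->hctx[i].registered = true;
	return 0;
}

static void mock_unregister_hctx(struct mock_queue *q, int i)
{
	q->hctx[i].registered = false;
}

/* Register entries 0..nr-1; on failure remove only those already added. */
static int mock_register_hctxs(struct mock_queue *q, int nr)
{
	int i, ret;

	for (i = 0; i < nr; i++) {
		ret = mock_register_hctx(q, i);
		if (ret)
			goto unwind;
	}
	return 0;

unwind:
	/* Entry i itself failed, so start unwinding from i - 1. */
	while (--i >= 0)
		mock_unregister_hctx(q, i);
	return ret;
}

/* Try the new count; on failure fall back to the previous count. */
static int mock_update_nr_hw_queues(struct mock_queue *q, int new_nr)
{
	int prev = q->nr_hw_queues;
	int i, ret;

	if (new_nr < 1 || new_nr > MAX_QUEUES)
		return -1;

	/* Drop the entries that belong to the old configuration. */
	for (i = 0; i < prev; i++)
		mock_unregister_hctx(q, i);

	q->nr_hw_queues = new_nr;
	ret = mock_register_hctxs(q, new_nr);
	if (!ret)
		return 0;

	/* Nothing is left registered here; restore the old count. */
	q->nr_hw_queues = prev;
	if (mock_register_hctxs(q, prev))
		q->nr_hw_queues = 0;	/* fallback failed as well */
	return ret;
}
```

With two queues configured and fail_at set to 3, a call to
mock_update_nr_hw_queues(&q, 4) returns an error and leaves the queue with
its two original entries registered and no stale entries behind, so a later
update or removal never touches a missing entry.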

Thanks,
--Nilay