On Wed, 2018-01-17 at 09:23 +0800, Ming Lei wrote: > On Wed, Jan 17, 2018 at 8:03 AM, Bart Van Assche <Bart.VanAssche@xxxxxxx> wrote: > > On Tue, 2018-01-16 at 17:32 -0500, Mike Snitzer wrote: > > > Therefore it seems to me that all queue_attr_{show,store} are racey vs > > > blk_unregister_queue() removing the 'queue' kobject. > > > > > > And it was just that __elevator_change() was myopicly fixed to address > > > the race whereas a more generic solution was/is needed. But short of > > > that more generic fix your change will reintroduce the potential for > > > hitting the issue that commit e9a823fb34a8b fixed. > > > > > > In that light, think it best to leave blk_unregister_queue()'s > > > mutex_lock() above the QUEUE_FLAG_REGISTERED clearing _and_ update > > > queue_attr_{show,store} to test for QUEUE_FLAG_REGISTERED while holding > > > sysfs_lock. > > > > > > Then remove the unicorn test_bit for QUEUE_FLAG_REGISTERED from > > > __elevator_change(). > > > > Thanks Mike for the feedback. However, I think a simpler approach exists than > > what has been described above, namely by unregistering the sysfs attributes > > earlier. How about the patch below? > > > > [PATCH] block: Protect less code with sysfs_lock in blk_{un,}register_queue() > > --- > > block/blk-sysfs.c | 39 ++++++++++++++++++++++++++------------- > > block/elevator.c | 4 ---- > > 2 files changed, 26 insertions(+), 17 deletions(-) > > > > diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c > > index 4a6a40ffd78e..ce32366c6db7 100644 > > --- a/block/blk-sysfs.c > > +++ b/block/blk-sysfs.c > > @@ -853,6 +853,10 @@ struct kobj_type blk_queue_ktype = { > > .release = blk_release_queue, > > }; > > > > +/** > > + * blk_register_queue - register a block layer queue with sysfs > > + * @disk: Disk of which the request queue should be registered with sysfs. > > + */ > > int blk_register_queue(struct gendisk *disk) > > { > > int ret; > > @@ -909,11 +913,12 @@ int blk_register_queue(struct gendisk *disk) > > if (q->request_fn || (q->mq_ops && q->elevator)) { > > ret = elv_register_queue(q); > > if (ret) { > > + mutex_unlock(&q->sysfs_lock); > > kobject_uevent(&q->kobj, KOBJ_REMOVE); > > kobject_del(&q->kobj); > > blk_trace_remove_sysfs(dev); > > kobject_put(&dev->kobj); > > - goto unlock; > > + return ret; > > } > > } > > ret = 0; > > @@ -923,6 +928,13 @@ int blk_register_queue(struct gendisk *disk) > > } > > EXPORT_SYMBOL_GPL(blk_register_queue); > > > > +/** > > + * blk_unregister_queue - counterpart of blk_register_queue() > > + * @disk: Disk of which the request queue should be unregistered from sysfs. > > + * > > + * Note: the caller is responsible for guaranteeing that this function is called > > + * after blk_register_queue() has finished. > > + */ > > void blk_unregister_queue(struct gendisk *disk) > > { > > struct request_queue *q = disk->queue; > > @@ -934,28 +946,29 @@ void blk_unregister_queue(struct gendisk *disk) > > if (!test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags)) > > return; > > > > - /* > > - * Protect against the 'queue' kobj being accessed > > - * while/after it is removed. > > - */ > > - mutex_lock(&q->sysfs_lock); > > - > > spin_lock_irq(q->queue_lock); > > queue_flag_clear(QUEUE_FLAG_REGISTERED, q); > > spin_unlock_irq(q->queue_lock); > > > > - wbt_exit(q); > > - > > + /* > > + * Remove the sysfs attributes before unregistering the queue data > > + * structures that can be modified through sysfs. > > + */ > > + mutex_lock(&q->sysfs_lock); > > if (q->mq_ops) > > blk_mq_unregister_dev(disk_to_dev(disk), q); > > - > > - if (q->request_fn || (q->mq_ops && q->elevator)) > > - elv_unregister_queue(q); > > + mutex_unlock(&q->sysfs_lock); > > > > kobject_uevent(&q->kobj, KOBJ_REMOVE); > > kobject_del(&q->kobj); > > elevator switch still can come just after the above line code is completed, > so the previous warning addressed in e9a823fb34a8b can be triggered > again. > > > blk_trace_remove_sysfs(disk_to_dev(disk)); > > - kobject_put(&disk_to_dev(disk)->kobj); > > > > + wbt_exit(q); > > + > > + mutex_lock(&q->sysfs_lock); > > + if (q->request_fn || (q->mq_ops && q->elevator)) > > + elv_unregister_queue(q); > > mutex_unlock(&q->sysfs_lock); > > + > > + kobject_put(&disk_to_dev(disk)->kobj); > > } > > diff --git a/block/elevator.c b/block/elevator.c > > index e87e9b43aba0..4b7957b28a99 100644 > > --- a/block/elevator.c > > +++ b/block/elevator.c > > @@ -1080,10 +1080,6 @@ static int __elevator_change(struct request_queue *q, const char *name) > > char elevator_name[ELV_NAME_MAX]; > > struct elevator_type *e; > > > > - /* Make sure queue is not in the middle of being removed */ > > - if (!test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags)) > > - return -ENOENT; > > - > > The above check shouldn't be removed, as I explained above. Hello Ming, Sorry but I think what you wrote is wrong. kobject_del(&q->kobj) waits until all ongoing sysfs callback methods, including elv_iosched_store(), have finished and prevents that any new elv_iosched_store() calls start. That is why I think the above changes do not reintroduce the race fixed by commit e9a823fb34a8 ("block: fix warning when I/O elevator is changed as request_queue is being removed"). Bart.