On Fri, Aug 23, 2019 at 09:46:48AM -0700, Bart Van Assche wrote: > On 8/21/19 2:15 AM, Ming Lei wrote: > > @@ -966,7 +966,7 @@ int blk_register_queue(struct gendisk *disk) > > return ret; > > /* Prevent changes through sysfs until registration is completed. */ > > - mutex_lock(&q->sysfs_lock); > > + mutex_lock(&q->sysfs_dir_lock); > > ret = kobject_add(&q->kobj, kobject_get(&dev->kobj), "%s", "queue"); > > if (ret < 0) { > > @@ -987,26 +987,37 @@ int blk_register_queue(struct gendisk *disk) > > blk_mq_debugfs_register(q); > > } > > - kobject_uevent(&q->kobj, KOBJ_ADD); > > - > > - wbt_enable_default(q); > > - > > - blk_throtl_register_queue(q); > > - > > + /* > > + * The queue's kobject ADD uevent isn't sent out, also the > > + * flag of QUEUE_FLAG_REGISTERED isn't set yet, so elevator > > + * switch won't happen at all. > > + */ > > if (q->elevator) { > > - ret = elv_register_queue(q); > > + ret = elv_register_queue(q, false); > > if (ret) { > > The above changes seems risky to me. In contrast with what the comment > suggests, user space code is not required to wait for KOBJ_ADD event to > start using sysfs attributes. I think user space code *can* write into the > request queue I/O scheduler sysfs attribute after the kobject_add() call has > finished and before kobject_uevent(&q->kobj, KOBJ_ADD) is called. Yeah, one crazy userspace may simply poll on sysfs entres and start to READ/WRITE before seeing the KOBJ_ADD event. However, we have another protection via the queue flag QUEUE_FLAG_REGISTERED, which is set after everything is done. So if userspace's early store comes, elevator switch still can't happen because the flag is checked in __elevator_change(). thanks, Ming