Re: [PATCH 2/2] block: avoid acquiring q->sysfs_lock while accessing sysfs attributes

On 2/6/25 7:37 PM, Christoph Hellwig wrote:
> On Thu, Feb 06, 2025 at 07:24:02PM +0530, Nilay Shroff wrote:
>> Yes, that's possible, and so I audited all sysfs attributes which are
>> currently protected using q->sysfs_lock, and I found some interesting
>> facts. Please find them below:
>>
>> 1. io_poll:
>>    Write to this attribute is ignored. So, we don't need q->sysfs_lock.
>>
>> 2. io_poll_delay:
>>    Write to this attribute is NOP, so we don't need q->sysfs_lock.
> 
> Yes, those are easy.
> 
>> 3. io_timeout:
>>    Write to this attribute updates q->rq_timeout and read of this attribute
>>    returns the value stored in q->rq_timeout. Moreover, q->rq_timeout is
>>    set only once when we init the queue (under blk_mq_init_allocated_queue())
>>    even before the disk is added. So that means we may not need to protect
>>    it with q->sysfs_lock.
> 
> Are we sure blk_queue_rq_timeout is never called from anything but
> probe?  Either way, given that this is a limit that isn't directly
> correlated with anything else, simply using WRITE_ONCE/READ_ONCE might
> be enough.
I just grepped the source code again and confirmed that blk_queue_rq_timeout
is mostly called from the driver probe method (barring the nbd driver, which
may set q->rq_timeout using an ioctl). And yes, agreed, q->rq_timeout can be
accessed using WRITE_ONCE/READ_ONCE.
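For reference, the lockless pattern would look roughly like this. This is a
userspace sketch, not the kernel code: the two macros are stand-ins for the
kernel's, and the struct/function names are invented for illustration:

```c
/* Userspace stand-ins for the kernel's READ_ONCE/WRITE_ONCE macros, which
 * force a single, untorn access through a volatile cast. */
#define READ_ONCE(x)     (*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) do { *(volatile __typeof__(x) *)&(x) = (v); } while (0)

/* Illustrative stand-in for the rq_timeout field of struct request_queue. */
struct request_queue_sketch {
	unsigned int rq_timeout;
};

/* store: a single WRITE_ONCE publishes the new value; no lock is taken */
static void io_timeout_store_sketch(struct request_queue_sketch *q,
				    unsigned int val)
{
	WRITE_ONCE(q->rq_timeout, val);
}

/* show: the paired READ_ONCE sees either the old or the new value,
 * never a torn mix of the two */
static unsigned int io_timeout_show_sketch(struct request_queue_sketch *q)
{
	return READ_ONCE(q->rq_timeout);
}
```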

>>
>> 4. nomerges:
>>    Write to this attribute file updates two queue flags: QUEUE_FLAG_NOMERGES
>>    and QUEUE_FLAG_NOXMERGES. These flags are accessed during bio merging, which
>>    doesn't run with q->sysfs_lock held anyway. Moreover, the queue flags are
>>    updated/accessed with bitops, which are atomic. So I believe protecting
>>    them with q->sysfs_lock is not necessary.
> 
> Yes.
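To illustrate why the bitops make the lock unnecessary, here is a userspace
sketch; the flag values, the value-to-flag mapping, and all names are
illustrative stand-ins built on GCC's __atomic builtins, not the kernel's
implementation:

```c
/* Illustrative flag bit numbers, not the kernel's actual values. */
#define QUEUE_FLAG_NOMERGES  3UL
#define QUEUE_FLAG_NOXMERGES 4UL

/* Userspace approximations of the kernel's atomic set_bit/clear_bit/test_bit */
static void sketch_set_bit(unsigned long nr, unsigned long *addr)
{
	__atomic_fetch_or(addr, 1UL << nr, __ATOMIC_RELAXED);
}

static void sketch_clear_bit(unsigned long nr, unsigned long *addr)
{
	__atomic_fetch_and(addr, ~(1UL << nr), __ATOMIC_RELAXED);
}

static int sketch_test_bit(unsigned long nr, const unsigned long *addr)
{
	return (__atomic_load_n(addr, __ATOMIC_RELAXED) >> nr) & 1UL;
}

/* The sysfs store path flips the flags atomically; the bio-merge path only
 * test_bit()s them, so neither side needs q->sysfs_lock. */
static void nomerges_store_sketch(unsigned long *flags, unsigned long val)
{
	if (val >= 2)
		sketch_set_bit(QUEUE_FLAG_NOXMERGES, flags);
	else
		sketch_clear_bit(QUEUE_FLAG_NOXMERGES, flags);
	if (val >= 1)
		sketch_set_bit(QUEUE_FLAG_NOMERGES, flags);
	else
		sketch_clear_bit(QUEUE_FLAG_NOMERGES, flags);
}
```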
> 
>> 5. nr_requests:
>>    Write to this attribute updates the tag sets, and this could potentially
>>    race with __blk_mq_update_nr_hw_queues(). So I think we should really
>>    protect it with q->tag_set->tag_list_lock instead of q->sysfs_lock.
> 
> Yeah.
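The idea would look roughly like this userspace sketch, where both paths
serialize on the same lock; the structures and names are illustrative
stand-ins, with a pthread mutex in place of the kernel mutex:

```c
#include <pthread.h>

/* Stand-in for the shared tag set; a pthread mutex plays the role of
 * the kernel's tag_list_lock mutex. */
struct tag_set_sketch {
	pthread_mutex_t tag_list_lock;
	unsigned int queue_depth;
};

/* The sysfs nr_requests store takes tag_list_lock around the resize... */
static void nr_requests_store_sketch(struct tag_set_sketch *set,
				     unsigned int nr)
{
	pthread_mutex_lock(&set->tag_list_lock);
	set->queue_depth = nr;	/* stands in for resizing the tag sets */
	pthread_mutex_unlock(&set->tag_list_lock);
}

/* ...so it cannot run concurrently with the hw-queue update, which takes
 * the same lock. */
static void update_nr_hw_queues_sketch(struct tag_set_sketch *set)
{
	pthread_mutex_lock(&set->tag_list_lock);
	/* remap/resize hardware queues; excluded from the store above */
	pthread_mutex_unlock(&set->tag_list_lock);
}
```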
> 
>> 6. read_ahead_kb:
>>    Write to this attribute file updates disk->bdi->ra_pages. The disk->bdi->
>>    ra_pages is also updated under queue_limits_commit_update(), which runs
>>    holding q->limits_lock; so I think this attribute file should be protected
>>    with q->limits_lock, and protecting it with q->sysfs_lock is not necessary.
>>    Maybe we should move it into the same set of attribute files which today
>>    run with q->limits_lock held.
> 
> Yes, limits_lock sounds sensible here.
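Sketched in userspace terms (a pthread mutex in place of the kernel mutex;
the names and the 4 KiB page-size assumption are illustrative, not the
kernel's actual definitions):

```c
#include <pthread.h>

/* Stand-in: limits_lock is the same lock queue_limits_commit_update()
 * runs under, so the two writers of ra_pages serialize on it. */
struct queue_sketch {
	pthread_mutex_t limits_lock;
	unsigned long ra_pages;	/* stands in for disk->bdi->ra_pages */
};

static void read_ahead_kb_store_sketch(struct queue_sketch *q,
				       unsigned long ra_kb)
{
	pthread_mutex_lock(&q->limits_lock);
	/* convert kilobytes to page units, assuming 4 KiB pages here */
	q->ra_pages = ra_kb >> 2;
	pthread_mutex_unlock(&q->limits_lock);
}
```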
> 
>> 7. rq_affinity:
>>    Write to this attribute file makes atomic updates to the queue flags
>>    QUEUE_FLAG_SAME_COMP and QUEUE_FLAG_SAME_FORCE. These flags are also
>>    accessed from blk_mq_complete_need_ipi() using the test_bit macro. As
>>    reads/writes to the queue flags use bitops, which are atomic, protecting
>>    them with q->sysfs_lock is not necessary.
> 
> Agreed.  Although updating both flags isn't atomic, that should be
> harmless in this case (but could use a comment about that).
Sure, I will add a comment about this.
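Something along these lines, with the comment spelled out; the flag bit
values and helper names are illustrative userspace stand-ins built on GCC's
__atomic builtins, not the kernel's code:

```c
/* Illustrative flag bit numbers, not the kernel's actual values. */
#define QUEUE_FLAG_SAME_COMP  5UL
#define QUEUE_FLAG_SAME_FORCE 6UL

static void bit_set(unsigned long nr, unsigned long *addr)
{
	__atomic_fetch_or(addr, 1UL << nr, __ATOMIC_RELAXED);
}

static void bit_clear(unsigned long nr, unsigned long *addr)
{
	__atomic_fetch_and(addr, ~(1UL << nr), __ATOMIC_RELAXED);
}

static void rq_affinity_store_sketch(unsigned long *flags, unsigned long val)
{
	/*
	 * Each set/clear below is atomic, but the pair is not: a reader
	 * such as blk_mq_complete_need_ipi() may observe one flag already
	 * updated and the other not yet. That transient mixed state only
	 * affects which CPU completes a request, so it is harmless.
	 */
	if (val == 2) {
		bit_set(QUEUE_FLAG_SAME_COMP, flags);
		bit_set(QUEUE_FLAG_SAME_FORCE, flags);
	} else if (val == 1) {
		bit_set(QUEUE_FLAG_SAME_COMP, flags);
		bit_clear(QUEUE_FLAG_SAME_FORCE, flags);
	} else {
		bit_clear(QUEUE_FLAG_SAME_COMP, flags);
		bit_clear(QUEUE_FLAG_SAME_FORCE, flags);
	}
}
```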
> 
>> 8. scheduler:
>>    Write to this attribute actually updates q->elevator, and the elevator change/switch
>>    code expects that q->sysfs_lock is held while we update the iosched, to protect
>>    against a simultaneous __blk_mq_update_nr_hw_queues() update. So yes, this field needs
>>    q->sysfs_lock protection.
>>
>>    However, if we're thinking of protecting the sched change/update using q->tag_set->
>>    tag_list_lock (as discussed in another thread), then we may use q->tag_set->
>>    tag_list_lock instead of q->sysfs_lock while reading/writing this attribute
>>    file.
> 
> Yes.
> 
>> 9. wbt_lat_usec:
>>    Write to this attribute file updates the various wbt limits and state. This may race
>>    with blk_mq_exit_sched() or blk_register_queue(). The wbt updates through
>>    blk_mq_exit_sched() and blk_register_queue() are currently protected with
>>    q->sysfs_lock, and so yes, we need to protect this attribute with q->sysfs_lock.
>>
>>    However, as mentioned above, if we're thinking of protecting the elevator change/update
>>    using q->tag_set->tag_list_lock, then we may use q->tag_set->tag_list_lock instead of
>>    q->sysfs_lock while reading/writing this attribute file.
> 
> Yes.
> 
>> The rest of the attributes don't seem to require protection from q->sysfs_lock or any
>> other lock; those could be well protected by the sysfs/kernfs internal lock.
> 
> So I think the first step here is to remove the locking from
> queue_attr_store/queue_attr_show and move it into the attributes that
> still need it in some form, followed by replacing it with the more
> suitable locks as defined above.  And maybe with a little bit more
> work we can remove sysfs_lock entirely eventually.
Yes, this sounds like a good plan!

Thank you, Christoph, for your review! And as you suggested in another thread,
I'd wait some more time for Jens and Ming, in case they have any further
comments about this change.

Thanks,
--Nilay
