On Wed, Jan 15, 2025 at 08:05:21PM +0800, Jinliang Zheng wrote:
> On Wed, 15 Jan 2025 11:28:54 +1100, Dave Chinner wrote:
> > On Fri, Dec 20, 2024 at 01:16:29AM +0800, Jinliang Zheng wrote:
> > > xfs_buf uses a semaphore for mutual exclusion, and its count value
> > > is initialized to 1, which is equivalent to a mutex.
> > >
> > > However, mutex->owner can provide more information when analyzing
> > > vmcore, making it easier for us to identify which task currently
> > > holds the lock.
> >
> > However, the buffer lock also protects the buffer state and contents
> > whilst IO is being performed, and it *is not owned by any task*.
> >
> > A single lock cycle for a buffer can pass through multiple tasks
> > before being unlocked in a different task to that which locked it:
> >
> > p0                        <intr>                    <kworker>
> > xfs_buf_lock()
> > ...
> > <submitted for async io>
> > <wait for IO completion>
> > .....
> >                           <io completion>
> >                           queued to workqueue
> >                                                     .....
> >                                                     perform IO completion
> >                                                     xfs_buf_unlock()
> >
> >
> > IOWs, the buffer lock here prevents any other task from accessing
> > and modifying the contents/state of the buffer until the IO in
> > flight is completed. i.e. the buffer contents are guaranteed to be
> > stable during write IO, and unreadable when uninitialised during
> > read IO....
>
> Yes.
>
> > i.e. the locking model used by xfs_buf objects is incompatible with
> > the single-owner-task critical section model implemented by
> > mutexes...
>
> Yes, from a model perspective.
>
> This patch is proposed for two reasons:
> 1. The maximum count of xfs_buf->b_sema is 1, which means that only one
>    kernel code path can hold it at a time. From this perspective, changing
>    it to a mutex will not have any functional impact.
> 2. When troubleshooting an XFS hung task, it is sometimes necessary to
>    identify who acquired the lock. Although, as you said, xfs_buf->b_sema
>    flows to other kernel code paths after down() is called, it is still
>    helpful to know which kernel code path locked it first.
>
> Haha, that's just my thought. If you think there is really no need to know
> who called down() on xfs_buf->b_sema, please just ignore this patch.

We are rejecting the patch because it's fundamentally broken, not
because we don't want debugging visibility.

If you want to track which task locked a semaphore, then that should
be added to the semaphore implementation. Changing the XFS locking
implementation is not the solution to the problem you are trying to
solve.

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
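
As a minimal illustration of the ownership difference discussed in the thread
above, here is a userspace analogue (not the XFS or kernel code; the thread
and variable names are invented for the example). A POSIX semaphore, like the
kernel's struct semaphore, has no owner and may be released by a different
thread than the one that acquired it, whereas an error-checking pthread mutex,
like the kernel mutex's single-owner model, rejects an unlock from a non-owner:

/*
 * Userspace analogue of the cross-task lock cycle described above.
 * Build with: cc -pthread lock_ownership.c
 *
 * "main" plays the role of p0 taking the buffer lock before async IO;
 * "completion_thread" plays the role of the IO completion workqueue
 * releasing it from a different task.
 */
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <string.h>

static sem_t buf_sema;              /* analogue of xfs_buf->b_sema        */
static pthread_mutex_t buf_mutex;   /* hypothetical mutex replacement     */

static void *completion_thread(void *arg)
{
	(void)arg;

	/* Releasing a semaphore from a non-acquiring thread is well defined. */
	sem_post(&buf_sema);

	/* An error-checking mutex refuses unlock by a non-owner (EPERM). */
	int err = pthread_mutex_unlock(&buf_mutex);
	printf("mutex unlock by non-owner: %s\n",
	       err ? strerror(err) : "ok");
	return NULL;
}

int main(void)
{
	pthread_mutexattr_t attr;
	pthread_t t;

	sem_init(&buf_sema, 0, 1);
	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&buf_mutex, &attr);

	/* "p0" takes both locks before handing the buffer off for async IO. */
	sem_wait(&buf_sema);
	pthread_mutex_lock(&buf_mutex);

	/* "IO completion" runs in a different thread and releases them. */
	pthread_create(&t, NULL, completion_thread, NULL);
	pthread_join(t, NULL);
	return 0;
}

With a normal (non-error-checking) pthread mutex the same cross-thread unlock
is simply undefined behaviour rather than a reported error, which mirrors the
constraint that makes a kernel mutex unsuitable for xfs_buf's lock/IO-completion
hand-off pattern.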