On 2/19/25 9:59 PM, Masami Hiramatsu (Google) wrote:
On Wed, 19 Feb 2025 21:15:08 -0500
Waiman Long <llong@xxxxxxxxxx> wrote:
On 2/19/25 8:41 PM, Steven Rostedt wrote:
On Wed, 19 Feb 2025 20:36:13 -0500
Waiman Long <llong@xxxxxxxxxx> wrote:
this field, we don't need to take the lock, though taking the wait_lock may
still be needed to examine other information inside the mutex.
Do we need to take it just for accessing owner, which is in an atomic?
Right. I forgot it is an atomic_long_t. In that case, no lock should be
needed.
Now if we have two fields to read:
block_flags (for the type of lock) and blocked_on (for the lock)
We need a way to synchronize the two. What happens if we read the type, and
the task wakes up and then blocks on a different type of lock?
Then the lock read from blocked_on could be a different type of lock than
what is expected.
That is different from reading the owner. In this case, we need to use
smp_rmb()/smp_wmb() to sequence the read and write operations unless it is
guaranteed that they are in the same cacheline. One possible way is as
follows:
Writer - setting them:

	WRITE_ONCE(lock, ...)
	smp_wmb()
	WRITE_ONCE(type, ...)

Clearing them:

	WRITE_ONCE(type, 0)
	smp_wmb()
	WRITE_ONCE(lock, NULL)

Reader:

again:
	type = READ_ONCE(type)
	smp_rmb()
	lock = READ_ONCE(lock)
	smp_rmb()
	if (READ_ONCE(type) != type)
		goto again

(Note the retry label must come before the first type read, so that a
retry picks up a fresh type rather than spinning on a stale one.)
What about the mutex-rwsem-mutex case?

	mutex_lock(&lock1);
	down_read(&lock2);
	mutex_lock(&lock3);
The worst-case scenario is:

  Blocked task                          hung_task reader
  ------------                          ----------------
  WRITE_ONCE(lock, &lock1)
  smp_wmb()
  WRITE_ONCE(type, MUTEX)               READ_ONCE(type) -> MUTEX
  WRITE_ONCE(type, 0)
  smp_wmb()
  WRITE_ONCE(lock, NULL)
  WRITE_ONCE(lock, &lock2)              READ_ONCE(lock) -> &lock2
  smp_wmb()
  WRITE_ONCE(type, RWSEM)
  WRITE_ONCE(type, 0)
  smp_wmb()
  WRITE_ONCE(lock, NULL)
  WRITE_ONCE(lock, &lock3)
  smp_wmb()
  WRITE_ONCE(type, MUTEX)               READ_ONCE(type) -> MUTEX == MUTEX

                                        "OK, lock2 is a MUTEX!"
So unless we stop the blocked task, we cannot ensure this works.
But unless we decode the lock, we don't know the blocker task.
That could only happen if the reader gets interrupted/preempted for a
long time. In that case, we may need to reread the lock again to be sure
that it is stable.
Cheers,
Longman
Maybe we can run the hung_task check in stop_machine()?
(Or introduce a common_lock.)
Thank you,