On 2023-01-18 17:31:30 [+0000], Mel Gorman wrote:
> > If we drop that "we prefer the RT reader" then it would block on the
> > RTmutex. It will _still_ be preferred over the writer because it will be
> > enqueued before the writer in the queue due to its RT priority. The only
> > downside is that it has to wait until all readers are left.
>
> The writer has to wait until all the readers have left anyway.

I meant the READER in case it has RT priority. It will enqueue itself on
the RTmutex, first in line, and wait until all other READERs leave.

> If I understand you correctly, the patch becomes this;

exactly.

> --8<--
…
> This patch records a timestamp when the first writer is blocked. DT /

s/DT/DL

> RT tasks can continue to take the lock for read as long as readers exist
> indefinitely. Other readers can acquire the read lock unless a writer
> has been blocked for a minimum of 4ms. This is sufficient to allow the
> dio_truncate test case to complete within the 30 minutes timeout.
>
> Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> ---
…
> diff --git a/kernel/locking/rwbase_rt.c b/kernel/locking/rwbase_rt.c
> index c201aadb9301..84c5e4e4d25b 100644
> --- a/kernel/locking/rwbase_rt.c
> +++ b/kernel/locking/rwbase_rt.c
> @@ -74,9 +106,11 @@ static int __sched __rwbase_read_lock(struct rwbase_rt *rwb,
>  	raw_spin_lock_irq(&rtm->wait_lock);
>  	/*
>  	 * Allow readers, as long as the writer has not completely
> -	 * acquired the semaphore for write.
> +	 * acquired the semaphore for write and reader bias is still
> +	 * allowed.
>  	 */
> -	if (atomic_read(&rwb->readers) != WRITER_BIAS) {
> +	if (atomic_read(&rwb->readers) != WRITER_BIAS &&
> +	    rwbase_allow_reader_bias(rwb)) {
>  		atomic_inc(&rwb->readers);
>  		raw_spin_unlock_irq(&rtm->wait_lock);
>  		return 0;
> @@ -264,12 +298,17 @@ static int __sched rwbase_write_lock(struct rwbase_rt *rwb,
>  		if (__rwbase_write_trylock(rwb))
>  			break;
>  
> +		/* Record first new read/write contention. */
> +		set_writer_blocked(rwb);
> +
>  		raw_spin_unlock_irqrestore(&rtm->wait_lock, flags);
>  		rwbase_schedule();
>  		raw_spin_lock_irqsave(&rtm->wait_lock, flags);
>  
>  		set_current_state(state);
>  	}
> +
> +	rwb->waiter_timeout = 0;

Regarding memory ordering and ordering in general:
- Should the writer leave rwbase_schedule() due to a signal, then
  set_writer_blocked() has set a timeout but it is not cleared on the
  signal exit path.

- There is only one writer in that for loop within rwbase_write_lock()
  because only one writer can own the rtmutex at a time. (A second writer
  blocks on the RTmutex and needs to wait; I may have spread some
  confusion earlier.) Therefore it should be okay to unconditionally set
  the timeout (instead of checking for zero).

- Once the writer removes READER_BIAS, it forces the reader into the
  slowpath. At that time the writer does not own the wait_lock, meaning
  the reader _could_ check the timeout before the writer had a chance to
  set it. The worst thing is probably that if jiffies does not have the
  highest bit set then it will always disable the reader bias here.
  The easiest thing is probably to check timeout vs 0 and ensure on the
  writer side that the lowest bit is always set (in the unlikely case it
  will end up as zero).

>  	rwbase_restore_current_state();
>  	trace_contention_end(rwb, 0);

Sebastian
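
A minimal userspace sketch of the three points above, only to illustrate
the idea; it is not the patch. Everything named here is hypothetical:
ticks_t stands in for jiffies, struct rwb_sketch for the relevant part of
struct rwbase_rt, the sketch_*() helpers for set_writer_blocked() and
rwbase_allow_reader_bias(), and the grace argument for the 4ms window.
The RT/DL reader exemption and all locking are left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's wrapping jiffies counter. */
typedef uint32_t ticks_t;

/* Hypothetical stand-in for the part of struct rwbase_rt discussed here. */
struct rwb_sketch {
	ticks_t waiter_timeout;	/* 0 means "no writer has set a timeout" */
};

/* Wrap-safe "a is after b", the same idea as the kernel's time_after(). */
static inline bool ticks_after(ticks_t a, ticks_t b)
{
	return (int32_t)(b - a) < 0;
}

/*
 * Writer side: set unconditionally (only one writer is in the loop) and
 * force the lowest bit, so the stored value can never be 0. The price is
 * at most one tick of error.
 */
static inline void sketch_set_writer_blocked(struct rwb_sketch *rwb,
					     ticks_t now, ticks_t grace)
{
	rwb->waiter_timeout = (now + grace) | 1;
}

/* Writer side: clear on every exit path, including the signal path. */
static inline void sketch_clear_writer_blocked(struct rwb_sketch *rwb)
{
	rwb->waiter_timeout = 0;
}

/* Reader side: 0 means "not set (yet)", so the reader bias stays on. */
static inline bool sketch_allow_reader_bias(const struct rwb_sketch *rwb,
					    ticks_t now)
{
	ticks_t timeout = rwb->waiter_timeout;

	if (timeout == 0)
		return true;
	return !ticks_after(now, timeout);
}

/* Small self-check covering the wrap-to-zero case from the last point. */
static inline void sketch_selfcheck(void)
{
	struct rwb_sketch rwb = { 0 };

	/* now + grace wraps to exactly 0; the forced low bit turns it into 1. */
	sketch_set_writer_blocked(&rwb, (ticks_t)-4, 4);
	assert(rwb.waiter_timeout == 1);
}
```

With the explicit zero check, a reader that reaches the slowpath before the
writer stored its timeout keeps the reader bias instead of having the result
depend on the highest bit of the counter.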