Hello Legion,
On 3/3/2025 6:30 PM, Alexey Gladkov wrote:
> On Tue, Feb 25, 2025 at 12:57:37PM +0100, Oleg Nesterov wrote:
> > On 02/24, Oleg Nesterov wrote:
> >
> > Just in case, did you use
> > https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git/tree/src/hackbench/hackbench.c
> > ?
> >
> > Or did you use another version? Exactly what parameters did you use?
> >
> > If possible, please reproduce the hang again. How many threads/processes
> > sleeping in pipe_read() or pipe_write() do you see? (you can look at
> > /proc/$pid/stack).
> >
> > Please pick one sleeping writer, and do
> >
> > 	$ strace -p pid_of_that_writer
> >
> > this should wake this writer up. If a missed wakeup is the only problem,
> > hackbench should continue.
> >
> > The more info you can provide the better ;)
>
> I was also able to reproduce the hackbench hang with the parameters
> mentioned earlier (threads and processes) on the kernel from master.

Thank you for reporting your observations!
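
In case it is useful while reproducing: below is a rough sketch (not
something from this thread; the buffer sizes are arbitrary) of a small C
program that scans /proc/<pid>/stack for tasks sleeping in pipe_write(),
along the lines of what Oleg suggests above. Reading /proc/<pid>/stack
requires root.

/* Sketch only: walk /proc and report tasks whose kernel stack shows
 * pipe_write(), i.e. writers sleeping on a (seemingly) full pipe. */
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	DIR *proc = opendir("/proc");
	struct dirent *de;

	if (!proc)
		return 1;

	while ((de = readdir(proc)) != NULL) {
		char path[288], line[256];
		FILE *f;

		/* only numeric entries are pids */
		if (!isdigit((unsigned char)de->d_name[0]))
			continue;

		snprintf(path, sizeof(path), "/proc/%s/stack", de->d_name);
		f = fopen(path, "r");
		if (!f)
			continue;

		while (fgets(line, sizeof(line), f)) {
			if (strstr(line, "pipe_write")) {
				printf("pid %s sleeping in pipe_write()\n",
				       de->d_name);
				break;
			}
		}
		fclose(f);
	}
	closedir(proc);
	return 0;
}

Once a stuck writer's pid is known, attaching with "strace -p <pid>"
should interrupt and restart its sleep in pipe_write(), so the condition
gets re-evaluated with fresh reads; if a missed wakeup is the only
problem, that should unstick it, as Oleg notes.
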
If you can reproduce it reliably, could you please try the diff below,
posted by Swapnil in the parallel thread [1]:

diff --git a/fs/pipe.c b/fs/pipe.c
index ce1af7592780..a1931c817822 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -417,9 +417,19 @@ static inline int is_packetized(struct file *file)
 /* Done while waiting without holding the pipe lock - thus the READ_ONCE() */
 static inline bool pipe_writable(const struct pipe_inode_info *pipe)
 {
-	unsigned int head = READ_ONCE(pipe->head);
-	unsigned int tail = READ_ONCE(pipe->tail);
 	unsigned int max_usage = READ_ONCE(pipe->max_usage);
+	unsigned int head, tail;
+
+	tail = READ_ONCE(pipe->tail);
+	/*
+	 * Since the unsigned arithmetic in this lockless preemptible context
+	 * relies on the fact that the tail can never be ahead of head, read
+	 * the head after the tail to ensure we've not missed any updates to
+	 * the head. Reordering the reads can cause wraparounds and give the
+	 * illusion that the pipe is full.
+	 */
+	smp_rmb();
+	head = READ_ONCE(pipe->head);
 
 	return !pipe_full(head, tail, max_usage) ||
 	       !READ_ONCE(pipe->readers);
---
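
To make the failure mode described in the comment concrete, here is a
minimal userspace sketch. It is not kernel code: plain unsigned counters
stand in for pipe->head/pipe->tail, and MAX_USAGE plus the values
100/105/110 are made up, but the head - tail arithmetic is the same one
pipe_occupancy() uses.

#include <stdio.h>

#define MAX_USAGE 16u	/* assumed ring size for the example */

static unsigned int occupancy(unsigned int head, unsigned int tail)
{
	/* same unsigned arithmetic as the kernel's pipe_occupancy() */
	return head - tail;
}

int main(void)
{
	/*
	 * 1. The sleeping writer re-checks pipe_writable() locklessly,
	 *    and the CPU reads 'head' first: a stale snapshot.
	 */
	unsigned int stale_head = 100;

	/*
	 * 2. Before 'tail' is read, other writers push the real head to
	 *    110 and readers consume entries up to tail = 105. The writer
	 *    now pairs the stale head with this newer tail.
	 */
	unsigned int fresh_tail = 105;

	/*
	 * 3. The tail appears to be ahead of the head, so head - tail
	 *    wraps around and the pipe looks full.
	 */
	unsigned int occ = occupancy(stale_head, fresh_tail);

	printf("occupancy = %u, limit = %u -> pipe looks %s\n",
	       occ, MAX_USAGE, occ >= MAX_USAGE ? "full" : "writable");
	/* prints: occupancy = 4294967291, limit = 16 -> pipe looks full */
	return 0;
}

Because pipe_write() goes back to sleep whenever pipe_writable() is
false, one such wrapped read is enough to leave the writer blocked
forever. Reading the tail first, with the smp_rmb() ordering the two
loads, avoids this: the real tail never passes the real head, so a head
value read after the tail cannot make head - tail wrap.
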
We've been running hackbench with the above diff for a while now and
haven't run into a hang yet. Sorry for the trouble, and thank you again.

[1] https://lore.kernel.org/all/03a1f4af-47e0-459d-b2bf-9f65536fc2ab@xxxxxxx/
--
Thanks and Regards,
Prateek