Re: [PATCH 1/2] pipe: change pipe_write() to never add a zero-sized buffer

K Prateek Nayak <kprateek.nayak@xxxxxxx> · Tue, 11 Feb 2025 09:29:02 +0530

Hello Oleg,

On 2/10/2025 10:52 PM, Oleg Nesterov wrote:
Hi Prateek,

On 02/10, K Prateek Nayak wrote:

  1-groups     1.00 [ -0.00]( 7.19)                0.95 [  4.90](12.39)
  2-groups     1.00 [ -0.00]( 3.54)                1.02 [ -1.92]( 6.55)
  4-groups     1.00 [ -0.00]( 2.78)                1.01 [ -0.85]( 2.18)
  8-groups     1.00 [ -0.00]( 1.04)                0.99 [  0.63]( 0.77)
16-groups     1.00 [ -0.00]( 1.02)                1.00 [ -0.26]( 0.98)

I don't see any regression / improvements from a performance standpoint

Yes, this patch shouldn't make any difference performance-wise, at least
in this case. Although I was thinking the same thing when I sent "pipe_read:
don't wake up the writer if the pipe is still full" ;)

Tested-by: K Prateek Nayak <kprateek.nayak@xxxxxxx>

Thanks! Please see v2, I've included your tag.

Thank you. I can confirm it is the same as the variant I tested.


Any chance you can also test the patch below?

To me it looks like a cleanup which makes the "merge small writes" logic
more understandable. And note that "page-align the rest of the writes"
doesn't work anyway if "total_len & (PAGE_SIZE-1)" can't fit in the last
buffer.

However, in this particular case with DATASIZE=100 this patch can increase
the number of copy_page_from_iter()'s in pipe_write(). And with this change
receiver() can certainly get the short reads, so this can increase the
number of sys_read() calls.

So I am just curious if this change can cause any noticeable regression on
your machine.
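
To make the DATASIZE=100 case concrete, here is a rough userspace model
of the two merge rules. It is only a sketch, not kernel code: it assumes
a 4096-byte page, no reader draining the pipe, an unbounded ring, and
that the last buffer always has PIPE_BUF_FLAG_CAN_MERGE set. The names
count_copies() and PAGE_SZ are made up for the illustration.

```c
#define PAGE_SZ 4096u	/* assumed page size, illustration only */

/*
 * Hypothetical model: count the copy operations needed for 'nwrites'
 * writes of 'len' bytes each, with no reader.
 * split_merge == 0: old rule, merge only if the whole tail fits.
 * split_merge != 0: new rule, merge whatever fits in the last buffer.
 */
static unsigned long count_copies(unsigned int len, unsigned int nwrites,
				  int split_merge)
{
	unsigned int used = 0;	/* offset + len of the last buffer */
	int have_buf = 0;	/* models !was_empty */
	unsigned long copies = 0;
	unsigned int i;

	for (i = 0; i < nwrites; i++) {
		unsigned int left = len;
		unsigned int chars = left & (PAGE_SZ - 1);

		if (chars && have_buf) {
			unsigned int avail = PAGE_SZ - used;

			if (split_merge) {
				if (avail) {
					/* new rule: fill the remaining room */
					if (chars > avail)
						chars = avail;
					used += chars;
					left -= chars;
					copies++;
				}
			} else if (used + chars <= PAGE_SZ) {
				/* old rule: all-or-nothing merge */
				used += chars;
				left -= chars;
				copies++;
			}
		}

		/* whatever is left goes into fresh page-sized buffers */
		while (left) {
			unsigned int n = left < PAGE_SZ ? left : PAGE_SZ;

			used = n;
			have_buf = 1;
			left -= n;
			copies++;
		}
	}
	return copies;
}
```

In this model, after 40 writes of 100 bytes the last buffer holds 4000
bytes. The 41st write takes one copy into a new buffer under the old
rule, but two copies under the new rule (96 bytes merged, 4 bytes in a
new buffer), which is the extra copy_page_from_iter() mentioned above.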

For the sake of science:

==================================================================
Test          : sched-messaging
Units         : Normalized time in seconds
Interpretation: Lower is better
Statistic     : AMean
==================================================================
Case:         baseline[pct imp](CV)  merge_writes[pct imp](CV)
 1-groups     1.00 [ -0.00](12.39)     1.08 [ -7.62](11.73)
 2-groups     1.00 [ -0.00]( 6.55)     0.97 [  2.52]( 3.01)
 4-groups     1.00 [ -0.00]( 2.18)     1.00 [  0.42]( 1.97)
 8-groups     1.00 [ -0.00]( 0.77)     1.03 [ -3.35]( 5.07)
16-groups     1.00 [ -0.00]( 0.98)     1.01 [ -1.37]( 2.20)

I see some improvements up until 4 groups (160 tasks), but beyond that it
goes into slight regression territory. That said, the variance is too
large to draw any conclusions.

Science experiment concluded.


Thank you!

Oleg.

--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -459,16 +459,16 @@ anon_pipe_write(struct kiocb *iocb, struct iov_iter *from)
  	was_empty = pipe_empty(head, pipe->tail);
  	chars = total_len & (PAGE_SIZE-1);
  	if (chars && !was_empty) {
-		unsigned int mask = pipe->ring_size - 1;
-		struct pipe_buffer *buf = &pipe->bufs[(head - 1) & mask];
+		struct pipe_buffer *buf = pipe_buf(pipe, head - 1);
  		int offset = buf->offset + buf->len;
+		int avail = PAGE_SIZE - offset;
  
-		if ((buf->flags & PIPE_BUF_FLAG_CAN_MERGE) &&
-		    offset + chars <= PAGE_SIZE) {
+		if (avail && (buf->flags & PIPE_BUF_FLAG_CAN_MERGE)) {
  			ret = pipe_buf_confirm(pipe, buf);
  			if (ret)
  				goto out;
  
+			chars = min_t(ssize_t, chars, avail);
  			ret = copy_page_from_iter(buf->page, offset, chars, from);
  			if (unlikely(ret < chars)) {
  				ret = -EFAULT;


--
Thanks and Regards,
Prateek