On Mon, Jan 16, 2023 at 11:16 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, 16 Jan 2023 21:10:37 +0000 Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> > On Mon, Jan 16, 2023 at 04:38:01AM +0000, Matthew Wilcox wrote:
> > > On Mon, Jan 16, 2023 at 11:16:13AM +0800, maobibo wrote:
> > > > Hongchen,
> > > >
> > > > I have had a glance at this patch; it simply replaces the
> > > > spinlock_irqsave with a mutex lock. There may be a performance
> > > > improvement with two processes competing for the pipe; however,
> > > > with N processes there will be complex context switches and
> > > > IPI interrupts.
> > > >
> > > > Can you find some cases with more than 2 processes competing
> > > > for the pipe, rather than only unixbench?
> > >
> > > What real applications have pipes with more than 1 writer & 1 reader?
> > > I'm OK with slowing down the weird cases if the common cases go faster.
> >
> > From commit 0ddad21d3e99c743a3aa473121dc5561679e26bb:
> >
> > While this isn't a common occurrence in the traditional "use a pipe as a
> > data transport" case, where you typically only have a single reader and
> > a single writer process, there is one common special case: using a pipe
> > as a source of "locking tokens" rather than for data communication.
> >
> > In particular, the GNU make jobserver code ends up using a pipe as a way
> > to limit parallelism, where each job consumes a token by reading a byte
> > from the jobserver pipe, and releases the token by writing a byte back
> > to the pipe.
>
> The author has tested this patch with Linus's test code from 0ddad21d3e
> and the results were OK
> (https://lkml.kernel.org/r/c3cbede6-f19e-3333-ba0f-d3f005e5d599@xxxxxxxxxxx).

Yesterday, I had some time to play with and without this patch on my
Debian/unstable AMD64 box.
[ TEST-CASE ]

BASE: Linux v6.2-rc4
PATCH: [PATCH v3] pipe: use __pipe_{lock,unlock} instead of spinlock
TEST-CASE: Taken from commit 0ddad21d3e99
RUN: gcc-12 -o 0ddad21d3e99 0ddad21d3e99.c

Link: https://lore.kernel.org/all/20230107012324.30698-1-zhanghongchen@xxxxxxxxxxx/
Link: https://git.kernel.org/linus/0ddad21d3e99

[ INSTRUCTIONS ]

echo 0 | sudo tee /proc/sys/kernel/kptr_restrict /proc/sys/kernel/perf_event_paranoid

10 runs: /usr/bin/perf stat --repeat=10 ./0ddad21d3e99

echo 1 | sudo tee /proc/sys/kernel/kptr_restrict /proc/sys/kernel/perf_event_paranoid

[ BEFORE ]

 Performance counter stats for './0ddad21d3e99' (10 runs):

         23.985,50 msec task-clock                #    3,246 CPUs utilized            ( +-  0,20% )
         1.112.822      context-switches          #   46,696 K/sec                    ( +-  0,30% )
           403.033      cpu-migrations            #   16,912 K/sec                    ( +-  0,28% )
             1.508      page-faults               #   63,278 /sec                     ( +-  2,95% )
    39.436.000.959      cycles                    #    1,655 GHz                      ( +-  0,22% )
    29.364.329.413      stalled-cycles-frontend   #   74,91% frontend cycles idle     ( +-  0,24% )
    22.139.448.400      stalled-cycles-backend    #   56,48% backend cycles idle      ( +-  0,23% )
    18.565.538.523      instructions              #    0,47  insn per cycle
                                                  #    1,57  stalled cycles per insn  ( +-  0,17% )
     4.059.885.546      branches                  #  170,359 M/sec                    ( +-  0,17% )
        59.991.226      branch-misses             #    1,48% of all branches          ( +-  0,19% )

            7,3892 +- 0,0127 seconds time elapsed  ( +-  0,17% )

[ AFTER ]

 Performance counter stats for './0ddad21d3e99' (10 runs):

         24.175,94 msec task-clock                #    3,362 CPUs utilized            ( +-  0,11% )
         1.139.152      context-switches          #   47,119 K/sec                    ( +-  0,12% )
           407.994      cpu-migrations            #   16,876 K/sec                    ( +-  0,26% )
             1.555      page-faults               #   64,319 /sec                     ( +-  3,11% )
    40.904.849.091      cycles                    #    1,692 GHz                      ( +-  0,13% )
    30.587.623.034      stalled-cycles-frontend   #   74,84% frontend cycles idle     ( +-  0,15% )
    23.145.533.537      stalled-cycles-backend    #   56,63% backend cycles idle      ( +-  0,16% )
    18.762.964.037      instructions              #    0,46  insn per cycle
                                                  #    1,63  stalled cycles per insn  ( +-  0,11% )
     4.057.182.849      branches                  #  167,817 M/sec                    ( +-  0,09% )
        63.887.806      branch-misses             #    1,58% of all branches          ( +-  0,25% )

           7,19157 +- 0,00644 seconds time elapsed  ( +-  0,09% )

[ RESULT ]

seconds time elapsed: -2,67%

The test-case C file is attached, and in case the above lines get
truncated, I have attached them as a README file as well.

Feel free to add a...

Tested-by: Sedat Dilek <sedat.dilek@xxxxxxxxx> # LLVM v15.0.3 (x86-64)

If you need further information, please let me know.

-Sedat-

> I've been stalling on this patch until Linus gets back to his desk,
> which now appears to have happened.
>
> Hongchen, when convenient, please capture this discussion (as well as
> the testing results with Linus's sample code) in the changelog and send
> us a v4, with Linus on cc?
>
// Test-case: Benchmark pipe performance
// Author: Linus Torvalds
// Link: https://git.kernel.org/linus/0ddad21d3e99
//
// Compile: gcc-12 -o 0ddad21d3e99 0ddad21d3e99.c
//
#include <unistd.h>

int main(int argc, char **argv)
{
	int fd[2], counters[2];

	pipe(fd);
	counters[0] = 0;
	counters[1] = -1;
	write(fd[1], counters, sizeof(counters));

	/* 64 processes */
	fork(); fork(); fork(); fork(); fork(); fork();

	do {
		int i;
		read(fd[0], &i, sizeof(i));
		if (i < 0)
			continue;
		counters[0] = i+1;
		write(fd[1], counters, (1+(i & 1)) * sizeof(int));
	} while (counters[0] < 1000000);

	return 0;
}
Attachment:
README_zhanghongchen-pipe-v3-0ddad21d3e99
Description: Binary data