On Tue, 11 Mar 2025 14:21:37 +0100 Bartosz Golaszewski <brgl@xxxxxxxx> wrote: > On Tue, Mar 11, 2025 at 1:30 PM David Jander <david@xxxxxxxxxxx> wrote: > > > > On Tue, 11 Mar 2025 12:45:51 +0100 > > Bartosz Golaszewski <brgl@xxxxxxxx> wrote: > > > > > On Tue, Mar 11, 2025 at 11:01 AM David Jander <david@xxxxxxxxxxx> wrote: > > > > > > > > On kernel 6.13, after git revert -n fcc8b637c542 time is back to what it was > > > > on 6.12. > > > > > > > > > > Interestingly: I cannot reproduce it. Obviously gpiofind doesn't exist > > > in libgpiod v2 but I'm running gpiodetect with and without reverting > > > these changes and am getting roughly the same results: ~0.050s real > > > time for 1 up to 4 chips. > > > > > > Any idea why that could be? Can you reproduce it with libgpiod v2 (I > > > don't know why that wouldn't be the case but worth double checking). > > > > > > Can you describe your platform? Is it a multi-core or single-core CPU? What > > RCU implementation does it use? Tree or tiny? If it is multi-core, is there a > > difference if you disable all but one core? > > Maybe some kernel CONFIG option that makes a difference? I am not an expert in > > RCU (in fact I barely know what it does), so maybe I am missing something that > > makes this problem go away? > > > > I'm testing on a qemu VM - SMP and single core. RCU algo is tree. I haven't followed deep into the RCU code, but I had the impression that the synchronize_rcu code sets a completion and waits for it, so I suspect the bulk of the delay coming from context switches. Maybe on qemu this behaves differently than on real hardware in terms of timing and overhead? Btw, there is also another way this can be made visible: $ strace -r gpiodetect 0.000000 execve("/usr/bin/gpiodetect", ["gpiodetect"], 0x7eec5d94 /* 13 vars */) = 0 0.017348 brk(NULL) = 0xc21000 0.000689 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) ... 0.000677 openat(AT_FDCWD, "/dev/gpiochip9", O_RDWR|O_CLOEXEC) = 18 0.000741 statx(AT_FDCWD, "/dev/gpiochip9", AT_STATX_SYNC_AS_STAT|AT_SYMLINK_NOFOLLOW|AT_NO_AUTOMOUNT, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFCHR|0600, stx_size=0, ...}) = 0 0.000794 statx(AT_FDCWD, "/dev/gpiochip9", AT_STATX_SYNC_AS_STAT|AT_NO_AUTOMOUNT, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFCHR|0600, stx_size=0, ...}) = 0 0.000794 access("/sys/bus/gpio/devices/gpiochip9/dev", R_OK) = 0 0.000738 openat(AT_FDCWD, "/sys/bus/gpio/devices/gpiochip9/dev", O_RDONLY) = 19 0.000754 read(19, "254:9\n", 15) = 6 0.000597 close(19) = 0 0.000569 ioctl(18, GPIO_GET_CHIPINFO_IOCTL, {name="gpiochip9", label="GPIOJ", lines=16}) = 0 0.000631 statx(1, "", AT_STATX_SYNC_AS_STAT|AT_NO_AUTOMOUNT|AT_EMPTY_PATH, STATX_BASIC_STATS, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFCHR|0600, stx_size=0, ...}) = 0 0.000778 ioctl(1, TCGETS, {c_iflag=ICRNL|IXON|IXOFF|IUTF8, c_oflag=NL0|CR0|TAB0|BS0|VT0|FF0|OPOST|ONLCR, c_cflag=B1500000|CS8|CREAD|HUPCL|CLOCAL, c_lflag=ISIG|ICANON|ECHO|ECHOE|ECHOK|ECHOKE, ...}) = 0 0.001941 write(1, "gpiochip0 [GPIOA] (16 lines)\n", 29gpiochip0 [GPIOA] (16 lines) ) = 29 0.000670 close(3) = 0 0.006162 write(1, "gpiochip1 [GPIOB] (16 lines)\n", 29gpiochip1 [GPIOB] (16 lines) ) = 29 0.000655 close(4) = 0 0.006250 write(1, "gpiochip10 [GPIOK] (8 lines)\n", 29gpiochip10 [GPIOK] (8 lines) ) = 29 0.000667 close(5) = 0 0.006338 write(1, "gpiochip11 [GPIOZ] (8 lines)\n", 29gpiochip11 [GPIOZ] (8 lines) ) = 29 0.000661 close(6) = 0 0.006321 write(1, "gpiochip12 [unknown] (22 lines)\n", 32gpiochip12 [unknown] (22 lines) ) = 32 0.000662 close(7) = 0 0.006365 write(1, "gpiochip13 [mcp23s17.0] (16 line"..., 35gpiochip13 [mcp23s17.0] (16 lines) ) = 35 0.000672 close(8) = 0 0.006389 write(1, "gpiochip14 [0-0020] (16 lines)\n", 31gpiochip14 [0-0020] (16 lines) ) = 31 0.000641 close(9) = 0 0.006268 write(1, "gpiochip15 [0-0021] (16 lines)\n", 31gpiochip15 [0-0021] (16 lines) ) = 31 0.000677 close(10) = 0 0.006330 write(1, "gpiochip2 [GPIOC] (16 lines)\n", 29gpiochip2 [GPIOC] (16 lines) ) = 29 0.000648 close(11) = 0 0.006358 write(1, "gpiochip3 [GPIOD] (16 lines)\n", 29gpiochip3 [GPIOD] (16 lines) ) = 29 0.000655 close(12) = 0 0.006333 write(1, "gpiochip4 [GPIOE] (16 lines)\n", 29gpiochip4 [GPIOE] (16 lines) ) = 29 0.000669 close(13) = 0 0.006332 write(1, "gpiochip5 [GPIOF] (16 lines)\n", 29gpiochip5 [GPIOF] (16 lines) ) = 29 0.000653 close(14) = 0 0.006359 write(1, "gpiochip6 [GPIOG] (16 lines)\n", 29gpiochip6 [GPIOG] (16 lines) ) = 29 0.000653 close(15) = 0 0.006332 write(1, "gpiochip7 [GPIOH] (16 lines)\n", 29gpiochip7 [GPIOH] (16 lines) ) = 29 0.000657 close(16) = 0 0.006344 write(1, "gpiochip8 [GPIOI] (16 lines)\n", 29gpiochip8 [GPIOI] (16 lines) ) = 29 0.000661 close(17) = 0 0.006343 write(1, "gpiochip9 [GPIOJ] (16 lines)\n", 29gpiochip9 [GPIOJ] (16 lines) ) = 29 0.000654 close(18) = 0 0.006394 exit_group(0) = ? 0.001335 +++ exited with 0 +++ I hope this is readable. Notice the long relative time-stamp of the syscall immediately preceding each of the close() calls that close a gpiochip chardev. For comparison, here's the output after applying your patch: ... 0.000474 close(3) = 0 0.000472 write(1, "gpiochip1 [GPIOB] (16 lines)\n", 29gpiochip1 [GPIOB] (16 lines) ) = 29 0.000426 close(4) = 0 0.000406 write(1, "gpiochip10 [GPIOK] (8 lines)\n", 29gpiochip10 [GPIOK] (8 lines) ) = 29 0.000418 close(5) = 0 0.000354 write(1, "gpiochip11 [GPIOZ] (8 lines)\n", 29gpiochip11 [GPIOZ] (8 lines) ) = 29 0.000454 close(6) = 0 0.000359 write(1, "gpiochip12 [unknown] (22 lines)\n", 32gpiochip12 [unknown] (22 lines) ) = 32 0.000457 close(7) = 0 0.000433 write(1, "gpiochip13 [mcp23s17.0] (16 line"..., 35gpiochip13 [mcp23s17.0] (16 lines) ) = 35 0.000412 close(8) = 0 0.000422 write(1, "gpiochip14 [0-0020] (16 lines)\n", 31gpiochip14 [0-0020] (16 lines) ) = 31 0.000416 close(9) = 0 0.000416 write(1, "gpiochip15 [0-0021] (16 lines)\n", 31gpiochip15 [0-0021] (16 lines) ) = 31 0.000408 close(10) = 0 0.000358 write(1, "gpiochip2 [GPIOC] (16 lines)\n", 29gpiochip2 [GPIOC] (16 lines) ) = 29 0.000441 close(11) = 0 0.000359 write(1, "gpiochip3 [GPIOD] (16 lines)\n", 29gpiochip3 [GPIOD] (16 lines) ) = 29 0.000443 close(12) = 0 0.000359 write(1, "gpiochip4 [GPIOE] (16 lines)\n", 29gpiochip4 [GPIOE] (16 lines) ) = 29 0.000394 close(13) = 0 0.000413 write(1, "gpiochip5 [GPIOF] (16 lines)\n", 29gpiochip5 [GPIOF] (16 lines) ) = 29 0.000396 close(14) = 0 0.000479 write(1, "gpiochip6 [GPIOG] (16 lines)\n", 29gpiochip6 [GPIOG] (16 lines) ) = 29 0.000431 close(15) = 0 0.000431 write(1, "gpiochip7 [GPIOH] (16 lines)\n", 29gpiochip7 [GPIOH] (16 lines) ) = 29 0.000471 close(16) = 0 0.000381 write(1, "gpiochip8 [GPIOI] (16 lines)\n", 29gpiochip8 [GPIOI] (16 lines) ) = 29 0.000484 close(17) = 0 0.000374 write(1, "gpiochip9 [GPIOJ] (16 lines)\n", 29gpiochip9 [GPIOJ] (16 lines) ) = 29 0.000480 close(18) = 0 0.000418 exit_group(0) = ? 0.001373 +++ exited with 0 +++ > In > any case: I've just sent you an RFT patch that switches to using the > raw notifier. Could you see what results you're getting with it? Thanks! This was quick :-) Like I also replied to the patch email (kernel 6.13 and libgpiod 1.6.3): $ time gpiofind LPOUT0 gpiochip7 9 real 0m 0.02s user 0m 0.00s sys 0m 0.01s $ time gpiodetect gpiochip0 [GPIOA] (16 lines) gpiochip1 [GPIOB] (16 lines) gpiochip10 [GPIOK] (8 lines) gpiochip11 [GPIOZ] (8 lines) gpiochip12 [unknown] (22 lines) gpiochip13 [mcp23s17.0] (16 lines) gpiochip14 [0-0020] (16 lines) gpiochip15 [0-0021] (16 lines) gpiochip2 [GPIOC] (16 lines) gpiochip3 [GPIOD] (16 lines) gpiochip4 [GPIOE] (16 lines) gpiochip5 [GPIOF] (16 lines) gpiochip6 [GPIOG] (16 lines) gpiochip7 [GPIOH] (16 lines) gpiochip8 [GPIOI] (16 lines) gpiochip9 [GPIOJ] (16 lines) real 0m 0.03s user 0m 0.00s sys 0m 0.01s Best regards, -- David Jander