Showing my ignorance - kernel workers

All,
I apologize in advance — please point me to something I can read about mdraid other than the source code; I'm beyond the bounds of my understanding of Linux.  Background: I do a lot of NUMA-aware computing.  I have two systems configured identically, each with one RAID5 LUN built from NUMA node 0 NVMe drives and a second, identically configured RAID5 LUN built from NUMA node 1 NVMe drives.  Each LUN is 9+1 NVMe with a 128KB stripe, XFS sitting on top, and 64KB O_DIRECT reads from the application.
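For reference, a sketch of how each LUN is laid out (device names are placeholders — substitute the actual member drives on each node; the 9+1 geometry and 128KB chunk match the description above):

```shell
# Hypothetical reconstruction of one LUN: 10 NVMe devices in RAID5
# (9 data + 1 parity per stripe), 128KB chunk, XFS on top.
mdadm --create /dev/md0 --level=5 --raid-devices=10 --chunk=128 \
    /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 \
    /dev/nvme5n1 /dev/nvme6n1 /dev/nvme7n1 /dev/nvme8n1 /dev/nvme9n1
mkfs.xfs /dev/md0
```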

On one system, the kernel worker thread for each of the two MDs runs on the NUMA node where its drives are located, yet on the second system both sit on NUMA node 0.  I'm speculating that I could get more consistent performance from the identical LUNs if I could tie each kernel worker to the proper NUMA domain.  Is my speculation accurate?  If so, how might I go about it, or is this a feature request???
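For what it's worth, here is what I've been considering trying — a sketch only, untested on these boxes, with PID lookups and sysfs paths as my assumptions: pin the per-array mdX_raid5 thread with taskset, and (since raid5wq appears to be an unbound workqueue registered with WQ_SYSFS) restrict its allowed CPUs via its sysfs cpumask.

```shell
# CPU list for NUMA node 0, as exported by the kernel.
CPUS=$(cat /sys/devices/system/node/node0/cpulist)   # e.g. "0-63"

# Pin the md0_raid5 thread to node 0's CPUs, if it exists here (needs root).
# Note: this affinity would not persist across an array stop/start.
PID=$(pgrep -x md0_raid5 || true)
if [ -n "$PID" ]; then
    taskset -pc "$CPUS" "$PID"
fi

# raid5wq is an unbound workqueue; on kernels where it is created with
# WQ_SYSFS, its allowed CPUs can be narrowed via a hex mask (placeholder
# value shown — compute the mask for the node's CPUs):
# echo <hexmask> > /sys/devices/virtual/workqueue/raid5wq/cpumask
```

But even if this works per-array, I don't see how it gives me a per-node split of the single shared raid5wq, which is why I'm asking the list.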

Both systems are running the same kernel on top of RHEL8.
uname -r
5.15.13-1.el8.elrepo.x86_64

System 1:

# ps -eo pid,tid,class,rtprio,ni,pri,numa,psr,pcpu,stat,wchan,comm | head -1; ps -eo pid,tid,class,rtprio,ni,pri,numa,psr,pcpu,stat,wchan,comm  | egrep 'md|raid' | grep -v systemd | grep -v mlx
    PID     TID CLS RTPRIO  NI PRI NUMA PSR %CPU STAT WCHAN  COMMAND
   1559    1559 TS       -   5  14    1 244  0.0 SN   -      ksmd
   1627    1627 TS       - -20  39    1 196  0.0 I<   -      md
   3734    3734 TS       - -20  39    1 110  0.0 I<   -      raid5wq
   3752    3752 TS       -   0  19    0  22 10.5 S    -      md0_raid5
   3753    3753 TS       -   0  19    1 208 11.4 S    -      md1_raid5
   3838    3838 TS       -   0  19    0  57  0.0 Ss   -      lsmd

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.65    0.00    5.43    0.28    0.00   93.63

Device            r/s     w/s     rkB/s     wkB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
md0           1263604.00    0.00 62411724.00      0.00     0.00     0.00   0.00   0.00    1.94    0.00 2451.89    49.39     0.00   0.00 100.00
md1           1116529.00    0.00 55157228.00      0.00     0.00     0.00   0.00   0.00    2.45    0.00 2733.76    49.40     0.00   0.00 100.00

System 2:

# ps -eo pid,tid,class,rtprio,ni,pri,numa,psr,pcpu,stat,wchan,comm | head -1; ps -eo pid,tid,class,rtprio,ni,pri,numa,psr,pcpu,stat,wchan,comm  | egrep 'md|raid' | grep -v systemd | grep -v mlx
    PID     TID CLS RTPRIO  NI PRI NUMA PSR %CPU STAT WCHAN  COMMAND
   1492    1492 TS       -   5  14    1 200  0.0 SN   -      ksmd
   1560    1560 TS       - -20  39    1 200  0.0 I<   -      md
   3810    3810 TS       - -20  39    0 137  0.0 I<   -      raid5wq
   3811    3811 TS       -   0  19    0 148  0.0 S    -      md0_raid5
   3824    3824 TS       -   0  19    0 167  0.0 S    -      md1_raid5
   3929    3929 TS       -   0  19    1 115  0.0 Ss   -      lsmd

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.58    0.00    5.61    0.29    0.00   93.51

Device            r/s     w/s     rkB/s     wkB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
md0           1118252.00    0.00 55171048.00      0.00     0.00     0.00   0.00   0.00    1.79    0.00 2002.27    49.34     0.00   0.00 100.00
md1           1262715.00    0.00 62342424.00      0.00     0.00     0.00   0.00   0.00    0.61    0.00 769.19    49.37     0.00   0.00 100.00


Jim Finlayson
U.S. Department of Defense

