Dear NFS fellows,

During HPC workloads we notice that the Linux NFSv4.2/pNFS client demonstrates unexpectedly low performance. The application opens 55 files in parallel and reads the data with multiple threads. The server issues flexfile layouts with tightly coupled NFSv4.1 DSes.

Observations:

- despite the 1MB rsize/wsize returned by the layout, the client never issues reads bigger than 512k (often much smaller)
- the client always uses slot 0 on the DS, and
- reads happen sequentially, i.e. there is only one in-flight READ request
- subsequent reads often just read the next 512k block
- if a simple dd is called instead of the parallel application, then multiple slots are used and 1MB READs are sent

$ dd if=/pnfs/xxxx/00054.h5 of=/dev/null
45753381+1 records in
45753381+1 records out
23425731171 bytes (23 GB, 22 GiB) copied, 69.702 s, 336 MB/s

The client has 80 cores on 2 sockets, 512GB of RAM and runs RHEL 9.4.

$ uname -r
5.14.0-427.26.1.el9_4.x86_64

$ free -g
               total        used        free      shared  buff/cache   available
Mem:             503          84         392           0          29         419

$ lscpu | head
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Address sizes:       46 bits physical, 48 bits virtual
Byte Order:          Little Endian
CPU(s):              80
On-line CPU(s) list: 0-79
Vendor ID:           GenuineIntel
BIOS Vendor ID:      Intel(R) Corporation
Model name:          Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz
BIOS Model name:     Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz

The client and all DSes are equipped with 10GB/s NICs.

Any ideas where to look?

Best regards,
   Tigran.
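
For reference, the access pattern described above (many files, each read by its own thread) can be approximated with a short script. This is only a minimal sketch, not the actual application: the one-thread-per-file split, the 1 MiB read size and the names read_file/read_all are assumptions made for illustration, and it uses plain buffered pread() calls.

```python
#!/usr/bin/env python3
"""Approximate the workload: several files read concurrently by threads."""
import os
import sys
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1024 * 1024  # assumed request size: 1 MiB, same as the layout's rsize


def read_file(path, chunk=CHUNK):
    """Read one file front to back with pread() and return the byte count."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            buf = os.pread(fd, chunk, total)  # read at explicit offset
            if not buf:                       # empty result means EOF
                break
            total += len(buf)
    finally:
        os.close(fd)
    return total


def read_all(paths, threads=None):
    """Read all files concurrently; one worker thread per file by default."""
    paths = list(paths)
    if not paths:
        return 0
    workers = threads or len(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(read_file, paths))


if __name__ == "__main__":
    print(read_all(sys.argv[1:]), "bytes read")
```

Saved under a hypothetical name such as readers.py, it could be run as "python3 readers.py /pnfs/xxxx/*.h5" while capturing traffic on the client, to compare the READ sizes and slot usage against the dd case.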