Hi everybody! I am very happy to be writing my first email to a Linux mailing list. I have read the FAQ and I know this list is not a user help desk, but I am seeing strange behaviour with memory writeback and NFS, and maybe someone can help me. I am sorry if this is not the right "forum".

I ran three simple tests writing to the same NFS filesystem, and the CPU and memory behaviour is melting my brain.

The environment:

- Linux Red Hat 8.6, 2 vCPUs (VMware VM) and 8 GB RAM (same behaviour on Red Hat 7.9)
- The same NFS filesystem mounted twice, with and without the sync option:

1x.1x.2xx.1xx:/test_fs on /mnt/test_fs_with_sync type nfs (rw,relatime,sync,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=1x.1x.2xx.1xx,mountvers=3,mountport=2050,mountproto=udp,local_lock=none,addr=1x.1x.2xx.1xx)

1x.1x.2xx.1xx:/test_fs on /mnt/test_fs_without_sync type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=1x.1x.2xx.1xx,mountvers=3,mountport=2050,mountproto=udp,local_lock=none,addr=1x.1x.2xx.1xx)

- The link between the NFS client and the NFS server is 10 Gb (fiber), and iperf3 shows the link running at full speed. No problems here.

I know there are NFS options such as nconnect to improve performance, but here I am interested in the Linux kernel internals.

The tests:

1.- dd in /mnt/test_fs_without_sync

dd if=/dev/zero of=test.out bs=1M count=5000
5000+0 records in
5000+0 records out
5242880000 bytes (5.2 GB, 4.9 GiB) copied, 21.4122 s, 245 MB/s

* High iowait
* High NFS latency
* Writeback in use

Evidence:
https://zerobin.net/?43f9bea1953ed7aa#TaUk+K0GDhxjPq1EgJ2aAHgEyhntQ0NQzeFF51d9qI0=
https://i.stack.imgur.com/pTong.png

2.- dd in /mnt/test_fs_with_sync

dd if=/dev/zero of=test.out bs=1M count=5000
5000+0 records in
5000+0 records out
5242880000 bytes (5.2 GB, 4.9 GiB) copied, 35.6462 s, 147 MB/s

* High iowait
* Low NFS latency
* No writeback

Evidence:
https://zerobin.net/?0ce52c5c5d946d7a#ZeyjHFIp7B+K+65DX2RzEGlp+Oq9rCidAKL8RpKpDJ8=
https://i.stack.imgur.com/Pf1xS.png

3.- dd in /mnt/test_fs_with_sync with oflag=direct

dd if=/dev/zero of=test.out bs=1M oflag=direct count=5000
5000+0 records in
5000+0 records out
5242880000 bytes (5.2 GB, 4.9 GiB) copied, 34.6491 s, 151 MB/s

* Low iowait
* Low NFS latency
* No writeback

Evidence:
https://zerobin.net/?03c4aa040a7a5323#bScEK36+Sdcz18VwKnBXNbOsi/qFt/O+qFyNj5FUs8k=
https://i.stack.imgur.com/Qs6y5.png

The questions:

I know writeback is an old issue in Linux, and it seems to be the problem here. I played with vm.dirty_background_bytes/vm.dirty_bytes and vm.dirty_background_ratio/vm.dirty_ratio (I know only the bytes or the ratio variant of each pair can be active at a time), but whatever values I put in these tunables I always see iowait, except with dd oflag=direct. A sketch of what I tried is in the P.S. below.

- In test 2, how can there be low NFS latency but high iowait?
- In test 2, how can the code path be almost the same as in test 1? Test 2 writes to an NFS filesystem mounted with the sync option, yet it still seems to use the page cache code path (see the flame graph).
- In test 1, why does the iowait behaviour not change when the vm.dirty_* tunables are changed? (I have tested a lot of combinations.)

I am happy to rerun anything or collect more data; a couple of command sketches are in the P.S. below.

Thank you very much!

Best regards.
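
P.S. For reference, this is roughly how I have been playing with the dirty-page tunables and watching writeback while dd runs. The values below are only examples; I tried many combinations:

# switch to byte-based thresholds (writing a *_bytes tunable zeroes its *_ratio counterpart)
sysctl -w vm.dirty_background_bytes=$((64*1024*1024))
sysctl -w vm.dirty_bytes=$((256*1024*1024))

# watch dirty and writeback pages while dd runs in another terminal
watch -n1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'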
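
If more latency data is useful, I can also sample the per-op NFS round-trip times with nfsiostat (from nfs-utils) while each dd runs; the "avg RTT (ms)" and "avg exe (ms)" columns for the WRITE op show the per-op latency:

# 5-second samples, 3 reports, for one mount point
nfsiostat 5 3 /mnt/test_fs_without_sync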
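
And if it helps to separate the sync mount option from the page cache question, I can rerun test 1 with synchronous writes requested from userspace instead of via the mount option:

# O_SYNC writes on the mount *without* the sync option
dd if=/dev/zero of=/mnt/test_fs_without_sync/test.out bs=1M count=5000 oflag=sync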