I read and am still digesting the kernel tuning parameters mentioned in
John's link. There's another useful link that expands on some of the same
points here:

The Linux Page Cache and pdflush: Theory of Operation and Tuning for
Write-Heavy Loads
<http://www.westnet.com/~gsmith/content/linux-pdflush.htm>

However, while I digest them, I have a few more observations.

It's not that the server is slow, it's the gluster native client that is,
so I'm not sure that increasing the performance of the server will help
much at this point.

To verify this, I wrote a tiny script (burp.pl) that just emits lots of
short strings to stdout, like the problem app that originated this
discussion (and a colleague did the same with a C++ app).

If I send stdout to my gluster fs via the native gluster client, I observe
a steady stream of data at about 14MB/s (this is on a DDR/IPoIB cluster):

$ time `./burp.pl 100 > /gl/hmangala/burp.out && sync`
real    0m29.646s
user    0m17.830s
sys     0m2.000s

In this case, burp.pl is only getting about 70% of a CPU and the gluster
process is getting ~40%.

Here's the ifstat output for the IB channel (~1 entry/s). Note the
continuous data-out rate of about 14MB/s (and the odd input rate of about
1MB/s):

        ib1
 KB/s in   KB/s out
    0.00       0.00
    0.00       0.00   < burp starts
  383.34    5200.51
 1039.43   14243.11
 1031.59   14132.11
 1037.36   14223.32
 1044.20   14304.81
 1040.40   14288.45
 1037.78   14217.64
 1042.19   14306.66
 1036.54   14200.05
 1062.26   14699.87
 1072.64   14711.29
 1072.87   14694.52
 1065.18   14608.67
 1074.23   14711.32
 1073.26   14711.43
 1069.79   14672.60
 1066.66   14608.58
 1067.68   14647.14
 1074.16   14711.48
 1069.16   14651.39
 1077.19   14767.32
 1075.74   14736.75
 1068.77   14634.86
 1066.81   14625.90
 1063.89   14586.81
 1064.79   14608.46
 1065.37   14583.04
 1065.10   14604.44
 1063.86   14591.14
  388.41    5323.84   < burp ends
    0.00       0.00
-------------------------------
30460.65  417607.51   totals (30MB input vs 417MB output)

Now the same run over the NFS-mounted channel (mount command:
mount -o mountproto=tcp,vers=3,noatime,auto -t nfs pbs1ib:/gli /mnt/glnfs):

$ time `./burp.pl 100 > /mnt/glnfs/hmangala/burp.out && sync`
real    0m24.704s   < a little faster
user    0m20.710s
sys     0m0.810s

In this case burp.pl gets 100% of a CPU; gluster isn't involved and so
doesn't register.

Here's the ifstat output for the IB channel. Note the complete lack of
input and no data output until the very end, when it bursts at ~140MB/s:

        ib1
 KB/s in   KB/s out
    0.00       0.00
    0.73       0.00   < burp starts
    0.18       0.00
    0.00       0.00
    1.33       1.88
    0.00       0.00
    0.00       0.00
    0.04       0.00
    0.04       0.00
    0.00       0.00
    0.00       0.00
    0.00       0.00
    0.00       0.00
    0.00       0.00
    0.00       0.00
    0.00       0.00
    0.00       0.00
    0.00       0.00
    0.04       1.29
    0.00       0.00
    0.00       0.00
    0.00       0.00
    0.00       0.00
  314.08   83002.70
  517.10   142239.6
  513.94   141469.4
  123.96   33219.89   < burp ends
    0.04       0.00
--------------------------------
 1471.44  399934.76   totals (1.47MB input vs 400MB output)

It's hard to argue with that. For a single process, NFS is clearly
superior / more efficient, and it may be more efficient overall for the
use cases on our clusters.

So why doesn't the gluster native client do client-side caching like NFS?
It looks like it's explicitly refusing to be cached by the usual (and
usually excellent) Linux mechanisms. What's the reason for declining this
OS advantage on the client side while providing such a technically sweet
solution on the server side? I'm at a loss to explain this behavior to
our technical group.
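PS: for anyone who wants to reproduce this, burp.pl amounts to little
more than the following. This is a minimal sketch rather than the exact
script; the meaning of the argument and the line length here are my
assumptions (roughly: an argument of 100 yields a few hundred MB of
short lines):

    #!/usr/bin/env perl
    # burp.pl - spew lots of short strings to stdout, like the problem app.
    # Minimal sketch only; argument meaning and line length are assumptions.
    use strict;
    use warnings;

    my $chunks = shift @ARGV || 100;          # assumed: 1 unit ~ 100k lines
    for my $i (1 .. $chunks * 100_000) {
        print "short burst of output, record $i, with a little padding\n";
    }

Run it as above (./burp.pl 100 > somefile && sync) while watching ifstat
on the IB interface to reproduce the two traces.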
--
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine [m/c 2225] / 92697
Google Voice Multiplexer: (949) 478-4487
415 South Circle View Dr, Irvine, CA, 92697 [shipping]
MSTB Lat/Long: (33.642025, -117.844414) (paste into Google Maps)