This is a continuation of my previous posts about improving write performance when trapping millions of small writes to a gluster filesystem. Last time, I was able to improve write performance by ~30x by running STDOUT through gzip to consolidate and reduce the output stream (a rough sketch of that trick, including the named-pipe variant, is below for the archives).

Today I have a similar problem, with yet another bioinformatics program. These programs typically handle the 'short reads' that come out of the majority of current sequencing hardware: each read is 30-150 characters plus some metadata, stored in an ASCII file containing millions of such entries. Reading these files doesn't seem to be a problem (at least on our systems), but writing them is quite awful. The program is called 'art_illumina', from the Broad Institute's 'ALLPATHS' suite, and it generates an artificial Illumina data set from an input genome; in this case, about 5GB of the type of data described above. Like before, the gluster client process goes to >100% CPU and the program itself slows to ~20-30% of a CPU.

In this case, the app's output cannot be externally trapped by redirecting through gzip, since the output flag specifies only the base filename for two files that are created internally and then written directly. That also prevents setting up a named pipe to trap and process the output.

Since this gluster storage was set up specifically for bioinformatics, this is a recurring problem. While some of these cases can be dealt with by trapping and converting the output, it would be VERY NICE if we could deal with it at the OS level.

The gluster volume runs over IPoIB on QDR IB and looks like this:

Volume Name: gl
Type: Distribute
Volume ID: 21f480f7-fc5a-4fd8-a084-3964634a9332
Status: Started
Number of Bricks: 8
Transport-type: tcp,rdma
Bricks:
Brick1: bs2:/raid1
Brick2: bs2:/raid2
Brick3: bs3:/raid1
Brick4: bs3:/raid2
Brick5: bs4:/raid1
Brick6: bs4:/raid2
Brick7: bs1:/raid1
Brick8: bs1:/raid2
Options Reconfigured:
performance.write-behind-window-size: 1024MB
performance.flush-behind: on
performance.cache-size: 268435456
nfs.disable: on
performance.io-cache: on
performance.quick-read: on
performance.io-thread-count: 64
auth.allow: 10.2.*.*,10.1.*.*

I've tried increasing every caching option that might improve this kind of write pattern, but none of it seems to help. At this point, I'm wondering whether changing the client (or server) kernel parameters will help.
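For the archives, the trick from the previous post looks roughly like this; 'some_app' and its flags are placeholders for whatever tool is doing the writing, not a real program:

# pipe stdout through gzip so millions of tiny writes become
# a smaller number of large sequential writes on the gluster mount
./some_app --in genome.fa | gzip -1 > /gl/results/reads.fq.gz

# variant for a tool that insists on opening one named output file:
# hand it a named pipe and drain the pipe with gzip in the background
mkfifo /tmp/reads.fifo
gzip -1 < /tmp/reads.fifo > /gl/results/reads.fq.gz &
./some_app --in genome.fa --out /tmp/reads.fifo
wait
rm /tmp/reads.fifo

Neither form works for art_illumina, since it opens two output files internally from a single filename prefix.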
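To make that question concrete, these are the kinds of client-side writeback knobs I was planning to experiment with; the values are guesses on my part, not recommendations from any gluster doc:

# candidate /etc/sysctl.conf entries (client side); values are guesses
vm.dirty_background_ratio = 10      # start background writeback later
vm.dirty_ratio = 40                 # allow more dirty pages before blocking writers
vm.dirty_expire_centisecs = 6000    # let dirty pages age longer before flushing
vm.dirty_writeback_centisecs = 1000

# apply without rebooting
sysctl -p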
The client's meminfo is:

$ cat /proc/meminfo
MemTotal: 529425924 kB
MemFree: 241833188 kB
Buffers: 355248 kB
Cached: 279699444 kB
SwapCached: 0 kB
Active: 2241580 kB
Inactive: 278287248 kB
Active(anon): 190988 kB
Inactive(anon): 287952 kB
Active(file): 2050592 kB
Inactive(file): 277999296 kB
Unevictable: 16856 kB
Mlocked: 16856 kB
SwapTotal: 563198732 kB
SwapFree: 563198732 kB
Dirty: 1656 kB
Writeback: 0 kB
AnonPages: 486876 kB
Mapped: 19808 kB
Shmem: 164 kB
Slab: 1475476 kB
SReclaimable: 1205944 kB
SUnreclaim: 269532 kB
KernelStack: 5928 kB
PageTables: 27312 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 827911692 kB
Committed_AS: 536852 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 1227732 kB
VmallocChunk: 33888774404 kB
HardwareCorrupted: 0 kB
AnonHugePages: 376832 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 201088 kB
DirectMap2M: 15509504 kB
DirectMap1G: 521142272 kB

and the server's meminfo is:

$ cat /proc/meminfo
MemTotal: 32861400 kB
MemFree: 1232172 kB
Buffers: 29116 kB
Cached: 30017272 kB
SwapCached: 44 kB
Active: 18840852 kB
Inactive: 11772428 kB
Active(anon): 492928 kB
Inactive(anon): 75264 kB
Active(file): 18347924 kB
Inactive(file): 11697164 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 16382900 kB
SwapFree: 16382680 kB
Dirty: 8 kB
Writeback: 0 kB
AnonPages: 566876 kB
Mapped: 14212 kB
Shmem: 1276 kB
Slab: 429164 kB
SReclaimable: 324752 kB
SUnreclaim: 104412 kB
KernelStack: 3528 kB
PageTables: 16956 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 32813600 kB
Committed_AS: 3053096 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 340196 kB
VmallocChunk: 34342345980 kB
HardwareCorrupted: 0 kB
AnonHugePages: 200704 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 6656 kB
DirectMap2M: 2072576 kB
DirectMap1G: 31457280 kB

Does this suggest any approach? Is there a doc that suggests optimal kernel parameters for gluster? I guess the only other option is to use the glusterfs as an NFS mount and rely on the NFS client's caching (see the PS below)? That would help a single process, but it would decrease the overall cluster bandwidth considerably.

--
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
[m/c 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
415 South Circle View Dr, Irvine, CA, 92697 [shipping]
MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)
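PS: if we do end up testing the NFS route, I believe it would look something like the following; gluster's built-in NFS server speaks NFSv3 over TCP, and nfs.disable would have to be turned off on the volume first. Untested here, and the transfer sizes are guesses:

# on a gluster node: re-enable the built-in NFS server for the volume
gluster volume set gl nfs.disable off

# on the client: mount over NFSv3/TCP with large transfer sizes so the
# kernel NFS client can coalesce the small writes before they hit the wire
mount -t nfs -o vers=3,proto=tcp,rsize=65536,wsize=65536 bs1:/gl /mnt/gl-nfs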