I'm testing GlusterFS viability for use with a typical PHP webapp (i.e., lots of small files). I don't care much about the C in the CAP theorem, as I have very few writes; I could live with a write propagation delay of 5 minutes (or caches that are stale for up to 5 minutes). So I'm optimizing for low-latency reads of small files.

My test setup is two-node replication, where each node is both server and Gluster client, and both are in sync. I stop glusterfs-server on node2. On node1, I run a simple benchmark: repeatedly (to prime the cache) open and close 1000 small files. I have enabled the client-side io-cache and quick-read translators (see below for config).

The result is consistently 2 ms per open (O_RDONLY) call, which is unfortunately too slow, as I need < 0.2 ms. The same test against a local Gluster server over an NFS mount gives somewhat better performance, but still 0.6 ms. The same test against a Linux kernel NFS server (v3) with a local mount gives 0.12 ms per open.

I can't explain the latency with Gluster, because I see no traffic being sent to node2. I would expect that with the io-cache translator and local-only operation, performance would approach that of the kernel FS cache. Is this assumption correct? If yes, how would I profile the client subsystem to find the bottleneck? If no, then I have to accept that 0.8 ms open calls are the best I can squeeze out of this system, and I'll probably look into AFS, userspace async replication, or a Gluster NFS mount with cachefilesd. Which would you recommend?

Thanks a lot! BTW, I like Gluster a lot, and hope that it is also suitable for this small-files use case ;)

//Willem

PS: I'm testing with kernel 3.5.0-17-generic (64-bit) and gluster 3.2.5-1ubuntu1.

Client volfile:

volume testvol-client-0
    type protocol/client
    option remote-host g1
    option remote-subvolume /data
    option transport-type tcp
end-volume

volume testvol-client-1
    type protocol/client
    option remote-host g2
    option remote-subvolume /data
    option transport-type tcp
end-volume

volume testvol-replicate-0
    type cluster/replicate
    subvolumes testvol-client-0 testvol-client-1
end-volume

volume testvol-write-behind
    type performance/write-behind
    option flush-behind on
    subvolumes testvol-replicate-0
end-volume

volume testvol-io-cache
    type performance/io-cache
    option max-file-size 256KB
    option cache-timeout 60
    option priority *.php:3,*:0
    option cache-size 256MB
    subvolumes testvol-write-behind
end-volume

volume testvol-quick-read
    type performance/quick-read
    option cache-size 256MB
    subvolumes testvol-io-cache
end-volume

volume testvol
    type debug/io-stats
    option latency-measurement off
    option count-fop-hits off
    subvolumes testvol-quick-read
end-volume
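For profiling, the most obvious knob I can see is the io-stats translator already at the top of the client graph above, which has per-fop latency tracking switched off. A minimal sketch of the change I'd try first (assuming these options behave on 3.2.5 the way the docs describe):

volume testvol
    type debug/io-stats
    # assumption: these two enable per-fop latency and call-count stats
    option latency-measurement on
    option count-fop-hits on
    subvolumes testvol-quick-read
end-volume

On the server side there is also "gluster volume profile testvol start" / "gluster volume profile testvol info", which I understand was added in 3.2. For reading the client-side counters, I believe sending SIGUSR1 to the glusterfs client process produces a statedump (under /tmp on this version, if I'm not mistaken) that includes the io-stats figures; corrections welcome.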
Server volfile:

volume testvol-posix
    type storage/posix
    option directory /data
end-volume

volume testvol-access-control
    type features/access-control
    subvolumes testvol-posix
end-volume

volume testvol-locks
    type features/locks
    subvolumes testvol-access-control
end-volume

volume testvol-io-threads
    type performance/io-threads
    subvolumes testvol-locks
end-volume

volume testvol-marker
    type features/marker
    option volume-uuid bc89684f-569c-48b0-bc67-09bfd30ba253
    option timestamp-file /etc/glusterd/vols/testvol/marker.tstamp
    option xtime off
    option quota off
    subvolumes testvol-io-threads
end-volume

volume /data
    type debug/io-stats
    option latency-measurement off
    option count-fop-hits off
    subvolumes testvol-marker
end-volume

volume testvol-server
    type protocol/server
    option transport-type tcp
    option auth.addr./data.allow *
    subvolumes /data
end-volume

My benchmark to simulate PHP webapp I/O:

#!/usr/bin/env python
# Simulates PHP webapp I/O: many open/read/stat/write/unlink calls
# on small files under a given base directory.
import sys
import os
import time
import optparse

def print_timing(func):
    # Decorator: report the total wall-clock time of one test run in ms.
    def wrapper(*arg):
        t1 = time.time()
        res = func(*arg)
        t2 = time.time()
        print '%-15.15s %6d ms' % (func.func_name, int((t2 - t1) * 1000.0))
        return res
    return wrapper

def parse_options():
    parser = optparse.OptionParser()
    parser.add_option("--path", '-p', default="/mnt/glusterfs",
                      help="Base directory for running tests (default: /mnt/glusterfs)")
    parser.add_option("--num", '-n', type="int", default=100,
                      help="Number of files per test (default: 100)")
    (options, args) = parser.parse_args()
    return options

class FSBench():
    def __init__(self, path="/tmp", num=100):
        self.path = path
        self.num = num

    @print_timing
    def test_open_read(self):
        for filename in self.get_files():
            f = open(filename)
            data = f.read()
            f.close()

    def get_files(self):
        # Yield the test file names: <path>/test_000 .. test_NNN
        for i in range(self.num):
            filename = self.path + "/test_%03d" % i
            yield filename

    @print_timing
    def test_stat(self):
        for filename in self.get_files():
            os.stat(filename)

    @print_timing
    def test_stat_nonexist(self):
        for filename in self.get_files():
            try:
                os.stat(filename + "blkdsflskdf")
            except OSError:
                pass

    @print_timing
    def test_write(self):
        for filename in self.get_files():
            f = open(filename, 'w')
            f.write('hi there\n')
            f.close()

    @print_timing
    def test_delete(self):
        for filename in self.get_files():
            os.unlink(filename)

if __name__ == '__main__':
    options = parse_options()
    bench = FSBench(path=options.path, num=options.num)
    bench.test_write()
    bench.test_open_read()
    bench.test_stat()
    bench.test_stat_nonexist()
    bench.test_delete()
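One more note on methodology: the print_timing decorator above only reports an aggregate per test, so "2 ms per open" is an average over all files. A small companion sketch (same assumptions as the script: files created by test_write under /mnt/glusterfs, 1000 of them) that times each open() individually, to show whether the cost is uniform or dominated by a few slow lookups:

#!/usr/bin/env python
# Hypothetical companion to the benchmark above: per-call open() latency.
import time

PATH = "/mnt/glusterfs"   # assumed mount point, same default as the script
NUM = 1000                # matches the 1000-file runs described above

samples = []
for i in range(NUM):
    filename = "%s/test_%03d" % (PATH, i)
    t1 = time.time()
    f = open(filename)    # the O_RDONLY open being measured
    t2 = time.time()
    f.close()
    samples.append((t2 - t1) * 1000.0)   # milliseconds

samples.sort()
print 'open() latency over %d files (ms):' % NUM
print 'min %.3f  median %.3f  avg %.3f  max %.3f' % (
    samples[0], samples[NUM // 2],
    sum(samples) / len(samples), samples[-1])

As with the main benchmark, run it a couple of times first so io-cache/quick-read are primed.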