On Thursday, 28 August 2008 12:39:03 you wrote:
> On Thu, Aug 28, 2008 at 3:01 PM, Łukasz Mierzwa <l.mierzwa@xxxxxxxxx> wrote:
> > On Thursday, 28 August 2008 07:06:30 Krishna Srinivas wrote:
> >> On Wed, Aug 27, 2008 at 10:55 PM, Łukasz Mierzwa <l.mierzwa@xxxxxxxxx>
> >> wrote:
> >> > On Tuesday, 26 August 2008 16:28:41 Łukasz Mierzwa wrote:
> >> >> Hi,
> >> >>
> >> >> I'm testing GlusterFS for small-file storage. First I set up a
> >> >> single-disk Gluster server, connected to it from another machine
> >> >> and served those files with nginx. That worked fine and I got good
> >> >> performance, on average about +20ms extra per request, which is
> >> >> acceptable. Now I've set up unify over AFR (2 AFR groups with 3
> >> >> servers each, unify and AFR on the client side, a namespace
> >> >> directory on every server, AFR'ed on the client side like the
> >> >> rest), mounted on one of those 6 servers. After writing ~200GB of
> >> >> files from the production server I started some tests and noticed
> >> >> that a simple ls on that mount point causes as many writes as
> >> >> reads. This has to be related to either unify or AFR; I suspect the
> >> >> writes are due to the namespace, but I need to do more debugging.
> >> >> It's very annoying that simple reads cause so many writes. All my
> >> >> servers are in sync, so there should be no need for self-healing.
> >> >> Before I start debugging it I wanted to ask whether this is normal.
> >> >> Should AFR or unify generate so many writes to the namespace, or
> >> >> maybe to xattrs, during reads (storage is on ext3 with user_xattr
> >> >> on)?
> >> >
> >> > I tested it a little bit today and found out that with 1 or 2 nodes
> >> > in my AFR group for the namespace there are no writes at all while
> >> > doing ls; if I add one or more nodes, the writes start. WTF?
> >>
> >> Do you mean that your NS is getting write() calls when you do "ls"?
> >
> > It seems so. I will split my NS and DATA bricks onto different disks
> > today so I will be 100% sure. What I am sure of now is that I am
> > getting as many writes as reads when I do "ls" and have more than 2 NS
> > bricks in AFR.
>
> Reads/writes should not happen when you do an 'ls'. Where are you seeing
> reads and writes being done? How are you seeing it? Are you strace'ing
> the glusterfsd?
>
> Krishna

I first noticed them when I looked at the rrd graphs for those machines; I
wanted to see whether AFR is balancing reads. I can see them in the rrd
graphs generated from collectd, and in dstat, iotop and iostat output, so
they really are happening. I first tried to find something in my config and
forgot about the obvious step of strace'ing glusterfs. I'm attaching a log
from one of the servers (I straced the Gluster server process on this
machine); you can see a lot of mkdir/chown/chmod calls on files that are
already there, and all bricks were online when I was writing files through
the Gluster client, so no self-heal should be needed. I've also attached the
client and server configs.

--
Best regards,
Łukasz Mierzwa
Network Administrator, Grono.net
http://grono.net
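A minimal sketch of the kind of strace run that could produce such a
server-side log (process selection, syscall filter and output path are
assumptions, not the exact command used):

  # attach to the running glusterfsd brick process, follow its threads, and
  # log file-modifying syscalls while "ls" is run on the client mount
  strace -f -tt -e trace=mkdir,chown,chmod,write,setxattr \
         -p "$(pidof glusterfsd)" -o /tmp/glusterfsd-ls.strace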
volume brick_40
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.40
  option remote-subvolume brick-writebehind
end-volume

volume ns_40
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.40
  option remote-subvolume brick-ns
end-volume

volume brick_41
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.41
  option remote-subvolume brick-writebehind
end-volume

volume ns_41
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.41
  option remote-subvolume brick-ns
end-volume

volume brick_42
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.42
  option remote-subvolume brick-writebehind
end-volume

volume ns_42
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.42
  option remote-subvolume brick-ns
end-volume

volume brick_43
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.43
  option remote-subvolume brick-writebehind
end-volume

volume ns_43
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.43
  option remote-subvolume brick-ns
end-volume

volume brick_44
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.44
  option remote-subvolume brick-writebehind
end-volume

volume ns_44
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.44
  option remote-subvolume brick-ns
end-volume

volume brick_45
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.45
  option remote-subvolume brick-writebehind
end-volume

volume ns_45
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.45
  option remote-subvolume brick-ns
end-volume

volume afr_1
  type cluster/afr
  option self-heal on
  subvolumes brick_40 brick_41 brick_42
end-volume

volume afr_2
  type cluster/afr
  option self-heal on
  subvolumes brick_43 brick_44 brick_45
end-volume

volume afr_ns
  type cluster/afr
  subvolumes ns_40 ns_41 ns_42 ns_43 ns_44 ns_45
end-volume

volume unify
  type cluster/unify
  option namespace afr_ns
  option scheduler alu
  option alu.limits.min-free-disk 5%
  option alu.order disk-usage:read-usage:write-usage:open-files-usage:disk-speed-usage
  option alu.disk-usage.entry-threshold 4GB
  option alu.disk-usage.exit-threshold 500MB
  option alu.read-usage.entry-threshold 20%
  option alu.read-usage.exit-threshold 5%
  option alu.write-usage.entry-threshold 20%
  option alu.write-usage.exit-threshold 5%
  option alu.stat-refresh.interval 30sec
  option alu.stat-refresh.num-file-create 200
  subvolumes afr_1 afr_2
end-volume

volume iothreads
  type performance/io-threads
  option thread-count 2   # default is 1
  option cache-size 64MB  # 64MB
  subvolumes unify
end-volume

volume readahead
  type performance/read-ahead
  option page-size 128kB
  option page-count 4
  subvolumes iothreads
end-volume

volume writebehind
  type performance/write-behind
  option aggregate-size 1MB
  option flush-behind on
  subvolumes readahead
end-volume

volume io-cache
  type performance/io-cache
  option cache-size 256MB
  option force-revalidate-timeout 600
  subvolumes writebehind
end-volume
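For completeness, a client volfile like the one above is handed to the
glusterfs client binary at mount time; the volfile path and mount point
below are assumptions, not taken from the original setup:

  # mount the unify-over-AFR volume defined by the client volfile above
  glusterfs -f /etc/glusterfs/glusterfs-client.vol /mnt/gluster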
volume brick-data
  type storage/posix
  option directory /home/gluster/data
end-volume

volume brick-ns
  type storage/posix
  option directory /home/gluster/ns
end-volume

volume brick-locks
  type features/posix-locks
  subvolumes brick-data
end-volume

volume brick-iothreads
  type performance/io-threads
  option thread-count 4
  option cache-size 64MB
  subvolumes brick-locks
end-volume

volume brick-readahead
  type performance/read-ahead
  subvolumes brick-iothreads
end-volume

volume brick-writebehind
  type performance/write-behind
  option aggregate-size 1MB
  option flush-behind on
  subvolumes brick-readahead
end-volume

volume server
  type protocol/server
  subvolumes brick-ns brick-writebehind
  option transport-type tcp/server
  option auth.ip.brick-ns.allow *
  option auth.ip.brick-writebehind.allow *
end-volume
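The matching server volfile above is loaded by glusterfsd on each of the six
bricks; the volfile path below is an assumption:

  # start the brick/namespace server with the server volfile above
  glusterfsd -f /etc/glusterfs/glusterfs-server.vol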