Hi, Mr. Freedman.
Thanks for replying.

>At 09:26 PM 12/15/2008, Keisuke TAKAHASHI wrote:
>>Hi.
>>I'm using GlusterFS v1.3.12 (glusterfs-1.3.12.tar.gz) via FUSE
>>(fuse-2.7.3glfs10.tar.gz) on CentOS 5.2 x86_64 (Linux kernel
>>2.6.18-92.el5) now.
>>The nodes are HP Proliant DL360 G5 (as GlusterFS Client) and DL180
>>G5 (as GlusterFS Servers).
>>And the connections are all TCP/IP on Gigabit ethernet.
>>
>>Then, I tested self-heal and I found a technical problem about
>>"replace" -- self-heal after a node's fault and others'
>>file-contents decreasing leaves garbage.
>>I would like you to show me ideas to resolve or avoid it.
>>
>>First, my GlusterFS's construction is following:
>> - 1 GlusterFS Client (client) and 3 GlusterFS Servers
>>   (server1,server2,server3)
>> - using cluster/unify to add GlusterFS Servers
>> - using cluster/afr between 3 GlusterFS Servers underneath the
>>   cluster/unify
>> - namespace volume is on the GlusterFS Client
>>
>>So, self-heal will behave between server1, server2 and server3.
>>
>>Now, my self-healing procedure of fault scenario is following:
>> (1) Each node is active and mount point on client is
>>     /mnt/glusterfs. The operating user on client is root.
>> (2) Root creates fileA and fileBC on the client local directory
>>     (not on the mount point of FUSE)
>>     - fileA contains strings "aaa"
>>     - fileBC contains strings "bbb\nccc" (\n is line break.)
>> (3) Root copies fileBC on /mnt/glusterfs.
>> (4) Make server2 down. (# ifdown eth0)
>> (5) Root redirects fileA into fileBC (# cat fileA > fileBC)
>> (6) Make server2 up. (# ifup eth0)
>> (7) Now, the status of fileBC on servers is below:
>>     - server1: fileBC contains "aaa", trusted.glusterfs.version is 3
>>     - server2: fileBC contains "bbb\nccc", trusted.glusterfs.version is 2
>>     - server3: fileBC contains "aaa", trusted.glusterfs.version is 3
>> (8) Execute self-heal. (# find /mnt/glusterfs -type f -print0 |
>>     xargs -0 head -c1 >/dev/null)
>
>on which server did you run this.
>it seems to matter for some reason
>from what I can tell. if it's run from the server that has the new
>version all's well but otherwise, sometimes afr doesn't work (although
>this is likely fixed in the newer versions, I haven't specifically tested)
>

I ran it on the client.
So in (9), fileBC on server2 was self-healed.

>> (9) Then, the status of fileBC on servers is below:
>>     - server1: fileBC contains "aaa", trusted.glusterfs.version is 3
>>     - server2: fileBC contains "aaa\nccc", trusted.glusterfs.version is 3
>>     - server3: fileBC contains "aaa", trusted.glusterfs.version is 3
>>
>>All right, fileBC on server2 was overwritten by others, but the
>>result of "replace" seems in bit sequence (because original fileBC's
>>"bbb" was replaced by "aaa" but "\nccc" was left).
>>In this case, the part of contents "\nccc" in fileBC on server2 looks
>>garbage.
>>I would like self-heal to replace old file(s) with new file(s) completely.
>
>you actually wouldn't want this.. Imagine if the file were a 30GB
>log file and all you really care about are the new bits. what's
>better is if it does an rsync like update of the file which it seems
>to be doing but then forgetting to mark the end of file position.
>

I understand that.
However, the intended data types, sizes, and usage on my GlusterFS setup
are not settled yet, so I have to account for cases like this one, where
the newer file is shorter than the stale copy.

>>Can self-heal do it? Or is there any good idea to resolve it?
>
>I'd run your test with 1.4rc2 and see if you have the same problem.
>

Thanks a lot. I will try that as well.

Regards,

Keisuke Takahashi

_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
Keisuke TAKAHASHI / NTTPC Communications,Inc.
E-Mail: keith at NOSPAM.nttpc.co.jp
http://www.nttpc.co.jp/english/index.html
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
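
P.S. To make the symptom in (9) concrete, here is a minimal local sketch in Python. It assumes, following the guess in the quoted reply and not anything confirmed from the GlusterFS source, that self-heal writes the newer content over the stale copy from offset 0 and never truncates the file. The helper names heal_in_place and make_stale_copy are hypothetical. This is not GlusterFS code; it only reproduces the byte pattern on a local file.

```python
import os
import tempfile


def make_stale_copy(directory: str) -> str:
    """Create the stale fileBC as it looked on server2 in step (7)."""
    path = os.path.join(directory, "fileBC")
    with open(path, "wb") as f:
        f.write(b"bbb\nccc")
    return path


def heal_in_place(path: str, fresh: bytes, truncate: bool) -> bytes:
    """Overwrite the file at `path` with `fresh` from offset 0.

    Assumption for illustration: the file is opened WITHOUT O_TRUNC,
    so old bytes beyond len(fresh) survive unless we truncate explicitly.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, fresh)
        if truncate:
            # Mark the new end of file at the length of the fresh content.
            os.ftruncate(fd, len(fresh))
    finally:
        os.close(fd)
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as d:
    # Case 1: overwrite only, as self-heal appears to do in step (9).
    without_truncate = heal_in_place(make_stale_copy(d), b"aaa", truncate=False)
    # Case 2: overwrite and then set the end of file.
    with_truncate = heal_in_place(make_stale_copy(d), b"aaa", truncate=True)

print(without_truncate)  # b'aaa\nccc'
print(with_truncate)     # b'aaa'
```

The first case matches what I saw on server2 ("aaa\nccc"); the second is the result I expected. If the assumption holds, an in-place update would still work for large files as long as the healed file is cut to the source file's length afterwards.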