Hi Krishna/Kevan,

I tested with glusterfs--mainline--2.5--patch-515 and self-heal is working
fine without any hiccups.

regards,

On 10/6/07, Krishna Srinivas <krishna@xxxxxxxxxxxxx> wrote:
>
> Hi Kevan,
>
> This particular case is failing with older kernel+fuse versions, which
> send mknod+open for the create call. I will fix the issue and let you know.
>
> Thanks
> Krishna
>
> On 10/4/07, Kevan Benson <kbenson@xxxxxxxxxxxxxxx> wrote:
> > Krishna Srinivas wrote:
> > > On 10/4/07, Kevan Benson <kbenson@xxxxxxxxxxxxxxx> wrote:
> > >
> > >> Is self-heal supposed to work with partial files? I have an issue where
> > >> self-heal isn't happening on some servers with AFR and unify in an HA
> > >> setup I developed: two servers, two clients, all AFR and unify done on
> > >> the client side.
> > >>
> > >> If I kill a connection while a large file is being written, the
> > >> glusterfs mount waits the appropriate timeout period (10 seconds in my
> > >> case) and then finishes writing the file to the still-active server.
> > >> This results in a full file on one server and a partial file on the
> > >> other (the one I stopped traffic to temporarily to simulate a
> > >> crash/network problem). If I then re-enable the disabled server and read
> > >> data from the problematic file, it doesn't self-heal and copy the full
> > >> file to the server with the partial file.
> > >>
> > >> Anything written entirely while a server is offline (i.e. the offline
> > >> server has no knowledge of it) is correctly created on read from the
> > >> file, so the problem seems to be related to files that are partially
> > >> written to one server.
> > >>
> > >> Can someone comment on the particular conditions that trigger a
> > >> self-heal? Is there something I can do to force a self-heal at this
> > >> point? (I repeat that reading data from the file does not work.) I know
> > >> I can use rsync and some foo to fix this, but that becomes less and
> > >> less feasible as the mount size grows and the time for rsync to compare
> > >> sides lengthens.
> > >>
> > >
> > > Hi Kevan,
> > >
> > > It should have worked fine in your case. What version of glusterfs are
> > > you using? Just before you do the second read (or open, rather) which
> > > should have triggered self-heal, can you run
> > > getfattr -n trusted.afr.version <file>
> > > on the partial file and also on the full file in the backend and give
> > > the output?
> > >
> > > Thanks
> > > Krishna
> > >
> >
> > Glusterfs TLA 504, fuse-2.7.0-gfs4.
> >
> > The trusted.afr.version attribute doesn't exist on the partial file; it
> > does exist on the complete file (with value "1"). From what I just
> > tested, it doesn't look like it's set until the file operation is
> > complete (it doesn't exist during writing). Are files without this
> > attribute assumed to have a value of "0" or something, to ensure that
> > they participate in self-heal correctly?
> >
> > It doesn't look like it: if I append data to the file, the partial copy
> > gets assigned trusted.afr.version=1, while the complete file's
> > trusted.afr.version is incremented to 2. Self-heal now works for that
> > file, and on a read of the file's data the partial file is updated with
> > all data and its trusted.afr.version is set to 2.
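
For anyone who ends up in the same partial-file state before moving to
patch-515, here is a minimal sketch of how the trusted.afr.version
observation above could be used to nudge self-heal by hand. The backend
export paths and the mount point below are placeholders, and manually
setting AFR xattrs with setfattr is only an assumption drawn from Kevan's
append test (differing versions on the two copies make the higher-versioned
copy the heal source), not a documented procedure - treat it as a
workaround sketch, and test on a scratch file first.

# On each server, check the AFR version of the file in the backend
# export directory (paths are placeholders for your bricks):
getfattr -n trusted.afr.version /data/export/some/file

# If the partial copy has no trusted.afr.version at all, give it one and
# make the complete copy strictly newer, so the next open from a client
# should pick the higher-versioned copy as the source of the heal
# (assumption based on the behaviour described above):
setfattr -n trusted.afr.version -v 1 /data/export/some/file   # on the server holding the partial file
setfattr -n trusted.afr.version -v 2 /data/export/some/file   # on the server holding the complete file

# Then read the file through the glusterfs mount to trigger self-heal:
cat /mnt/glusterfs/some/file > /dev/null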
--
Raghavendra G

A centipede was happy, until a toad in fun,
Said, "Prey, which leg comes after which?",
This raised his doubts to such a pitch,
He fell flat into the ditch,
Unable to know how to run.
 -Anonymous