The HEAL tool will monitor glusterfs in the same way AFR currently does.
The only difference is that HEAL is a separate process. HEAL will contain
all the functionality of self-heal (inside AFR as it exists today).

On Mon, Jan 5, 2009 at 11:25 PM, Gordan Bobic <gordan at bobich.net> wrote:

> Maybe I'm missing something here, but if you take self-healing out of AFR,
> then surely that makes the system completely useless and no better than
> running rsync every 5 minutes. Since that can't be right, what am I missing?
>
> Gordan
>
>
> Anand Babu Periasamy wrote:
>
>> Christopher, main issue with self-heal is its complexity. Handling
>> self-healing logic in a non-blocking asynchronous code path is difficult.
>> Replicating a missing file sounds simple, but holding off a lookup call
>> and initiating a new series of calls to heal the file and then resuming
>> normal operation is tricky. Many of the bugs we faced in 1.3 are
>> related to self-heal. We have handled most of these cases over a period
>> of time. Self-healing is decent now, but not good enough. We feel that
>> it has only complicated the code base. It is hard to test and maintain
>> this part of the code base.
>>
>> Plan is to drop self-heal code altogether once the active healing tool
>> gets ready. Unlike self-healing, this active healing can be run by the
>> user on a mounted file system (online) any time. By moving the code out
>> of the file system, into a tool (that is synchronous and linear), we can
>> implement sophisticated healing techniques.
>>
>> Code is not in the repository yet. Hopefully in a month, it will be ready
>> for use. You can simply turn off self-heal and run this utility while the
>> file system is mounted.
>>
>> List-hacking is an internal list, mostly junk :). It is an internal
>> company list. We don't discuss technical / architectural stuff there.
>> They are mostly done over phone and in-person meetings.
>> We do want to actively involve the community right from the design
>> phase. Mailing list is cumbersome and slow to interactively brainstorm
>> design discussions. We can once in a while organize IRC sessions for
>> this purpose.
>>
>> --
>> Anand Babu
>>
>> Swank iest wrote:
>>
>>> Well,
>>>
>>> I guess this is getting outside of the bug. I suppose you are going to
>>> mark it as not going to fix?
>>>
>>> I'm trying to put gluster into production right now, so may I ask:
>>>
>>> 1) What are the current issues with self-heal that require a full
>>> re-write? Is there a place in the Wiki or elsewhere where it's being
>>> documented?
>>> 2) May I see the new code? I must not be looking in the correct place
>>> in TLA?
>>> 3) If it's not written yet, may I be included in the design discussion?
>>> (As I haven't put gluster into production yet, now would be a good
>>> time to know if it's not going to work in the near future.)
>>> 4) May I be placed on the list-hacking at zresearch.com mailing list,
>>> please?
>>>
>>> Christopher.
>>>
>>> > Date: Mon, 5 Jan 2009 01:36:14 -0800
>>> > From: ab at zresearch.com
>>> > To: krishna at zresearch.com
>>> > CC: swankier at msn.com; list-hacking at zresearch.com
>>> > Subject: Re: [List-hacking] [bug #25207] an rm of a file should not
>>> > cause that file to be replicated with afr self-heal.
>>> >
>>> > Krishna, leave it as is. Once self-heal ensures that the volumes are
>>> > intact, rm will remove both the copies anyways. It is inefficient,
>>> > but optimizing it in the current framework will be hacky.
>>> >
>>> > Swaniker, We are replacing the current self-healing framework with an
>>> > active healing tool. We can take care of it then.
>>> >
>>> >
>>> > Krishna Srinivas wrote:
>>> >> The current selfheal logic is built into the lookup of a file; lookup
>>> >> is issued just before any file operation on a file. So the lookup
>>> >> call does not know whether an open or rm is going to be done on the
>>> >> file.
>>> >> Will get back to you if we can do anything about this, i.e. to
>>> >> avoid the redundant copy of the file when it is going to be rm'ed.
>>> >>
>>> >> Krishna
>>> >>
>>> >> On Mon, Jan 5, 2009 at 12:19 PM, swankier <INVALID.NOREPLY at gnu.org> wrote:
>>> >>> Follow-up Comment #2, bug #25207 (project gluster):
>>> >>>
>>> >>> I am:
>>> >>>
>>> >>> 1) delete file from posix system beneath afr on one side
>>> >>> 2) run rm on gluster file system
>>> >>>
>>> >>> file is then replicated followed by deletion
>>> >>
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at nongnu.org
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>

--
gowda
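
For reference, the behaviour reported in bug #25207 can be sketched with a
toy model. This is only an illustration under stated assumptions, not
GlusterFS code: two Python dicts stand in for the two AFR backends, and the
names ToyMirror, lookup, unlink and copies_made are all hypothetical. It
shows why a heal triggered from lookup copies the file back just before rm
deletes it, since lookup cannot know which operation follows.

```python
# Toy model only: two dicts stand in for the two AFR backends.
# ToyMirror, lookup, unlink and copies_made are hypothetical names,
# not GlusterFS identifiers.

class ToyMirror:
    def __init__(self, left, right):
        self.replicas = [left, right]
        self.copies_made = 0  # counts heal copies, to show the wasted work

    def lookup(self, name):
        """Heal-on-lookup: copy the file to any replica that lacks it."""
        source = next((r for r in self.replicas if name in r), None)
        if source is None:
            return False
        for replica in self.replicas:
            if name not in replica:
                replica[name] = source[name]
                self.copies_made += 1
        return True

    def unlink(self, name):
        """rm: lookup runs first and cannot know an unlink follows."""
        if not self.lookup(name):
            raise FileNotFoundError(name)
        for replica in self.replicas:
            del replica[name]


left = {"a.txt": b"data"}
right = {"a.txt": b"data"}
del right["a.txt"]            # step 1: delete on one backend directly
mirror = ToyMirror(left, right)
mirror.unlink("a.txt")        # step 2: rm through the mirrored view
print(mirror.copies_made, left, right)
```

In this model one heal copy is made and then immediately discarded, which
matches the "replicated followed by deletion" sequence in the bug report.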