--- Krishna Srinivas <krishna@xxxxxxxxxxxxx> wrote:
> Since selfheal is done only on demand this issue is
> seen. However you can get around this problem if
> after bringing up the downed server you do
> a "find . > /dev/null" from the root directory
> (which would call lookup() on every directory,
> lookup() code has the code to fix the issue you are
> seeing)

Krishna,

Thanks, that confirms what I have seen. The find is a more
complete solution than just ls, but it is a client-side solution
that requires the client to know that a server has gone down
and come back up. How would a client know this?

I guess a server could be scripted to mount itself as a client
when it comes up and automatically run a find on its client view
to sync up. This would reduce the amount of time that two servers
are out of sync, but it still would not ensure consistency.
Inconsistencies can still occur if server A goes down, server B
gets written to, and then B goes down. When A comes up it will
not have B's latest changes if B is still down.

Is preventing this in the works? If so, I am curious what type
of mechanism will be used. Is there a feature "term" used to
describe this? It does not seem to be implied by the term
'self-healing', which seems slated for 1.4. Would this be a more
advanced feature? What would it be called, and when is it
planned for?

In the meantime, would it be possible to script something that
would ensure this? Is there a hook somewhere to find out when
data hasn't been written to all the nodes because one of them is
down? If so, it seems it would be possible to script things so
that such a node (one without the latest data) would not join a
cluster unless other nodes which have the latest data are also
alive.

-Martin
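
For anyone who wants to script the find workaround on a freshly restarted server, here is a minimal sketch in Python. It walks a directory tree and calls lstat on every entry, which is roughly what "find . > /dev/null" does. It assumes that an lstat on the client mount reaches lookup() the same way Krishna describes for find; I have not verified that. The function name touch_tree and the mount point in the usage note are hypothetical, not part of GlusterFS.

```python
import os


def touch_tree(root):
    """Walk `root` and lstat every directory and file under it.

    Returns (visited, errors): the number of entries that were
    stat'ed successfully and a list of OSError objects for entries
    or directories that could not be read.
    """
    visited = 0
    errors = []
    # onerror collects directories that os.walk itself fails to list.
    for dirpath, dirnames, filenames in os.walk(root, onerror=errors.append):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                # Assumption: on the client mount this stat is what
                # ends up calling lookup() and triggering the heal.
                os.lstat(path)
                visited += 1
            except OSError as exc:
                errors.append(exc)
    return visited, errors
```

A server's start-up script could mount the volume as a client and then call touch_tree("/mnt/glusterfs") (a made-up mount point). As noted above, this only narrows the window in which two servers are out of sync; it does not cover the case where the server holding the latest data is itself down.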