On Tue, 16 Nov 2010 16:54:07 -0800
Craig Carl <craig at gluster.com> wrote:

>
> Stephan -
> Based on your feedback, and from other members of the community we have
> opened discussions internally around adding support for a 32-bit client.
> We have not made a decision at this point, and I can't make any
> guarantees but I will do my best to get it added to the next version of
> the product (3.1.2, (3.1.1 is feature locked)).
> On the sync question you brought up that is only an issue in the rare
> case of split brain (if I understand the scenario you've brought up).
> Split brain is a difficult problem with no answer right now. Gluster 3.1
> added much more aggressive locking to reduce the possibility of split
> brain. The process you described as "...the deamons are talking with
> each other about whatever..." will also reduce the likelihood of split
> brain by eliminating the possibility that client or server vol files are
> not the same across the entire cluster, the cause of a vast majority of
> split brain issues with Gluster.
> Auto heal is slow, we have some processes along the lines you are
> thinking, please let me know if these address some of your ideas around
> stat -
>
> #cd <gluster mount>
> #find ./ -type f -exec stat /<backend device>'{}' \; this will heal only
> the files on that device.
>
> If you know when you had a failure you want to recover from this is even
> faster -
>
> #cd <gluster mount>
> #find ./ -type f -mmin <minutes since failure+ some extra> -exec stat
> /<backend device>'{}' \; this will heal only the files on that device
> changed x or more minutes ago.
>
>
> Thanks,
>
> Craig

Hello Craig,

let me repeat a very old suggestion (in fact, I believe it predates your
time at Gluster). I suggested creating a server-side module that does
only one thing: maintain a special file that lists the files currently
out of sync. A filename (with path) is added to this file when the
server sets ACLs meaning the file is currently not in sync.
When ACLs are set on the file that mean it is in sync, the filename is
removed from the list again. Let's say this special file is named
"/.glusterfs-<server-ip>" (in the root of the mounted glusterfs).

That would allow you to look at _all_ files on _all_ servers that are
not in sync, from the client's view. All you have to do for healing is
stat only the files in these lists, and you are done. You could simply
drop auto-healing, because a cron job would do just as well: since no
"find" is involved, the whole method uses virtually no resources on the
servers and clients. You have full control, and you know which files on
which servers are out of sync. This solves all possible questions around
replication.

Regards,
Stephan
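
A minimal sketch of the cron-driven heal proposed above. It assumes the
"/.glusterfs-<server-ip>" list files exist as described (they are only a
proposal here, not an existing Gluster feature), that each holds one path
per line, and that paths are relative to the mount root. The function
name heal_from_lists is made up for illustration.

```shell
#!/bin/sh
# Sketch only: the ".glusterfs-<server-ip>" list files are the proposal
# from this mail, not something Gluster provides.

# heal_from_lists MOUNT
# Stat every path named in MOUNT/.glusterfs-* (one path per line,
# relative to the mount root) and print each path that could be stat'ed.
heal_from_lists() {
    mount=$1
    for list in "$mount"/.glusterfs-*; do
        [ -f "$list" ] || continue          # glob matched nothing
        while IFS= read -r path; do
            [ -n "$path" ] || continue      # skip empty lines
            # The stat through the client mount is what would trigger
            # the heal; its output is not needed.
            if stat "$mount/$path" >/dev/null 2>&1; then
                printf '%s\n' "$path"
            fi
        done < "$list"
    done
}
```

A cron job would then only need to call something like
heal_from_lists /mnt/gluster (a placeholder mount point) at whatever
interval is acceptable. The cost scales with the number of out-of-sync
files, not with the size of the volume.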