On 11/16/2010 07:54 PM, Craig Carl wrote:
> On 11/16/2010 03:07 PM, Stephan von Krawczynski wrote:
>> which files are not in sync in a replication setup? There is no
>> trivial answer to this question I already brought up in early 2.X
>> development phase...
>> How can you sell someone a storage platform if you're unable to
>> answer such an essential question? Really, nobody needed
>> auto-healing. All you need is the answer to this question and then
>> stat exactly this file list at a time _of your choice_.
>
> On the sync question you brought up that is only an issue in the rare
> case of split brain (if I understand the scenario you've brought up).
> Split brain is a difficult problem with no answer right now. Gluster
> 3.1 added much more aggressive locking to reduce the possibility of
> split brain. The process you described as "...the daemons are talking
> with each other about whatever..." will also reduce the likelihood of
> split brain by eliminating the possibility that client or server vol
> files are not the same across the entire cluster, the cause of a vast
> majority of split brain issues with Gluster.
> Auto heal is slow, we have some processes along the lines you are
> thinking, please let me know if these address some of your ideas
> around stat -
>
> #cd <gluster mount>
> #find ./ -type f -exec stat /<backend device>'{}' \;
>
> this will heal only the files on that device.
>
> If you know when you had a failure you want to recover from this is
> even faster -
>
> #cd <gluster mount>
> #find ./ -type f -mmin <minutes since failure + some extra> -exec stat /<backend device>'{}' \;
>
> this will heal only the files on that device changed x or more
> minutes ago.

See also http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=2088
which is an enhancement request addressing exactly this issue.
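
For anyone wanting to script the time-limited variant quoted above, here is a rough sketch and nothing more. The function name stat_recent and its two arguments are made up for illustration and are not part of Gluster. It assumes a find that supports -mmin (GNU and BSD find do; it is not in POSIX). One detail worth knowing: with -mmin, a value written as -N matches files modified less than N minutes ago, +N matches files modified more than N minutes ago, and a bare N matches only files modified N minutes ago. So to catch everything changed since a failure you would most likely want the -N form.

```shell
#!/bin/sh
# Sketch only. stat_recent is a hypothetical helper, not a Gluster command.
# Usage: stat_recent <directory> <minutes>
# Stats every regular file under <directory> that was modified less than
# <minutes> minutes ago and prints the path of each file it touched.
stat_recent() {
    dir=$1
    minutes=$2
    # "-mmin -N" selects files modified less than N minutes ago.
    # The stat output itself is discarded; only the path is printed.
    find "$dir" -type f -mmin "-$minutes" \
        -exec sh -c 'stat "$1" >/dev/null && printf "%s\n" "$1"' sh {} \;
}
```

Called as "stat_recent <some directory> 90" it would stat only the files modified in the last 90 minutes under that directory. Which directory to walk and which path to stat in a real replica setup is exactly what the quoted commands describe, so substitute accordingly.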