On Mon, 14 Sep 2009 23:46:16 -0400
Mark Mielke <mark@xxxxxxxxxxxxxx> wrote:

> On 09/14/2009 07:28 PM, Stephan von Krawczynski wrote:
> > I have problems in understanding exactly the heart of your question. Since the
> > main reason for rsyncing the data was to take a backup of the primary server,
> > so that self heal would have less to do it is obvious (to me) that it has been
> > a subvolume of the replicate. In fact it was a backup of the _only_ subvolume
> > (remember we configured a replicate with two servers, where one of them was
> > actually not there until we offline fed it with the active servers' data and
> > then tried to switch it online in glusterfs).
> >
>
> A potentially valid question here - if the backend storage was a
> database as other solutions use, would you expect this to work?

No, of course not.

> To some degree, rsync from backup is opening up the black box and
> shoving stuff in that you think, in theory, should work.

No, not really. In fact, every other comment about glusterfs(d) reads like
"this is a standard application as far as the fs is concerned, therefore it
cannot be responsible for problem A or bug B". Now, if it is to be judged as
one of many applications, then it should also be able to cope with the
situations every standard application can cope with, namely other
applications using the same fs.

_The_ advantage of the whole glusterfs concept is exactly that it is _not_ a
fs with its own special disk layout. It runs (or should run) on top of an
existing fs that can be used just like any fs may be used, including backup
(with rsync or whatever), restore and file operations of any kind. If
subvolumes were indeed closed storage, they would be no different from nbd,
enbd, whatever-nbd. For various reasons we don't want these solutions.

So, if you did not back up with rsync -X to preserve xattrs, you will lose
some consistency parameters. But you should not lose your ability to restore
the data as a whole.
At least these are my expectations in straightforward use of glusterfs.

> I don't think this is really the definition of self-heal. I think of
> self-heal as repairing damage. Really, you are sending it all new data
> (extracting from a lossy backup copy) that happens to indirectly inherit
> from previous data, happens to use the same path names, and asking it to
> reconcile the differences. What is being saved here?

Time and network bandwidth.

> Even in a self-heal
> situation - it's still going to have to re-write the files, unless it is
> able to detect that some of the leading blocks are in common and only
> send the diffs? The files really are different.

If a file has the same path and the same name, it has to be the same file,
simply because there is no versioning. It may have different content, but
that is exactly what we are talking about. The simple question is: what is
the valid content? I don't want glusterfs to guess or randomize, I want to be
able to say: use the copy with the latest mtime. That's about all. I don't
expect versioning or other neat features. It would work just like what I can
say now: use the copy from child X. Unfortunately that option gets you into
trouble if child X dies, because you have to reconfigure all clients to
rescue the situation. If you could say "just always use the latest file
version you can find", at least the classical server failures would be safe.
It would even be safe to restore old backups to bricks and include them in
replication, because glusterfs would update them (maybe the word "update"
fits better than "self-heal" here).

Of course it cannot save you in a real split-brain situation, because nothing
really saves you there. But all other "standard" outages of a brick look
bright afterwards. No?

-- 
Regards,
Stephan
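
P.S. To make the "use the copy with the latest mtime" rule concrete, here is
a minimal sketch in Python. It is only an illustration of the policy I am
asking for, not how glusterfs replicate works today. The function name
pick_latest_copy and the idea of passing one backend path per brick are
assumptions made up for this example.

```python
import os
from typing import Iterable, Optional


def pick_latest_copy(candidates: Iterable[str]) -> Optional[str]:
    """Return the candidate path with the newest mtime.

    `candidates` is assumed to hold one backend path per brick for the
    same logical file (hypothetical layout, for illustration only).
    Bricks that have no copy are skipped. Returns None if no copy exists.
    On equal mtimes the first candidate seen wins.
    """
    best_path = None
    best_mtime = None
    for path in candidates:
        try:
            mtime = os.stat(path).st_mtime_ns
        except FileNotFoundError:
            continue  # this brick has no copy of the file
        if best_mtime is None or mtime > best_mtime:
            best_path, best_mtime = path, mtime
    return best_path
```

A brick restored from an old backup would simply lose this comparison and get
updated from the newer copy. Note that the rule depends on mtimes surviving
the restore (rsync -t or -a) and on reasonably synchronized clocks, and it
still cannot decide a real split brain where both copies were changed
independently.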