On Tue, 15 Sep 2009 20:14:57 -0400 Mark Mielke <mark@xxxxxxxxxxxxxx> wrote:
> On 09/15/2009 07:45 PM, Michael Cassaniti wrote:
> > Don't try bypassing the mountpoint to perform file operations
> > _period_. You can always have a replicate mountpoint configured on
> > the server (i.e. a client for replicate), as well as the server side.
> > NFS should run on top of this replicate mountpoint. This (poor)
> > graphic may help. Note that everything is running on the same
> > machine:
>
> Agree. The main feature of GlusterFS in this regard has to do with
> read - not write. It is very cool that if GlusterFS with
> cluster/replicate completely fails, the backing store is still
> accessible to recover from. However, this is not a license to write or
> re-write the backing store as we see fit. It should be treated as
> read-only if it is used at all.
>
> Note that even read-only does not work if one starts to use the
> BerkeleyDB storage method, distribute, or stripe.
>
> In any case - dropping extended attributes for a system that relies on
> extended attributes is a lossy backup, and should be expected to be
> invalid if restored into place. Even if the extended attributes are
> kept in the backup - I think it only decreases the risk, it does not
> eliminate it. Writes really should not bypass the mount point.
>
> Cheers,
> mark

Well, Michael, Mark, maybe we should talk more about real setups and less about theory. In theory everything you say makes sense, and clearly your "don't do that" approach is clean. Unfortunately the real world is not that clean and can hardly ever be bent to be. Fortunately, the setups theory worries about are rare in the real world.

Let's take a trivial setup: lots of data for webservers, plus some ftp servers for feeding in new data and deleting old. The first thing in sight: compared to the reads there are very few writes, mostly sequential logfiles. And another thing: most of the data does not get read or written all day long. This is a pretty common example, I would say. Since very few changes are going on compared to the total amount of stored data, you may call the situation pseudo-static.

What would you expect in that setup? Let's say the bad boys (the ftp servers) are local feeds and do not go over glusterfs, for whatever reason. What do they really do to the data? They delete (the data is gone afterwards, so there is no problem at all), and they write new files. It should be very simple for glusterfs to detect a locally fed new file, because it has no xattribs at all (assuming every glusterfs-fed file has some (*)). So basically all you have to do is try to write-lock the file on the backend store, create its default xattribs, unlock, and do a stat to self-heal the other subvolumes - let's call such a thing "import". Does that really sound unsolvable? (For simplicity we assume such local feeds happen only on the first subvolume, and that the cluster is replicate.)

(*) If not every glusterfs file has xattribs, then "import" is even simpler and can be done by just stat'ing. That case sounds like it would happen pretty automagically on first touching the new file over the glusterfs mountpoint.
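To make the "import" idea a bit more concrete, here is a rough sketch (Python 3 on Linux, untested, the paths are made up, and I am only assuming that replicate keeps its bookkeeping in trusted.* xattribs such as trusted.gfid and trusted.afr.* - check the exact names for your version). It shows the simple variant from the footnote, where a plain stat over the mountpoint is supposed to do the rest, and it would have to run as root on the first subvolume because the trusted.* namespace is only visible to root:

import os

BACKEND = "/data/export"    # made-up backend export of the first subvolume
MOUNT = "/mnt/glusterfs"    # made-up glusterfs (replicate) mountpoint

def is_local_feed(path):
    # A file created through glusterfs carries trusted.* xattribs
    # (trusted.gfid, trusted.afr.*, ...); a locally fed file has none.
    return not any(a.startswith("trusted.") for a in os.listxattr(path))

def import_file(rel):
    if is_local_feed(os.path.join(BACKEND, rel)):
        # Stat the file over the mountpoint so glusterfs looks it up;
        # with replicate that should be enough to create the default
        # xattribs and self-heal the other subvolumes.
        os.stat(os.path.join(MOUNT, rel))

# Walk the backend and "import" everything that was fed in locally.
for root, dirs, files in os.walk(BACKEND):
    for name in files:
        import_file(os.path.relpath(os.path.join(root, name), BACKEND))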
Another story: the backup. I am pretty astonished that you all talk about backing up the xattribs. According to your own clean philosophy there should be no problem with backups taken without xattribs, as long as they are read from the glusterfs mountpoint. Since other applications do not see the xattribs either, that can only mean such a backup is a complete snapshot without them.

A backup with xattribs, in this sense, can only be useful if it is read locally from the backend store, in order to be able to recover that backend later on - including the information hidden in the xattribs. But since you would not want anyone to touch the local data at all, by your own philosophy this should be no backup method either.

Even from my bad-boy position I would not back up the xattribs via the local path. The reason lies in the restore. If I local-restore a file without xattribs, I give glusterfs a realistic chance to notice that this is a locally fed file, which should probably be handled as discussed above ("import"). But if I local-restore a file with xattribs, it is likely that they describe a state that is no longer valid. My guess is that this will harm glusterfs more than not having xattribs for the file at all, because there is probably no good way to detect the invalid state.

So how, in your opinion, are xattribs and backups linked?

--
Regards,
Stephan
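P.S.: And to make the restore point concrete, a matching sketch (same caveats: Python 3 on Linux, untested, made-up paths, run as root) that copies a file from a backup tree back into the backend store and then strips whatever trusted.* xattribs came along, so the file looks like a plain local feed instead of carrying stale replicate state:

import os
import shutil

BACKUP = "/backup/export"   # made-up backup tree taken from the backend
BACKEND = "/data/export"    # made-up backend store to restore into

def restore(rel):
    src = os.path.join(BACKUP, rel)
    dst = os.path.join(BACKEND, rel)
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    shutil.copy2(src, dst)  # copies data, times and mode (and xattribs on Linux)
    # Drop any trusted.* state the backup still carries, so glusterfs
    # does not see stale replicate bookkeeping on the restored file.
    for attr in os.listxattr(dst):
        if attr.startswith("trusted."):
            os.removexattr(dst, attr)

restore("htdocs/index.html")    # made-up example path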