Distributed RAID was abandoned by Red Hat; the only available version is for kernel 2.6.11.

Best regards

2007/10/17, Alexey Filin <alexey.filin@xxxxxxxxx>:
> Hi Kevan,
>
> Consistency of AFR'ed files is an important question with respect to failures
> in the backend fs too. AFR is a medicine against node failures, not backend fs
> failures (at least not directly). In the latter case files can be changed
> "legally", bypassing glusterfs, by fsck after a hw/sw failure, and those
> changes have to be handled for the corrupted replica, else reading the same
> file can give different data (especially with the forthcoming load-balanced
> read of replicas). Fortunately, rsync'ing from the original should create a
> consistent replica in that case too (if cluster/stripe under AFR treats
> replicas identically); unfortunately, extended attributes aren't rsync'ed (I
> tested it), which can be required during repair.
>
> It seems glusterfs could try to handle hw/sw failures in the backend fs with
> checksums kept in extended attributes, and the checksums would have to be
> calculated per file chunk (because a single whole-file checksum requires full
> recalculation after appending/changing one byte to/in a gigabyte file). In
> that case glusterfs has to recalculate the checksums of all files on a
> corrupted fs (which may take far too long; it is the same problem as with
> rsync'ing) or get a list of corrupted files from the backend fs in some way
> (e.g. via a flag set by fsck in extended attributes). Maybe some kind of
> distributed RAID is a better solution; a first step in that direction was
> already taken by cluster/stripe (unfortunately one of the implementations,
> DDRaid, http://sources.redhat.com/cluster/ddraid/ by Daniel Phillips, seems
> to be suspended). Perhaps it is too computationally/network intensive, and
> RAID under the backend fs is the best solution even taking the disk space
> overhead into account.
>
> I'm very interested to hear the glusterfs developers' thoughts on this to
> clear up my misunderstanding.
>
> Regards, Alexey.
>
> On 10/16/07, Kevan Benson <kbenson@xxxxxxxxxxxxxxx> wrote:
> >
> > When an AFR encounters a file that exists on multiple shares but doesn't
> > have the trusted.afr.version attribute set, it sets that attribute for all
> > the files and assumes they contain the same data.
> >
> > I.e. if you manually create the files on the servers directly and with
> > different content, appending to the file through the client will set the
> > trusted.afr.version for both files and append to both files, but the files
> > still contain different content (the content from before the append).
> >
> > Now, this would be really hard to replicate outside of this arbitrary
> > example; it would probably require a write failure to all AFR subvolumes,
> > possibly at different points of the write operation, in which case the
> > file content can't be trusted anyway, so it's really not a big deal. I
> > only mention it in case it might not be the desired behavior, and because
> > it might be useful to have the first specified AFR subvolume supply the
> > file to the others when none of them has the trusted.afr.version attribute
> > set, for cases of pre-populating the share (such as rsyncs from a dynamic
> > source). The problem is easily mitigated (rsync to a single share and
> > trigger a self-heal, or rsync to the client mount point); I just figured
> > I'd mention it, and that's only required if you really NEED pre-population
> > of data.
> >
> > --
> > -Kevan Benson
> > -A-1 Networks
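To make Kevan's trusted.afr.version observation easier to check, here is a minimal sketch (Python 3 on Linux; the brick paths are made up, and reading trusted.* attributes normally needs root) that reports whether the attribute is set on each backend copy of a file:

    import errno
    import os

    # Hypothetical backend (brick) paths for the same AFR'ed file on two servers.
    BRICKS = ["/data/export1/shared.txt", "/data/export2/shared.txt"]

    for path in BRICKS:
        try:
            value = os.getxattr(path, "trusted.afr.version")
            print(f"{path}: trusted.afr.version = {value!r}")
        except OSError as e:
            if e.errno == errno.ENODATA:  # attribute not set on this copy
                print(f"{path}: trusted.afr.version not set")
            else:
                raise

If the attribute is missing from every copy, AFR will, per Kevan's description, stamp it onto all of them and assume the contents already match, which is exactly the pre-population pitfall he describes.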
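And a rough sketch of the per-chunk checksum idea from Alexey's mail: one checksum per fixed-size chunk, stored in a user.* extended attribute, so that appending a byte only dirties the last chunk. The attribute names (user.cksum.<n>) and the 1 MiB chunk size are invented for illustration; glusterfs does nothing like this today:

    import hashlib
    import os

    CHUNK = 1024 * 1024  # 1 MiB per chunk (arbitrary choice)

    def store_chunk_checksums(path):
        """Write one user.cksum.<n> xattr per chunk of the file."""
        with open(path, "rb") as f:
            index = 0
            while True:
                chunk = f.read(CHUNK)
                if not chunk:
                    break
                digest = hashlib.sha1(chunk).hexdigest()
                # Appending to the file only changes the last chunk, so only
                # that chunk's checksum would need recomputing.
                os.setxattr(path, f"user.cksum.{index}", digest.encode())
                index += 1

    def find_corrupted_chunks(path):
        """Return indices of chunks whose data no longer matches the stored checksum."""
        bad = []
        with open(path, "rb") as f:
            index = 0
            while True:
                chunk = f.read(CHUNK)
                if not chunk:
                    break
                stored = os.getxattr(path, f"user.cksum.{index}").decode()
                if hashlib.sha1(chunk).hexdigest() != stored:
                    bad.append(index)
                index += 1
        return bad

This also makes the rsync caveat concrete: a plain rsync copies the file data but not these attributes (newer rsync releases do offer an -X/--xattrs option), so any repair scheme that relies on extended attributes has to copy them explicitly.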
--
Leonardo Rodrigues de Mello
jabber: l@xxxxxxxxxxxxx