On 12/06/2013 01:57 PM, Kal Black wrote:
> Hello,
> I am at the point of picking a filesystem for new brick nodes. Until
> now I have liked and used ext4, but I recently read about an issue
> introduced by a patch in ext4 that breaks the distributed translator.
> At the same time, it looks like the recommended filesystem for a brick
> is no longer ext4 but XFS, which apparently will also be the default
> filesystem in the upcoming Red Hat Enterprise Linux 7. On the other
> hand, XFS is known as a filesystem that can be easily corrupted
> (zeroing files) in case of a power failure. Supporters of the
> filesystem claim that this should never happen if an application has
> been properly coded (properly committing/fsync-ing data to storage) and
> the storage itself has been properly configured (disk cache disabled on
> individual disks and battery-backed cache used on the controllers). My
> question is, should I be worried about losing data in a power failure
> or similar scenarios (or any) using GlusterFS and XFS? Are there best
> practices for setting up a Gluster brick + XFS? Has the ext4 issue been
> reliably fixed? (My understanding is that this will be impossible
> unless ext4 is modified to work properly with Gluster.)
>

Hi Kal,

You are correct that Red Hat recommends using XFS for gluster bricks.
I'm sure there are plenty of ext4 (and other fs) users as well, so other
users should chime in with their real experiences of various brick
filesystems. Also, I believe the dht/ext issue has been resolved for
some time now.

With regard to "XFS zeroing files on power failure," I'd suggest you
check out the following blog post:

http://sandeen.net/wordpress/computers/xfs-does-not-null-files-and-requires-no-flux/

My cursory understanding is that there were situations where the inode
size of a recently extended file would be written to the log before the
data that extended the file was written to disk. That created a crash
window in which the updated size would be seen after recovery, but not
the actual data. In other words, this isn't a "zeroing files" behavior
so much as an ordering issue with logging the inode size. This is
probably why you've encountered references to fsync(): even with the
fix, data that has not been flushed is still likely to be lost in a
crash (unless/until you've run an fsync to flush it to disk). You just
shouldn't see the extended inode size unless the actual data made it to
disk.

Also note that this was fixed in 2007. ;)

Brian

> Best regards
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@xxxxxxxxxxx
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
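
For anyone wondering what "properly fsync-ing data" looks like from the
application side, here is a minimal sketch in Python. It is only an
illustration under stated assumptions, not something GlusterFS or XFS
ships: it assumes a POSIX system, durable_write is a hypothetical helper
name, and it uses the common write-to-temp-file, fsync, rename pattern.
Note that the temporary file gets mkstemp's default permissions (0600),
which a real application may need to adjust.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Replace `path` with `data`, flushing to stable storage before returning.

    Hypothetical helper for illustration; assumes a POSIX filesystem.
    """
    directory = os.path.dirname(os.path.abspath(path))

    # Create the temp file in the same directory so the later rename
    # stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()             # userspace buffer -> kernel page cache
            os.fsync(f.fileno())  # page cache -> storage device
        # Atomically swap the fully written file into place.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure, do not leave a stray temp file behind.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

    # fsync the containing directory so the rename itself is persisted.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The point of the ordering is that the file's data is flushed before the
new name becomes visible, so after a crash a reader should see either
the old contents or the complete new contents. Whether the data truly
survives a power cut still depends on the storage configuration
mentioned in the question (disk write caches, battery-backed controller
cache).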