Hi,
We are considering different approaches to add support in
Geo-replication for sharded Gluster volumes [1].

Approach 1: Geo-rep: Sync full file
- In the Changelog, record only the main file details, on the same brick where the file is created.
- Record as DATA in the Changelog whenever there is any addition or change to the sharded file.
- Geo-rep rsync will do the checksum on the full file from the mount and sync it as a new file.
- Slave side sharding is managed by the Slave Volume.
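
To make the Approach 1 flow concrete, here is a rough sketch of how repeated DATA records for the same main file could collapse into a single full-file sync per batch. The (record_type, gfid) tuples are a simplified stand-in that I made up for illustration, not the real Changelog format, and files_to_sync is a hypothetical helper, not existing gsyncd code.

```python
from collections import OrderedDict


def files_to_sync(changelog_records):
    """Return main-file GFIDs that need a full-file sync, in first-seen order.

    changelog_records is an iterable of hypothetical (record_type, gfid)
    tuples. Only DATA records matter here, and repeats are collapsed.
    """
    pending = OrderedDict()
    for record_type, gfid in changelog_records:
        if record_type == "DATA":
            # Re-assigning an existing key keeps its original position.
            pending[gfid] = True
    return list(pending)


records = [
    ("DATA", "gfid-a"),
    ("DATA", "gfid-b"),
    ("DATA", "gfid-a"),   # second write to the same sharded file
    ("ENTRY", "gfid-c"),  # not a DATA record, ignored in this sketch
]
print(files_to_sync(records))
```

Under this assumption, any number of writes to any shard of a file costs one rsync of the whole file per batch, which is why the checksum cost on big files matters.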

Approach 2: Geo-rep: Sync sharded files separately
- Geo-rep rsync will do the checksum for sharded files only.
- Geo-rep syncs each sharded file independently as a new file.
- [UNKNOWN] Sync internal xattrs (file size and block count) in the main sharded file to the Slave Volume to maintain the same state as in Master.
- Sharding translator to allow file creation under the .shards dir for gsyncd, that is, when the Parent GFID is the .shards directory.
- If sharded files are modified during a Geo-rep run, the Slave may end up with stale data.
- Files on the Slave Volume may not be readable until all sharded files are synced to the Slave (each brick in Master syncs files to the Slave independently).
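
The readability concern in the last point can be sketched as a completeness check: given the file size recorded on the main file and the shard block size, work out which shard files should exist and which have not reached the Slave yet. This assumes that block 0 lives in the main file and that the remaining blocks are named <gfid>.<index>; both the naming and the helper functions below are assumptions for illustration, not the sharding translator's API.

```python
def expected_shard_names(gfid, file_size, shard_block_size):
    """List the shard file names expected for one main file.

    Assumption: block 0 is stored in the main file itself, so only
    blocks 1..n-1 appear as separate shard files named <gfid>.<index>.
    """
    if shard_block_size <= 0:
        raise ValueError("shard_block_size must be positive")
    # Integer ceiling division avoids float rounding on very large files.
    total_blocks = max(1, (file_size + shard_block_size - 1) // shard_block_size)
    return [f"{gfid}.{index}" for index in range(1, total_blocks)]


def missing_shards(gfid, file_size, shard_block_size, synced_names):
    """Return expected shard names that are not yet present on the Slave."""
    synced = set(synced_names)
    return [
        name
        for name in expected_shard_names(gfid, file_size, shard_block_size)
        if name not in synced
    ]


MIB = 1024 * 1024
# A 10 MiB file with 4 MiB shards spans 3 blocks: the main file plus 2 shards.
print(missing_shards("abcd", 10 * MIB, 4 * MIB, ["abcd.1"]))
```

Until a check like this returns an empty list for a file, reads of that file on the Slave could hit holes, since each Master brick syncs its shards on its own schedule.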

The first approach looks cleaner, but we have to analyze the rsync
checksum performance on big files (sharded in the backend, accessed as
one big file by rsync).
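
As a starting point for that analysis, something like the following could time a whole-file checksum read through the mount. It is only a rough proxy: it uses MD5 over sequential 1 MiB reads as a stand-in and does not reproduce rsync's actual rolling-checksum logic. timed_checksum and its chunk_size default are hypothetical names and values.

```python
import hashlib
import sys
import time


def timed_checksum(path, chunk_size=1024 * 1024):
    """Checksum a file in fixed-size chunks and time the whole read.

    Returns (hex_digest, bytes_read, elapsed_seconds).
    """
    digest = hashlib.md5()
    bytes_read = 0
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
            bytes_read += len(chunk)
    elapsed = time.perf_counter() - start
    return digest.hexdigest(), bytes_read, elapsed


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path of a large file on the mounted volume as the argument.
    checksum, size, seconds = timed_checksum(sys.argv[1])
    print(f"{checksum} {size} bytes in {seconds:.3f}s")
```

Running it on the same large file on a sharded and a non-sharded volume would give a first idea of how much overhead the shard lookups add to a full-file read.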
Let us know your thoughts. Thanks

Ref:
[1] http://www.gluster.org/community/documentation/index.php/Features/sharding-xlator

--
regards
Aravinda