On Wed, Sep 2, 2015 at 8:09 PM, Shyam <srangana@xxxxxxxxxx> wrote:
> On 09/02/2015 03:12 AM, Aravinda wrote:
>>
>> The Geo-replication and Sharding teams today discussed the approach
>> to make Geo-replication Sharding-aware. Details are below.
>>
>> Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay Bellur
>>
>> - Both Master and Slave Volumes should be Sharded Volumes with the
>>   same configuration.
>
> If I am not mistaken, geo-rep supports replicating to a non-gluster
> local FS at the slave end. Is this correct? If so, would this
> limitation not make that problematic?

That was taken out when distributed geo-replication was developed (with
support for GFID synchronization between master and slave). The Slave
therefore needs to be a Gluster volume.

> When you state *same configuration*, I assume you mean the sharding
> configuration, not the volume graph, right?
>
>> - In the Changelog, record changes related to Sharded files as well,
>>   just like any regular files.
>> - Sharding should allow Geo-rep to list/read/write Sharding internal
>>   Xattrs if the Client PID is gsyncd (-1).
>> - Sharding should allow read/write of Sharded files (that is, files in
>>   the .shards directory) if the Client PID is GSYNCD.
>> - Sharding should return the actual file instead of the aggregated
>>   content when the Main file is requested (Client PID GSYNCD).
>>
>> For example, a file f1 is created with GFID G1.
>>
>> When the file grows, it gets sharded into chunks (say 5 chunks):
>>
>>     f1            G1
>>     .shards/G1.1  G2
>>     .shards/G1.2  G3
>>     .shards/G1.3  G4
>>     .shards/G1.4  G5
>>
>> In the Changelog, this is recorded as 5 different files, as below
>> (where PGS is the GFID of the .shards directory):
>>
>>     CREATE G1 f1
>>     DATA   G1
>>     META   G1
>>     CREATE G2 PGS/G1.1
>>     DATA   G2
>>     META   G1
>>     CREATE G3 PGS/G1.2
>>     DATA   G3
>>     META   G1
>>     CREATE G4 PGS/G1.3
>>     DATA   G4
>>     META   G1
>>     CREATE G5 PGS/G1.4
>>     DATA   G5
>>     META   G1
>>
>> Geo-rep will create these files independently in the Slave Volume and
>> sync the Xattrs of G1. Data can be read in full only once all the
>> chunks are synced to the Slave Volume; it can be read partially if the
>> main/first file and some of the chunks have synced to the Slave.
>>
>> Please add if I missed anything. Comments and suggestions welcome.
>>
>> regards
>> Aravinda
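
To make the naming convention above concrete, here is a minimal Python
sketch (illustrative only: the record strings, GFID placeholders and helper
names are made up, and the real changelog encoding and gsyncd internals
differ):

    # Sketch of the shard naming and changelog records described above.
    # Not gsyncd code; record format and helper names are hypothetical.

    def shard_records(main_name, main_gfid, shard_gfids, shards_dir_gfid):
        """Emit changelog-style records for a main file and its chunks.

        shard_gfids maps chunk index (1, 2, ...) to that chunk's GFID.
        """
        records = [("CREATE", main_gfid, main_name),
                   ("DATA", main_gfid),
                   ("META", main_gfid)]
        for index, gfid in sorted(shard_gfids.items()):
            # Each chunk lives under .shards as <main GFID>.<index>, so its
            # CREATE record carries the .shards directory GFID (PGS) as parent.
            records.append(("CREATE", gfid,
                            "%s/%s.%d" % (shards_dir_gfid, main_gfid, index)))
            records.append(("DATA", gfid))
            # Size/block-count xattrs live on the main file, hence META on G1.
            records.append(("META", main_gfid))
        return records

    def base_gfid_of_shard(shard_basename):
        """Map a .shards entry name like 'G1.3' back to (main GFID, index)."""
        gfid, _, index = shard_basename.rpartition(".")
        return gfid, int(index)

    if __name__ == "__main__":
        for rec in shard_records("f1", "G1",
                                 {1: "G2", 2: "G3", 3: "G4", 4: "G5"}, "PGS"):
            print(" ".join(str(field) for field in rec))
        print(base_gfid_of_shard("G1.3"))   # -> ('G1', 3)
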
>> On 08/11/2015 04:36 PM, Aravinda wrote:
>>>
>>> Hi,
>>>
>>> We are considering different approaches to add support in
>>> Geo-replication for Sharded Gluster Volumes [1].
>>>
>>> *Approach 1: Geo-rep: Sync full file*
>>>
>>> - In the Changelog, record only the main file details, in the same
>>>   brick where it is created.
>>> - Record a DATA entry in the Changelog on any addition/change to the
>>>   sharded file.
>>> - Geo-rep rsync will checksum the full file from the mount and sync
>>>   it as a new file.
>>> - Slave-side sharding is managed by the Slave Volume.
>>>
>>> *Approach 2: Geo-rep: Sync sharded files separately*
>>>
>>> - Geo-rep rsync will checksum the sharded files only.
>>> - Geo-rep syncs each sharded file independently as a new file.
>>> - [UNKNOWN] Sync the internal xattrs (file size and block count) on
>>>   the main sharded file to the Slave Volume to maintain the same
>>>   state as on the Master.
>>> - The Sharding translator has to allow file creation under the
>>>   .shards dir for gsyncd, that is, when the Parent GFID is the
>>>   .shards directory.
>>> - If sharded files are modified during a Geo-rep run, the Slave may
>>>   end up with stale data.
>>> - Files on the Slave Volume may not be readable until all sharded
>>>   files have synced to the Slave (each brick in the Master
>>>   independently syncs files to the Slave).
>>>
>>> The first approach looks cleaner, but we have to analyze the rsync
>>> checksum performance on big files (sharded in the backend, accessed
>>> as one big file by rsync).
>>>
>>> Let us know your thoughts. Thanks.
>>>
>>> Ref:
>>> [1] http://www.gluster.org/community/documentation/index.php/Features/sharding-xlator
>>> --
>>> regards
>>> Aravinda
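
On the open question about rsync checksum cost in approach 1, one rough way
to get a number from a regular client mount could be a sketch like the one
below (paths are placeholders, and geo-rep invokes rsync differently in
practice; this only times a checksum-driven re-sync of one unchanged big
file):

    #!/usr/bin/env python
    # Rough sketch: gauge the rsync checksum cost on a big sharded file as
    # seen from a client mount. Paths below are placeholders.

    import subprocess
    import time

    SRC = "/mnt/master-vol/bigfile"   # one file on the mount, sharded on bricks
    DST = "/mnt/slave-vol/bigfile"    # assume an identical copy already exists

    def time_checksum_resync(src, dst):
        """Time 'rsync -c --inplace' of an unchanged file.

        With -c, rsync reads and checksums the whole file on both sides even
        when nothing changed, so the elapsed time approximates the per-sync
        checksum overhead approach 1 would pay for every changed big file.
        """
        start = time.time()
        subprocess.check_call(["rsync", "-c", "--inplace", src, dst])
        return time.time() - start

    if __name__ == "__main__":
        print("checksum re-sync took %.1f s" % time_checksum_resync(SRC, DST))
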