On Wed, Sep 2, 2015 at 11:39 PM, Aravinda <avishwan@xxxxxxxxxx> wrote:
>
> On 09/02/2015 11:13 PM, Shyam wrote:
>>
>> On 09/02/2015 10:47 AM, Krutika Dhananjay wrote:
>>>
>>> ------------------------------------------------------------------------
>>>
>>> *From: *"Shyam" <srangana@xxxxxxxxxx>
>>> *To: *"Aravinda" <avishwan@xxxxxxxxxx>, "Gluster Devel"
>>> <gluster-devel@xxxxxxxxxxx>
>>> *Sent: *Wednesday, September 2, 2015 8:09:55 PM
>>> *Subject: *Re: Gluster Sharding and Geo-replication
>>>
>>> On 09/02/2015 03:12 AM, Aravinda wrote:
>>> > The Geo-replication and Sharding teams today discussed the approach
>>> > to make Geo-replication sharding-aware. Details are as below.
>>> >
>>> > Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay Bellur
>>> >
>>> > - Both Master and Slave Volumes should be Sharded Volumes with the
>>> >   same configuration.
>>>
>>> If I am not mistaken, geo-rep supports replicating to a non-gluster
>>> local FS at the slave end. Is this correct? If so, would this
>>> limitation not make that problematic?
>>>
>>> When you state *same configuration*, I assume you mean the sharding
>>> configuration, not the volume graph, right?
>>>
>>> That is correct. The only requirement is for the slave to have the
>>> shard translator (for someone needs to present the aggregated view of
>>> the file to the READers on the slave).
>>> Also, the shard-block-size needs to be kept the same between master and
>>> slave. The rest of the configuration (like the number of subvols of
>>> DHT/AFR) can vary across master and slave.
>>
>> Do we need to have the sharded block size the same? I assume the file
>> carries an xattr that contains the size it is sharded with
>> (trusted.glusterfs.shard.block-size), so if this is synced across, it
>> would do. If this is true, what it would mean is that "a sharded volume
>> needs a shard-supporting slave to geo-rep to".
>
> Yes. The number of bricks and the replica count can be different, but the
> sharded block size should be the same. Only the first file will have the
> xattr (trusted.glusterfs.shard.block-size); Geo-rep should sync this xattr
> to the Slave as well. Only gsyncd can read/write the sharded chunks. A
> sharded Slave Volume is required to understand these chunks when they are
> read by non-gsyncd clients.

Even if this works, I am very much in disagreement with this mechanism of
synchronization (not that I have a working solution in my head as of now).
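Purely as a sketch of the xattr-sync step discussed above (and not gsyncd
code): assuming the master and slave volumes are FUSE-mounted at
hypothetical local paths, and that the shard translator lets gsyncd
(client PID -1) read and write these internal trusted.* xattrs as proposed,
the copy could look roughly like the following.
trusted.glusterfs.shard.block-size is the xattr named in the thread; the
file-size xattr name and the helper itself are illustrative assumptions.

    # Illustrative only, not gsyncd code. Paths below are hypothetical
    # FUSE mount points of the master and slave volumes.
    import os

    SHARD_XATTRS = (
        b"trusted.glusterfs.shard.block-size",  # chunk size the file was sharded with
        b"trusted.glusterfs.shard.file-size",   # aggregated size/blocks (assumed name)
    )

    def sync_shard_xattrs(master_file, slave_file):
        """Copy shard-related xattrs of the main (first) file to the slave copy."""
        for name in SHARD_XATTRS:
            try:
                value = os.getxattr(master_file, name)
            except OSError:
                continue  # xattr absent, e.g. the file never grew past one block
            # Assumes the slave-side shard translator permits gsyncd to
            # write these internal xattrs, as proposed in this thread.
            os.setxattr(slave_file, name, value)

    # e.g. sync_shard_xattrs("/mnt/master-vol/f1", "/mnt/slave-vol/f1")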
>
>>
>>>
>>> -Krutika
>>>
>>> > - In the Changelog, record changes related to Sharded files too, just
>>> >   like any regular files.
>>> > - Sharding should allow Geo-rep to list/read/write Sharding internal
>>> >   Xattrs if the client is gsyncd (Client PID -1)
>>> > - Sharding should allow read/write of Sharded files (that is, files in
>>> >   the .shards directory) if the client is GSYNCD
>>> > - Sharding should return the actual file instead of returning the
>>> >   aggregated content when the Main file is requested (Client PID
>>> >   GSYNCD)
>>> >
>>> > For example, a file f1 is created with GFID G1.
>>> >
>>> > When the file grows, it gets sharded into chunks (say 5 chunks):
>>> >
>>> >   f1            G1
>>> >   .shards/G1.1  G2
>>> >   .shards/G1.2  G3
>>> >   .shards/G1.3  G4
>>> >   .shards/G1.4  G5
>>> >
>>> > In the Changelog, this is recorded as 5 different files, as below:
>>> >
>>> >   CREATE G1 f1
>>> >   DATA G1
>>> >   META G1
>>> >   CREATE G2 PGS/G1.1
>>> >   DATA G2
>>> >   META G1
>>> >   CREATE G3 PGS/G1.2
>>> >   DATA G3
>>> >   META G1
>>> >   CREATE G4 PGS/G1.3
>>> >   DATA G4
>>> >   META G1
>>> >   CREATE G5 PGS/G1.4
>>> >   DATA G5
>>> >   META G1
>>> >
>>> > where PGS is the GFID of the .shards directory.
>>> >
>>> > Geo-rep will create these files independently in the Slave Volume and
>>> > sync the Xattrs of G1. Data can be read only when all the chunks are
>>> > synced to the Slave Volume. Data can be read partially if the
>>> > main/first file and some of the chunks are synced to the Slave.
>>> >
>>> > Please add if I missed anything. C & S welcome.
>>> >
>>> > regards
>>> > Aravinda
>>> >
>>> > On 08/11/2015 04:36 PM, Aravinda wrote:
>>> >> Hi,
>>> >>
>>> >> We are thinking of different approaches to add support in
>>> >> Geo-replication for Sharded Gluster Volumes [1].
>>> >>
>>> >> *Approach 1: Geo-rep: Sync full file*
>>> >>   - In the Changelog, only record the main file details, in the same
>>> >>     brick where it is created
>>> >>   - Record as DATA in the Changelog whenever there is any
>>> >>     addition/change to the sharded file
>>> >>   - Geo-rep rsync will do the checksum as a full file from the mount
>>> >>     and sync it as a new file
>>> >>   - Slave-side sharding is managed by the Slave Volume
>>> >>
>>> >> *Approach 2: Geo-rep: Sync sharded files separately*
>>> >>   - Geo-rep rsync will do the checksum for sharded files only
>>> >>   - Geo-rep syncs each sharded file independently as a new file
>>> >>   - [UNKNOWN] Sync the internal xattrs (file size and block count) in
>>> >>     the main sharded file to the Slave Volume to maintain the same
>>> >>     state as in the Master.
>>> >>   - Sharding translator to allow file creation under the .shards dir
>>> >>     for gsyncd, that is, when the Parent GFID is the .shards directory
>>> >>   - If sharded files are modified during a Geo-rep run, we may end up
>>> >>     with stale data on the Slave.
>>> >>   - Files on the Slave Volume may not be readable unless all sharded
>>> >>     files are synced to the Slave (each brick in the Master
>>> >>     independently syncs files to the slave)
>>> >>
>>> >> The first approach looks cleaner, but we have to analyze the rsync
>>> >> checksum performance on big files (sharded in the backend, accessed
>>> >> as one big file from rsync).
>>> >>
>>> >> Let us know your thoughts. Thanks.
>>> >>
>>> >> Ref:
>>> >> [1] http://www.gluster.org/community/documentation/index.php/Features/sharding-xlator
>>> >>
>>> >> --
>>> >> regards
>>> >> Aravinda
>
> regards
> Aravinda

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel
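Addendum to the changelog example above (file f1 with GFID G1 and its
chunks under PGS/.shards): a toy sketch of how entries shaped like that
example could be grouped per main file, e.g. to decide when all chunks of
G1 have been seen on the slave side. It follows the simplified lines shown
in the mail, not the actual changelog encoding; DOT_SHARDS_GFID and
group_shard_creates are illustrative names.

    # Toy illustration only: parses entries shaped like the example in the
    # thread above ("CREATE <gfid> <pgfid>/<basename>", "DATA <gfid>", ...),
    # not the real changelog format. DOT_SHARDS_GFID stands in for PGS, the
    # GFID of the .shards directory.
    from collections import defaultdict

    DOT_SHARDS_GFID = "PGS"  # placeholder, as in the example

    def group_shard_creates(entries):
        """Return a map of main-file GFID -> [(chunk GFID, chunk index), ...]."""
        chunks = defaultdict(list)
        for entry in entries:
            parts = entry.split()
            if len(parts) < 3 or parts[0] != "CREATE":
                continue  # only CREATE entries carry the parent/basename
            gfid, name = parts[1], parts[2]
            pgfid, _, basename = name.partition("/")
            if pgfid != DOT_SHARDS_GFID:
                continue  # a regular (main) file, not a shard chunk
            main_gfid, _, index = basename.rpartition(".")
            chunks[main_gfid].append((gfid, int(index)))
        return chunks

    sample = ["CREATE G1 f1", "DATA G1", "META G1",
              "CREATE G2 PGS/G1.1", "DATA G2", "META G1",
              "CREATE G3 PGS/G1.2", "DATA G3", "META G1"]
    print(dict(group_shard_creates(sample)))  # {'G1': [('G2', 1), ('G3', 2)]}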