On 09/03/2015 12:13 PM, Krutika Dhananjay wrote:

From: "Shyam" <srangana@xxxxxxxxxx>
To: "Krutika Dhananjay" <kdhananj@xxxxxxxxxx>
Cc: "Aravinda" <avishwan@xxxxxxxxxx>, "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
Sent: Wednesday, September 2, 2015 11:13:55 PM
Subject: Re: Gluster Sharding and Geo-replication
On 09/02/2015 10:47 AM, Krutika Dhananjay wrote:
>
> ------------------------------------------------------------------------
>
> *From: *"Shyam" <srangana@xxxxxxxxxx>
> *To: *"Aravinda" <avishwan@xxxxxxxxxx>, "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> *Sent: *Wednesday, September 2, 2015 8:09:55 PM
> *Subject: *Re: Gluster Sharding and Geo-replication
>
> On 09/02/2015 03:12 AM, Aravinda wrote:
> > The Geo-replication and Sharding teams today discussed the approach
> > to make Geo-replication Sharding-aware. Details are as below.
> >
> > Participants: Aravinda, Kotresh, Krutika, Rahul Hinduja, Vijay Bellur
> >
> > - Both Master and Slave Volumes should be Sharded Volumes with the
> >   same configuration.
>
> If I am not mistaken, geo-rep supports replicating to a non-gluster
> local FS at the slave end. Is this correct? If so, would this
> limitation not make that problematic?
>
> When you state *same configuration*, I assume you mean the sharding
> configuration, not the volume graph, right?
>
> That is correct. The only requirement is for the slave to have the
> shard translator (since someone needs to present an aggregated view of
> the file to the readers on the slave).
> Also, the shard-block-size needs to be kept the same between master and
> slave. The rest of the configuration (like the number of subvolumes of
> DHT/AFR) can vary across master and slave.
Do we need to have the shard block size the same? I assume the file
carries an xattr that contains the size it is sharded with
(trusted.glusterfs.shard.block-size), so if this is synced across, that
would do. If this is true, what it would mean is that "a sharded volume
needs a shard-supporting slave to geo-rep to".
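For checking what a given file was actually sharded with, that xattr can be
inspected directly. Below is a minimal, purely illustrative sketch (not part
of geo-rep): it assumes a Linux host, root access to a brick backend path
where trusted.* xattrs are visible, and it deliberately compares the raw
xattr bytes rather than assuming a particular on-disk encoding.

    # check_shard_block_size.py -- illustrative sketch only
    import os

    SHARD_BLOCK_SIZE_XATTR = "trusted.glusterfs.shard.block-size"

    def raw_shard_block_size(backend_path):
        """Return the raw xattr bytes for a file on a brick, or None if the
        file carries no shard block-size xattr (i.e. it was never sharded)."""
        try:
            return os.getxattr(backend_path, SHARD_BLOCK_SIZE_XATTR)
        except OSError:
            return None

    # Hypothetical brick paths on master and slave; the check is simply that
    # both copies record the same block size.
    # assert raw_shard_block_size("/bricks/master/b1/vm1.img") == \
    #        raw_shard_block_size("/bricks/slave/b1/vm1.img")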
Yep. Even I feel it should probably not be necessary to enforce
same-shard-size-everywhere, as long as the shard translator on the slave
takes care not to further "shard" the individual shards gsyncd writes to
the slave volume. This is especially true if different files/images/vdisks
on the master volume are associated with different block sizes. This logic
has to be built into the shard translator based on parameters (client-pid,
parent directory of the file being written to). What this means is that the
shard-block-size attribute on the slave would essentially be a don't-care
parameter. I need to give all this some more thought though.
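To make the intent concrete, a rough sketch of that check is below. This is
illustrative only: the real shard translator is C code inside glusterfs, and
the GSYNCD client-pid value (-1) and the .shards parent check are taken from
this thread, not from the sources.

    # illustrative decision check, mirroring the discussion above
    GSYNCD_CLIENT_PID = -1  # PID geo-rep identifies itself with (per this thread)

    def should_bypass_sharding(client_pid, parent_dir_name):
        """Writes coming from gsyncd, or writes landing directly under the
        .shards directory, must be stored as-is and never re-sharded on the
        slave."""
        return client_pid == GSYNCD_CLIENT_PID or parent_dir_name == ".shards"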
I think this may help with coping with changes to the shard block size
configuration on the master. Otherwise, once the user changes the shard
block size on the master, the slave will be affected.
Are there any other shard volume options that, if changed on the master,
would affect the slave? How do we ensure master and slave stay in sync
w.r.t. the shard configuration?

-Krutika
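One rough way to spot such drift is to compare the shard options reported by
the CLI on both sides. The sketch below is only illustrative: it assumes a
glusterfs build that provides "gluster volume get", that the option of
interest is features.shard-block-size, and the volume/host names
(master-vol, slave-vol, slave-node) are hypothetical.

    # compare_shard_options.py -- illustrative sketch only
    import subprocess

    def volume_option(volume, option, remote_host=None):
        """Fetch a single volume option via the gluster CLI; the value is
        assumed to be the last token of the last output line."""
        cmd = ["gluster", "volume", "get", volume, option]
        if remote_host:
            cmd = ["ssh", remote_host] + cmd
        out = subprocess.check_output(cmd).decode()
        return out.strip().splitlines()[-1].split()[-1]

    master_bs = volume_option("master-vol", "features.shard-block-size")
    slave_bs = volume_option("slave-vol", "features.shard-block-size",
                             remote_host="slave-node")
    if master_bs != slave_bs:
        print("WARNING: shard block size differs: %s vs %s"
              % (master_bs, slave_bs))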
>
> -Krutika
>
>
>
> > - In the Changelog, record changes related to Sharded files too, just
> >   like any regular files.
> > - Sharding should allow Geo-rep to list/read/write Sharding internal
> >   Xattrs if Client PID is gsyncd(-1)
> > - Sharding should allow read/write of Sharded files (that is, files in
> >   the .shards directory) if Client PID is GSYNCD
> > - Sharding should return the actual file instead of returning the
> >   aggregated content when the Main file is requested (Client PID is
> >   GSYNCD)
> >
> > For example, a file f1 is created with GFID G1.
> >
> > When the file grows, it gets sharded into chunks (say 5 chunks).
> >
> > f1 G1
> > .shards/G1.1 G2
> > .shards/G1.2 G3
> > .shards/G1.3 G4
> > .shards/G1.4 G5
> >
> > In the Changelog, this is recorded as 5 different files, as below:
> >
> > CREATE G1 f1
> > DATA G1
> > META G1
> > CREATE G2 PGS/G1.1
> > DATA G2
> > META G1
> > CREATE G3 PGS/G1.2
> > DATA G3
> > META G1
> > CREATE G4 PGS/G1.3
> > DATA G4
> > META G1
> > CREATE G5 PGS/G1.4
> > DATA G5
> > META G1
> >
> > Where PGS is the GFID of the .shards directory.
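As an aside, the grouping implied by these records can be illustrated with a
small toy snippet. Note this operates on the simplified notation used above,
not on the real on-disk changelog format that gsyncd parses.

    # toy grouping of the simplified records shown above -- illustrative only
    def group_by_gfid(records):
        """Collect CREATE/DATA/META entries per GFID, so the main file (G1)
        and each shard (G2..G5) can be treated as separate sync units."""
        grouped = {}
        for rec in records:
            fields = rec.split()
            op, gfid, rest = fields[0], fields[1], fields[2:]
            grouped.setdefault(gfid, []).append((op, rest))
        return grouped

    sample = ["CREATE G1 f1", "DATA G1", "META G1",
              "CREATE G2 PGS/G1.1", "DATA G2", "META G1"]
    # group_by_gfid(sample)["G2"] -> [("CREATE", ["PGS/G1.1"]), ("DATA", [])]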
> >
> > Geo-rep will create these files independently in the Slave Volume and
> > sync the Xattrs of G1. Data can be read only when all the chunks are
> > synced to the Slave Volume. Data can be read partially if the main/first
> > file and some of the chunks are synced to the Slave.
> >
> > Please add if I missed anything. Comments & Suggestions welcome.
> >
> > regards
> > Aravinda
> >
> > On 08/11/2015 04:36 PM, Aravinda wrote:
> >> Hi,
> >>
> >> We are thinking of different approaches to add support in
> >> Geo-replication for Sharded Gluster Volumes[1].
> >>
> >> *Approach 1: Geo-rep: Sync Full file*
> >>     - In the Changelog, only record the main file details, in the same
> >>       brick where it is created
> >>     - Record a DATA entry in the Changelog whenever there is any
> >>       addition/change to the sharded file
> >>     - Geo-rep rsync will do the checksum on the full file from the
> >>       mount and sync it as a new file
> >>     - Slave-side sharding is managed by the Slave Volume
> >>
> >> *Approach 2: Geo-rep: Sync sharded files separately*
> >>     - Geo-rep rsync will do the checksum for the sharded files only
> >>     - Geo-rep syncs each sharded file independently as a new file
> >>     - [UNKNOWN] Sync internal xattrs (file size and block count) of the
> >>       main sharded file to the Slave Volume to maintain the same state
> >>       as in the Master.
> >>     - Sharding translator to allow file creation under the .shards dir
> >>       for gsyncd, that is, with Parent GFID being the .shards directory
> >>     - If sharded files are modified during a Geo-rep run, we may end up
> >>       with stale data in the Slave.
> >>     - Files on the Slave Volume may not be readable unless all sharded
> >>       files are synced to the Slave (each brick in the Master
> >>       independently syncs files to the slave)
> >>
> >> The first approach looks cleaner, but we have to analyze the rsync
> >> checksum performance on big files (sharded in the backend, accessed
> >> as one big file by rsync).
> >>
> >> Let us know your thoughts. Thanks.
> >>
> >> Ref:
> >> [1] http://www.gluster.org/community/documentation/index.php/Features/sharding-xlator
> >>
> >> --
> >> regards
> >> Aravinda
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel