Re: reading from local replica?

On 6/8/2015 5:55 PM, Brian Ericson wrote:
Am I misunderstanding cluster.read-subvolume/cluster.read-subvolume-index?

I have two regions, "A" and "B", with servers "a" and "b" in each region, respectively. I have clients in both regions. Intra-region communication is fast, but the pipe between the regions is terrible. I'd like to keep inter-region communication as close as possible to glusterfs write operations only, and have reads go to the server in the region the client is running in.

I have created a replica volume as:
gluster volume create gv0 replica 2 a:/data/brick1/gv0 b:/data/brick1/gv0 force

As a baseline, if I use scp to copy directly from the brick, I get -- for a 100M file -- times of about 6s if the client scps from the server in the same region and anywhere from 3 to 5 minutes if the client scps from the server in the other region.

I was under the impression (from something I read but can't now find) that glusterfs automatically picks the fastest replica, but that has not been my experience; glusterfs seems to generally prefer the server in the other region over the "local" one, with times usually in excess of 4 minutes.

I've also tried having clients mount the volume using the "xlator" options cluster.read-subvolume and cluster.read-subvolume-index, but neither seems to have any impact. Here are sample mount commands to show what I'm attempting:

mount -t glusterfs -o xlator-option=cluster.read-subvolume=gv0-client-<0 or 1> a:/gv0 /mnt/glusterfs
mount -t glusterfs -o xlator-option=cluster.read-subvolume-index=<0 or 1> a:/gv0 /mnt/glusterfs

Am I misunderstanding how glusterfs works, particularly when trying to "read locally"? Is it possible to configure glusterfs to use a local replica (or the "fastest replica") for reads?

I am not a developer, nor am I intimately familiar with the insides of glusterfs, but here is how I understand glusterfs-fuse file reads to work. First, all replica bricks are read, to make sure they are consistent. (If they are not, gluster tries to make them consistent before proceeding.) Once consistency is established, the actual read occurs from the brick with the shortest response time. I don't know when or how the response time is measured, but it seems to work for most people most of the time. (If the client is on one of the brick hosts, it will almost always read from the local brick.)
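
To make that selection step concrete, here is a toy shell sketch of "read from the brick with the shortest response time" as I understand it. It is not glusterfs code: the function name fastest_brick and the millisecond figures are made up for illustration, and I still don't know how glusterfs really takes its measurements.

```shell
#!/bin/sh
# Toy illustration only; not how glusterfs is implemented.
# Input on stdin: one "<brick-host> <response-time-ms>" pair per line.
# Output: the host with the smallest response time.
fastest_brick() {
    sort -t ' ' -k 2,2n | head -n 1 | cut -d ' ' -f 1
}

# Hypothetical measurements: "a" is the local server, "b" is across the slow link.
printf '%s\n' 'a 2' 'b 180' | fastest_brick
```

With those made-up numbers the sketch prints "a", the local server. If your clients end up on "b" instead, then whatever glusterfs is measuring evidently does not match this simple picture.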

If the file reads involve a lot of small files, the consistency check may be what is killing your response times, rather than the read of the file itself. Even over a fast LAN, the consistency checks can take many times the actual read time of the file.
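
One rough way to see whether per-file overhead is the problem is to time one large read against many small reads of about the same total size. Here is a minimal bash sketch of that test. The function name read_all and the directory paths are made up for illustration, and a second run may be skewed by caching, so treat the numbers as a hint rather than a measurement.

```shell
#!/bin/bash
# Read every regular file under a directory once, discarding the data,
# then print how many files were found.
read_all() {
    find "$1" -type f -exec cat {} + > /dev/null
    find "$1" -type f | wc -l | tr -d ' '
}

# Usage (hypothetical directories on the glusterfs mount):
#   time read_all /mnt/glusterfs/one-big-file
#   time read_all /mnt/glusterfs/many-small-files
```

If the many-small-files run takes far longer for the same amount of data, the time is going into per-file work rather than into moving the bytes.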

Hopefully others will chime in with more information, but if you can supply more information about what you are reading, that will help too. Are you reading entire files, or just reading in a lot of "snippets" or what?

Ted Miller
Elkhart, IN, USA



