Re: Read from fastest node only

On Thu, Jul 29, 2021 at 3:16 PM David Cunningham <dcunningham@xxxxxxxxxxxxx> wrote:
Hello,

Thanks for all the replies. I'll try to address each point:

1. "First readable child... Isn't this the first brick in the subvolume"
Does that mean the first brick in the list returned by "gluster volume status"?

2. "I think that you can play a little bit with md-cache"
Unfortunately I don't have access to the RH article. We would be happy to skip the lookups when reading a file, as long as some data is returned and the file read doesn't block or fail. Do you think that md-cache can accomplish this? If so we might invest more in researching it.

3. "In real life, the 'best' node is the one with the highest overall free resources, across CPU, network and disk IO."
My question relates to a slightly different real life, where some of the nodes are at very low latency (e.g. 0.15ms) and some others are further away (e.g. 15ms). What we want to avoid is the 15ms delay to check the nodes further away. The CPU, disk IO, etc. load on each server is going to be insignificant compared to the difference in latency between nodes.

4. "Our latency check is indeed not per file, AFAIK."
If GlusterFS checks the health of the file on each read then I guess the latency to all nodes will be a factor on each read. This is what we're looking for a way to avoid.

5. "I think you mean cluster.choose-local which is enabled by default. Yet, Gluster will check if the local copy is healthy."
What is the local copy exactly? I'm talking from the point of view of a machine which is running the GlusterFS FUSE client, and is not a GlusterFS node.

Local copy means the fuse client is mounted on the same node as that of the brick and the file in question happens to reside on that brick.

First readable child simply means the first healthy brick (i.e. one with no pending heals) as you iterate through a 'for loop' over all bricks, beginning with brick-0. So if you set read-hash-mode to 0 and all bricks are healthy, reads will always be served from brick-0.
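
A minimal sketch of the "first readable child" rule, written in Python rather than Gluster's C. The function name and the `readable` list (one boolean per brick, True meaning no pending heals) are illustrative assumptions, not Gluster's actual API.

```python
from typing import Optional, Sequence


def first_readable_child(readable: Sequence[bool]) -> Optional[int]:
    """Return the index of the first healthy brick, or None if none is readable."""
    # Walk the bricks in order, starting at brick-0.
    for index, healthy in enumerate(readable):
        if healthy:
            return index
    return None


# All bricks healthy: reads go to brick-0.
print(first_readable_child([True, True, True]))
# brick-0 has pending heals: the next healthy brick is used instead.
print(first_readable_child([False, True, True]))
```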

> 4 = brick having the least network ping latency.
Like Yaniv said, the latency check is *not* done per file. When the client connects to the bricks at mount time, the ping time of each brick is captured, and the brick with the lowest value is used thereafter.
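
As a rough illustration of that policy, the sketch below picks the healthy brick with the lowest ping time from a list measured once up front. The function name, the `ping_ms` and `readable` lists, and the sample values are hypothetical; this is a simplified model, not the Gluster implementation.

```python
from typing import Optional, Sequence


def least_latency_child(ping_ms: Sequence[float],
                        readable: Sequence[bool]) -> Optional[int]:
    """Pick the healthy brick with the lowest ping time recorded at mount."""
    # Only bricks without pending heals are candidates.
    candidates = [i for i, ok in enumerate(readable) if ok]
    if not candidates:
        return None
    return min(candidates, key=lambda i: ping_ms[i])


# Hypothetical ping times, echoing the 0.15 ms vs 15 ms case in this thread.
mount_time_ping_ms = [15.0, 0.15, 15.2]

# The nearby brick (index 1) wins while it is healthy.
print(least_latency_child(mount_time_ping_ms, [True, True, True]))
# If the nearby brick needs heal, a remote brick is used despite the latency.
print(least_latency_child(mount_time_ping_ms, [True, False, True]))
```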

If you have already identified that a particular brick has the lowest latency when communicating with the client, you can also pin reads to that brick using the `read-subvolume-index` option. But bear in mind that all these policies are global and will affect all clients. Also, AFR-based replication is synchronous, not an eventual-consistency model. So your original requirement of reading older data won't work: a read will never be served from a brick that needs heal, even if it is the fastest.
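
To illustrate the last point, here is a small sketch in which a pinned brick index is honoured only when that brick is healthy. The names are hypothetical, and the fallback to the first healthy brick is an assumption made for the example; the thread does not spell out what happens when the pinned brick needs heal.

```python
from typing import Optional, Sequence


def select_read_child(readable: Sequence[bool],
                      pinned_index: Optional[int] = None) -> Optional[int]:
    """Prefer the pinned brick, but never read from a brick that needs heal."""
    if (pinned_index is not None
            and 0 <= pinned_index < len(readable)
            and readable[pinned_index]):
        return pinned_index
    # Assumption for this sketch: fall back to the first healthy brick.
    for index, healthy in enumerate(readable):
        if healthy:
            return index
    return None


# The pinned brick is healthy, so it serves the read.
print(select_read_child([True, True, True], pinned_index=2))
# The pinned brick needs heal, so it is skipped even though it was preferred.
print(select_read_child([True, True, False], pinned_index=2))
```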

You can look at afr_read_subvol_select_by_policy() in the code if you want to get a glimpse of the internals.

Hope that helps,
Ravi

 

Thanks again for the input.


On Wed, 28 Jul 2021 at 23:36, Gionatan Danti <g.danti@xxxxxxxxxx> wrote:
Il 2021-07-28 13:11 Strahil Nikolov ha scritto:
> I think you mean cluster.choose-local which is enabled by default.
> Yet, Gluster will check if the local copy is healthy.

Ah, ok, from reading here [1] I was under the impression that
cluster.choose-local was somewhat deprecated.
Good to know that it is here to stay!
Regards.

[1]
https://lists.gluster.org/pipermail/gluster-users/2015-June/022288.html


--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@xxxxxxxxxx - info@xxxxxxxxxx
GPG public key ID: FF5F32A8


--
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782
________



Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
