Re: Latency analysis of GlusterFS' network layer for pgbench

On Mon, Dec 24, 2018 at 6:05 PM Raghavendra Gowdappa <rgowdapp@xxxxxxxxxx> wrote:


On Mon, Dec 24, 2018 at 3:40 PM Sankarshan Mukhopadhyay <sankarshan.mukhopadhyay@xxxxxxxxx> wrote:
[pulling the conclusions up to enable better in-line]

> Conclusions:
>
> We should never have a volume with caching-related xlators disabled. The price we pay for it is too high. We need to make them work consistently and aggressively to avoid as many requests as we can.

Are there current issues in terms of behavior which are known/observed
when these are enabled?

We did have issues with pgbench in the past, but they have been fixed. Please refer to bz [1] for details. On 5.1, it runs successfully with all caching-related xlators enabled. Having said that, the only performance xlators that improved performance were open-behind and write-behind [2] (write-behind had some issues, which will be fixed by [3]; we'll have to measure performance again once the fix for [3] is in). For some reason, read-side caching didn't improve transactions per second.

One possible reason read caching in glusterfs didn't show increased performance is that VFS already provides read-ahead (of 128KB) and a page cache. It could be that whatever performance boost caching can provide is already obtained at the VFS page cache, making glusterfs caching redundant. I'll run some tests to gather evidence to (dis)prove this hypothesis.
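
A minimal sketch of how such a test could look, not the actual test plan: it reads a file in 8 KiB blocks (PostgreSQL's default block size) and returns the elapsed time, so a read after asking the kernel to drop the file's cached pages can be compared with a repeated (warm) read. The helper names `drop_page_cache_hint` and `timed_read` are hypothetical, the eviction request is only advisory, and how a FUSE mount interacts with the page cache depends on mount options, so the numbers would need careful interpretation.

```python
import os
import time


def drop_page_cache_hint(path):
    """Ask the kernel to evict cached pages of `path`.

    Advisory only: the kernel may ignore it, and dirty pages are not dropped.
    Returns False where os.posix_fadvise is unavailable.
    """
    if not hasattr(os, "posix_fadvise"):
        return False
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset=0, len=0 means "the whole file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return True


def timed_read(path, block_size=8192):
    """Read `path` sequentially; return (bytes_read, elapsed_ns)."""
    total = 0
    start = time.perf_counter_ns()
    # buffering=0 avoids Python's own buffer hiding the per-block reads.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter_ns() - start
```

Running `drop_page_cache_hint(path)` followed by two calls to `timed_read(path)`, once with the glusterfs read caches enabled and once with them disabled, would give a rough idea of how much the second layer of caching adds on top of the page cache.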

I am currently working on this problem. Note that these bugs measure the transaction phase of pgbench, whereas what Xavi measured in his mail is the init phase. Nevertheless, evaluation of read caching (metadata/data) will still be relevant for the init phase too.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1512691
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1629589#c4
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1648781


> We need to analyze client/server xlators deeper to see if we can avoid some delays. However optimizing something that is already at the microsecond level can be very hard.

That is true - are there any significant gains which can be accrued by
putting efforts here or, should this be a lower priority?

The problem identified by Xavi is also the one we (Manoj, Krutika, Milind and I) had encountered in the past [4]. The solution we used was to have multiple rpc connections between a single brick and the client, and it did fix the bottleneck. So there is definitely work involved here: either fix the single-connection model or go with a multiple-connection model. It's preferable to improve the single connection and resort to multiple connections only if the bottlenecks in the single connection are not fixable. Personally, I think this is high priority, along with having appropriate client-side caching.

[4] https://bugzilla.redhat.com/show_bug.cgi?id=1467614#c52


> We need to determine what causes the fluctuations on the brick side and avoid them.
> This scenario is very similar to a smallfile/metadata workload, so this is probably one important cause of its bad performance.

What kind of instrumentation is required to enable the determination?

On Fri, Dec 21, 2018 at 1:48 PM Xavi Hernandez <xhernandez@xxxxxxxxxx> wrote:
>
> Hi,
>
> I've done some tracing of the latency that the network layer introduces in gluster. I did the analysis as part of the pgbench performance issue (in particular the initialization and scaling phase), so I decided to look at READV for this particular workload, but I think the results can be extrapolated to other operations that also have small latency (cached data from FS, for example).
>
> Note that measuring latencies itself introduces some latency. Each probe point consists of a call to clock_get_time(), so the real latency will be a bit lower, but still proportional to these numbers.
>
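
To illustrate the point about probe overhead, here is a small sketch, assuming Python's `time.perf_counter_ns()` as a stand-in for the clock call used in the trace. It estimates the average cost of one clock read and subtracts it from a measured interval. The function names and the `probes_crossed` parameter are hypothetical, and the estimate includes Python loop overhead, so it only shows the idea, not the cost of the probes in gluster itself.

```python
import time


def probe_overhead_ns(samples=100_000):
    """Estimate the average cost of one clock read, in nanoseconds.

    Times `samples` back-to-back clock reads; the result also includes
    the interpreter's loop overhead, so treat it as an upper bound.
    """
    start = time.perf_counter_ns()
    for _ in range(samples):
        time.perf_counter_ns()
    end = time.perf_counter_ns()
    return (end - start) / samples


def corrected_latency_ns(measured_ns, probes_crossed, overhead_ns):
    """Subtract the cost of the probes that fired inside a measured interval."""
    return max(0.0, measured_ns - probes_crossed * overhead_ns)
```

For example, an interval measured at 1000 ns that crossed 4 probe points costing 50 ns each would be corrected to 800 ns.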

[snip]
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-devel