Anand Avati wrote:
>
> > I'm using io-threads on the client side. At this time, only one
> > client accesses a storage brick at a time (on a given server), so I
> > thought io-threads won't help there. But on the client side, waiting
> > for a read in one thread shouldn't block the whole client (because
> > there can be more threads), so I loaded io-threads on the client.
>
> The client does not block for any operation and is completely
> asynchronous. The synchronous parts in glusterfs are network writes in
> tcp (we are working on a non-blocking socket model design) and posix
> (linux-aio translator is on the way). So io-threads would help for
> reads on the server side alone, since tcp writes of read requests are
> not really heavy.

So am I right in thinking that io-threads helps only when it is loaded
on the server side, just above the storage/posix brick, and only when
several clients access the server in parallel?

But from the application's point of view, reads are synchronous in the
sense that a read() blocks until the corresponding block is fetched from
the server. And this is why we can't utilize the full bandwidth; but
that is outside the scope of GlusterFS, it is a kernel "feature". We
cannot (and do not) blame GlusterFS for this. Just thinking. Am I right?
(I am trying to find a reason for GlusterFS not being able to saturate
the link.)

Is this true? If not, why do I get a throughput of 540 MB/s and not 900
or 1000? The interconnect speed is far above this; see below.

> > > When you copy, are the source and destination files both on the
> > > glusterfs mount point?
> >
> > No, since we are testing pure read performance, and only GlusterFS
> > performance. So I copy sparse files over GlusterFS into /dev/null
> > with `dd bs=1M`.
> >
> > Currently, a single thread has a read performance of about 540-580
> > MB/s. What I would like to see is two threads reading two files from
> > two servers, with a performance of at least 540 MB/s *each*.
>
> Are these two dd's from the same machine? If so, I suspect that fuse
> is becoming a bottleneck. What is your interconnect? 10 GigE?

Yes, they are from the same host, so they use one GlusterFS client. My
interconnect is 20 Gbps ib-verbs (of which we can utilize about 11 Gbps,
due to architectural limitations, I guess).

> You might be interested in the new booster module which is on the way,
> which will short-cut the IO path directly to the server from the
> application's address space (via LD_PRELOAD). You might also want to
> try the next few TLA commits which are on the way, which will help
> parallel file reads going via fuse.

I have tried the booster module, and it gives the same performance.
Please let me know if there are new commits that may help.

Thanks in advance,
--
cc
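
P.S. For the archives, this is roughly the server-side arrangement I
mean, with io-threads loaded directly above the storage/posix brick. It
is only a sketch in the old volfile style; the export path and thread
count are made-up values, and option names may differ between releases:

    # server-side volfile fragment (hypothetical paths and values)
    volume brick
      type storage/posix
      option directory /data/export    # assumed export directory
    end-volume

    volume iothreads
      type performance/io-threads
      option thread-count 8            # tune for the number of concurrent clients
      subvolumes brick
    end-volume

    # protocol/server would then export "iothreads" instead of "brick"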
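
And the parallel read test I describe above is nothing more than two
concurrent dd's through the same client mount point (file names and the
mount point below are of course just placeholders):

    # two sequential reads running in parallel over the glusterfs mount
    dd if=/mnt/glusterfs/file1 of=/dev/null bs=1M &
    dd if=/mnt/glusterfs/file2 of=/dev/null bs=1M &
    wait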