Configuration suggestions (aka poor/slow performance on new hardware)

raghavendra at gluster.com (Raghavendra G) · Thu, 1 Apr 2010 07:48:25 +0400

On Wed, Mar 31, 2010 at 1:40 PM, Ed W <lists at wildgooses.com> wrote:

> On 31/03/2010 06:14, Tom Lanyon wrote:
>
>> On 31/03/2010, at 2:36 PM, Raghavendra G wrote:
>>
>>
>>
>>> Current design of write-behind acknowledges writes (to applications) even
>>> when they've not hit the disk. Can you please explain how this design is
>>> different (if it is different) from the idea you've explained above?
>>>
>>>
>> Is this gluster method of write-behind acknowledging the writes before
>> they've left the client? The method Ed was describing is that the write is
>> acknowledged only once it's reached the server (and a defined number of
>> replication targets), even though it hasn't been written to disk on
>> the server yet. This is a hybrid approach which safeguards against client
>> power failure before the write (which has already been acknowledged) gets
>> pushed to any servers, but improves performance over end-to-end
>> write-through as it does not wait for the write acknowledgement from the
>> physical disk(s).
>>
>>
>>
>
>
> Agreed.  So assuming, say, one client talking over the network to 100 server
> replicas (absurd for the purposes of clarification)
>
> Our safety levels are:
>
> 1) ACK sent as soon as app sends data to the client OS and before it's even
> left the client machine. Complete data loss possible if the client is
> unplugged/dies at that instant. (weak / fast)
>

This functionality can be achieved in glusterfs by loading the write-behind
translator on the client.
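
As a rough sketch only (not a tested configuration), a client volfile along
these lines would load it. The host name, the volume names and the cache-size
value are placeholders chosen for illustration:

```
# Client side: write-behind sits on top of the protocol/client volume,
# so writes are acknowledged to the application before they leave the client.
volume remote1
  type protocol/client
  option transport-type tcp
  option remote-host server1          # placeholder host name
  option remote-subvolume brick       # must match the volume exported by the server
end-volume

volume writebehind
  type performance/write-behind
  option cache-size 1MB               # illustrative value only
  subvolumes remote1
end-volume
```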

>
> 2) ACK sent only once data sent to all 100 replicas AND data written to
> disk. Data loss only possible if all replicas are lost. (strong / slowest)
>

write-behind is not needed in this case.

>
> 3) ACK sent once X server machines have received the request (to RAM).
>  Data loss possible if all server machines lost before they write the
> request to disk. Good compromise of speed vs reliability guarantees
>

This functionality can be achieved by loading the write-behind translator on
the server side (on top of the posix translator).
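
Again as a rough, untested sketch, a server volfile could look something like
the following. The export directory, the volume names and the wide-open auth
rule are placeholders for illustration, and no write-behind would be loaded on
the client in this setup:

```
# Server side: write-behind sits directly on top of storage/posix, so the
# server acknowledges a write once it holds the data in memory, before it
# has been written to disk.
volume posix
  type storage/posix
  option directory /data/export       # placeholder export directory
end-volume

volume brick
  type performance/write-behind
  subvolumes posix
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.brick.allow *      # placeholder; restrict in real use
  subvolumes brick
end-volume
```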

>
>
> In the simplest situation of a single server then we have roughly achieved
> the effect of moving the writeback cache to the serverside.  In the case of
> multiple servers with exactly equal latency to the client then we have
> roughly achieved the same as moving writeback cache to serverside on all
> servers.  In the case of non equal latency between client and server, or
> with server side replication, or with very busy servers then we gain a
> performance improvement due to the lower latency before the ACK is sent to
> the client.
>
> I thought this was a very clever technique and actually very compatible
> with the gluster philosophy (independent bricks)
>
> Ed W
>

regards,
-- 
Raghavendra G