This sounds remarkably similar to how the DLM (Distributed Lock Manager)
in GFS works. It caches file locks, so performance is reasonable when a
set of files is only accessed from one of the nodes. Might it be easier
to interface with DLM for lock management instead of implementing such a
thing from scratch?
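For anyone who hasn't seen the idea before, here is a minimal sketch of
what lock caching buys you. This is illustrative Python of my own, with
made-up names - it is not the libdlm API:

# Illustrative sketch of DLM-style lock caching (made-up names,
# NOT the real libdlm API). Once a node has been granted a lock
# over the network it keeps the grant cached; re-acquisitions are
# local until another node asks for the same lock.

class CachedLockClient:
    def __init__(self, node_id, manager):
        self.node_id = node_id
        self.manager = manager   # stands in for the real lock traffic
        self.cached = set()      # lock names this node holds cached

    def acquire(self, name):
        if name in self.cached:
            return "local"       # cache hit: no network round trip
        self.manager.grant(name, self)  # cache miss: one round trip
        self.cached.add(name)
        return "remote"

class LockManager:
    # Toy central manager; the real DLM distributes this state.
    def __init__(self):
        self.holders = {}

    def grant(self, name, client):
        old = self.holders.get(name)
        if old is not None and old is not client:
            old.cached.discard(name)   # contention: revoke cached grant
        self.holders[name] = client

mgr = LockManager()
a, b = CachedLockClient("A", mgr), CachedLockClient("B", mgr)
print(a.acquire("foo"), a.acquire("foo"))  # remote local
print(b.acquire("foo"), a.acquire("foo"))  # remote remote (revoked)

The point is that second and subsequent acquisitions by the same node
cost no network traffic at all; only contention from another node forces
a round trip.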
The bulk of the performance hit comes from ping times, rather than
bandwidth issues / writeback caching. Latencies on a LAN are typically
100us on Gb ethernet, vs. a typical RAM latency of 50ns, so call it a
2000x difference. If this overhead on file lock acquisition can be
avoided, it'll make a lot more difference than data caching.
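To put rough numbers on that (a back-of-envelope estimate using the same
figures, nothing measured):

# Back-of-envelope cost of per-file lock round trips, using the
# figures above (illustrative numbers, not measurements).
LAN_RTT = 100e-6      # ~100us round trip on Gb ethernet
RAM_LATENCY = 50e-9   # ~50ns for a lock already cached locally
N_LOCKS = 10_000      # e.g. a busy server opening 10k files

remote = N_LOCKS * LAN_RTT
local = N_LOCKS * RAM_LATENCY
print("remote: %.0f ms, local: %.1f ms, ratio: %.0fx"
      % (remote * 1000, local * 1000, remote / local))
# remote: 1000 ms, local: 0.5 ms, ratio: 2000x

That is a full second of dead time per 10,000 lock acquisitions, which
no amount of bandwidth or data caching will get back.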
Gordan
Ed W wrote:
On 24/09/2010 05:10, Craig Carl wrote:
Ed -
If I understand correctly, it looks like you are recommending a method
for implementing an asynchronous replication solution as a possible
alternative to the current synchronous replication method?
I think there are two main use cases which benefit:
1) Master/Master, especially where the client is itself one of the
bricks. E.g. recently there have been several threads on poor performance
using gluster as the backing store for a web server. Here a common
situation might be that we have a backing store holding, say, two web
applications; each frontend server generally serves only one of the two
applications, so we want to avoid network accesses in the common case
where files have an affinity for being used by just one of the servers.
2) Achieving effectively the benefit of a large writeback cache, yet
without compromising coherency, in the face of larger RTTs between
bricks. This could be anything from a 100Mbit IP link between heavily
accessed servers to a WAN.
Optimistic locking is basically a way of optimising for the case where a
single brick at a time tends to access a subset of files. It does
absolutely nothing for the situation where more than one brick is
competing for access to the same file (I think it's obvious that the
latter situation is hard to improve anyway).
So really optimistic locking is a performance improvement in any
situation where:
- One server accesses a file more than once in a row, before any other
server requests access (it doesn't matter whether it's a read or a write)
- The above also implies that we get maximum benefit where there are
relatively large RTTs between servers (this would include even gigabit
for the case of a heavily used server, though)
- We can also infer that this optimisation benefits us most if we can
tweak our applications to have some kind of affinity, preferring a given
server for a given subset of files (often this is very easily done for a
whole class of applications: e.g. web servers point A records at specific
servers, mail servers trivially route users to their preferred storage
server, geographic clustering tends to take care of itself if the client
isn't in a rocket ship, etc.) - see the toy simulation below.
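To make the affinity point concrete, here is a toy count of network
round trips under a lock-caching scheme of the kind described. These are
hypothetical access traces of my own, not gluster code:

# Toy count of lock round trips under optimistic lock caching
# (hypothetical traces; illustrates the bullets above, not gluster
# internals).

def count_round_trips(trace):
    # trace: sequence of (server, filename) accesses
    holder = {}    # filename -> server currently caching its lock
    trips = 0
    for server, fname in trace:
        if holder.get(fname) != server:
            trips += 1        # miss or revocation: network round trip
            holder[fname] = server
    return trips

# Affinity: each frontend sticks to its own application's files.
affinity = [("A", "app1.cfg")] * 100 + [("B", "app2.cfg")] * 100
# Ping-pong: two servers alternate on the same file.
pingpong = [("A", "shared.db"), ("B", "shared.db")] * 100

print(count_round_trips(affinity))   # 2   (one miss per file)
print(count_round_trips(pingpong))   # 200 (every access pays the RTT)

With affinity the lock cost is amortised to effectively zero; without
it, every access pays the full RTT, which is exactly why the contended
case is hard to improve.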
OK, so that's "optimistic locking" and the reason why it would be nice
to have it. Traditionally this is done using a shared lock server (a
single point of failure). However, my suggestion was to read up on the
algorithms in the publications list, which show how it's possible to
implement a fault-tolerant, shared-nothing lock server (cool!). Then we
would have a lock server in the style of Gluster, where there is no
single point of failure!
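A full Paxos implementation is obviously beyond a mailing list post, but
the quorum idea that removes the single point of failure fits in a few
lines. This is a deliberately naive sketch of my own (made-up names; it
ignores lock release, replica recovery and duelling proposers, which is
what the algorithms in the papers actually solve):

# Naive majority-quorum lock sketch (NOT full Paxos): a lock is held
# once a majority of lock-server replicas grant it, so the service
# survives the failure of any minority of replicas.

class Replica:
    def __init__(self):
        self.holder = {}    # lock name -> owner granted at this replica
        self.alive = True

    def try_grant(self, name, owner):
        if not self.alive:
            return False    # failed replica: no vote
        if self.holder.get(name, owner) != owner:
            return False    # granted to someone else here
        self.holder[name] = owner
        return True

def acquire(replicas, name, owner):
    votes = sum(r.try_grant(name, owner) for r in replicas)
    return votes > len(replicas) // 2   # majority => lock acquired

replicas = [Replica() for _ in range(5)]
replicas[0].alive = False   # one replica down: service still available
print(acquire(replicas, "inode:42", "brick-A"))   # True  (4 of 5 grant)
print(acquire(replicas, "inode:42", "brick-B"))   # False (no majority left)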
So I think really it's two feature requests:
1) Can you please implement optimistic locking optimisations using a
lock server
2) Can you please make the lock server fault-tolerant, fully
distributed and shared-nothing, e.g. using a Paxos derivative
Cheers
Ed W