Re: Client side AFR race conditions?

Kevan Benson <kbenson@xxxxxxxxxxxxxxx> · Tue, 06 May 2008 14:47:34 -0700

Derek Price wrote:
Kevan Benson wrote:

I'm not saying I don't want to see a more robust solution for client 
side AFR, just that each configuration has its place, and client side 
AFR isn't currently (and may never be) capable of serving a share that 
requires high data integrity.

If you think fixing this current issue will solve your problems, maybe 
you haven't considered the implications of connectivity problems 
between some clients and some (not all) servers...  Add in some 
clients with slightly off timestamps and you might have some major 
problems WITHOUT any reboots.

Am I getting this straight?  Even with server-side AFR, you get mirrors, 
but if all the clients aren't talking to the same server then there is 
no forced synchronization going on?  How hard would it be to implement 
some sort of synchronization/locking layer over AFR such that reads and 
writes could still go to the nearest (read: fastest) possible server yet 
still be guaranteed to be in sync?

Server side AFR should be susceptible to the same problems as client 
side AFR if clients can use arbitrary servers.  E.g. Client A writes to 
Server A for file X at the same time Client B writes to Server B for 
file X.  Server A and B are essentially "clients" to the AFR, so the 
same race condition should exist.  It may even be exacerbated by the 
speed difference between local and remote AFR sub-volumes.
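
To make the race concrete, here is a toy model of it in Python.  It is 
not GlusterFS code; the Server class and replicate() are made up for 
illustration, and it assumes each server applies a write locally before 
forwarding it to its peer, with nothing ordering the two writes.

```python
# Hypothetical model, not GlusterFS code: each server applies a client's
# write to its local copy first, then forwards that write to its peer.

class Server:
    def __init__(self, name):
        self.name = name
        self.files = {}

    def local_write(self, path, data):
        self.files[path] = data


def replicate(data, dst, path):
    # Forward a write to the other mirror; last writer wins on each side.
    dst.files[path] = data


a, b = Server("A"), Server("B")

# Client A writes to Server A while Client B writes to Server B.
a.local_write("X", "from client A")
b.local_write("X", "from client B")

# Each server now forwards the write it received to the other mirror.
replicate("from client A", b, "X")
replicate("from client B", a, "X")

# The mirrors have swapped contents and no longer agree.
diverged = a.files["X"] != b.files["X"]
print(a.files["X"], "|", b.files["X"], "| diverged:", diverged)
```

With that interleaving each mirror ends up holding the other client's 
data, and neither side has any way to tell which copy is "right".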

In other words, the majority of servers would know of new version 
numbers being written anywhere and yet reads would always serve local 
copies (potentially after waiting for synchronization).  The application 
I'm thinking of is virtualized read/write storage.  For example, say you 
want to share some sort of data repository with offices in Europe, 
India, and the U.S. and you only have slow links connecting the various 
offices.  You would want all client access to happen against a local 
mirror, and you would want to restrict traffic between the mirrors to 
that absolutely required for locking and data synchronization.

The only thing I'm not quite sure of in this model is what to do if the 
server processing a write operation crashes before the write finishes. I 
wouldn't want reads against the other mirrors to have to wait 
indefinitely for the crashed server to return, so the best I can come up 
with is that "write locks" for any files that hadn't been mirrored to at 
least one available server before a crash would need to be revoked on 
the first subsequent attempted access of the unsynchronized file.  Then 
when the crashed server came back up and tried to synchronize, it would 
find that its file wasn't the current version and sync in the other 
direction.
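
A rough sketch of that recovery rule, to check that it hangs together.  
Everything here is hypothetical (the Mirror record, the locks table, 
access() and resync()), and it adds one assumption the description 
above does not state: revoking a dead server's write lock bumps the 
version on the surviving mirror, so the crashed server's copy always 
looks stale when it returns.

```python
from dataclasses import dataclass


@dataclass
class Mirror:
    name: str
    version: int = 0
    data: str = ""
    up: bool = True


locks = {}  # path -> Mirror currently holding the write lock


def access(path, mirror):
    """Read via `mirror`; revoke a write lock whose holder has crashed."""
    holder = locks.get(path)
    if holder is not None and not holder.up:
        del locks[path]
        # Assumption: revocation bumps the survivor's version so the
        # crashed holder's copy is recognizably out of date later.
        mirror.version += 1
    return mirror.data


def resync(returning, peer):
    """On restart, the lower-versioned copy is overwritten from the peer."""
    returning.up = True
    if returning.version < peer.version:
        returning.data, returning.version = peer.data, peer.version


a = Mirror("A", version=1, data="old")
b = Mirror("B", version=1, data="old")

locks["X"] = a            # A takes the write lock for file X
a.data = "half-written"   # write not yet mirrored anywhere
a.up = False              # A crashes mid-write

seen = access("X", b)     # first access via B revokes A's lock
resync(a, b)              # A returns and syncs from B
```

Note that the unmirrored write on the crashed server is thrown away in 
this model; that is the price of not making readers wait.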

I would think a specialized translator would work great for this. 
Something optimized for the server, where it intercepts writes and 
creates binary diffs for syncing instead of copying the whole file.  In 
essence, trade computing power for bandwidth.  That doesn't help right 
now though, and it doesn't address locking.
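
As an illustration of the trade, here is a minimal delta sketch using 
Python's standard difflib.  It is not an existing translator and not 
the rsync algorithm; make_delta() and apply_delta() are names invented 
for this example, and it assumes the receiving side still has the old 
copy of the file.

```python
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes):
    """Describe `new` as copies from `old` plus literal bytes."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))      # receiver already has these
        elif j2 > j1:
            ops.append(("data", new[j1:j2]))  # replaced or inserted bytes
        # a pure "delete" needs nothing sent
    return ops


def apply_delta(old: bytes, ops) -> bytes:
    """Rebuild the new file on the receiving mirror."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


old = b"header:" + bytes(range(256)) + b":trailer"
new = old[:100] + b"EDIT" + old[104:]

delta = make_delta(old, new)
literal_bytes = sum(len(op[1]) for op in delta if op[0] == "data")
rebuilt = apply_delta(old, delta)
```

Only the literal bytes (plus a few offsets) would have to cross the 
slow link; the matching itself is where the CPU time goes.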

The only way I see to ensure data integrity is to have some arbiter vet 
all writes.  You can try to make that arbiter redundant, but good luck 
making it actually distributed.
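
For what I mean by an arbiter, a minimal single-process sketch follows.  
The Arbiter class and its vet() method are hypothetical, not anything 
in GlusterFS; the idea assumed here is that a write is accepted only if 
it was based on the version the arbiter currently considers latest.

```python
import threading


class Arbiter:
    """Single point that orders writes per file (hypothetical sketch)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._versions = {}  # path -> latest accepted version

    def vet(self, path, base_version):
        """Return the new version if the write is accepted, else None."""
        with self._lock:
            current = self._versions.get(path, 0)
            if base_version != current:
                return None  # writer was working from a stale copy
            self._versions[path] = current + 1
            return current + 1


arbiter = Arbiter()

# Client A and Client B both try to write file X based on version 0.
first = arbiter.vet("X", 0)    # accepted, X is now at version 1
second = arbiter.vet("X", 0)   # rejected, based on a stale version
retry = arbiter.vet("X", 1)    # accepted after re-reading version 1
```

The lock makes this safe within one process, which is exactly the 
problem: everything funnels through one place, and replicating that 
place without reintroducing the race is the hard part.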

--

-Kevan Benson
-A-1 Networks