Re: AOE client behavior on server restart

Ed Cashin <ed.cashin@xxxxxxx> · Tue, 28 Mar 2017 20:36:49 -0400

Hi, Nivin Lawrence.  Answers appear between selected quotes below.

On 03/27/2017 11:12 AM, Nivin Lawrence wrote:
> Hi experts,
>     I am trying to understand the implications of any network drops
> and AOE/vblade server restart.
>
> 1) Is it expected to not see any loss in the Ethernet network where the
> client/server resides or is the protocol capable of handling drops in
> the network?

For storage, the "target" is the place where the data is persisted, and 
the "initiator" is the software that wants to read and write the data. 
I think you mean initiator by client, but I'll keep using the 
storage-oriented terms, because they match the documentation.

No, Ethernet is expected to drop packets, and the initiator is expected 
to retransmit commands.  The tag field in the AoE header identifies a 
command, and the initiator gets to pick it.  The target copies that tag 
into the response to the command, so that the initiator can match the 
response to the waiting request.
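
The tag matching described above can be sketched as a toy model.  This is
not the Linux aoe driver's code: the class and function names are made up
for illustration, and the sketch reuses the same tag when retransmitting
purely for simplicity (the initiator picks the tags, so it is free to do
otherwise).

```python
import itertools


class Initiator:
    """Toy model of tag-based request/response matching (hypothetical names)."""

    def __init__(self):
        self._tags = itertools.count(1)  # the initiator gets to pick the tags
        self.pending = {}                # tag -> command awaiting a response

    def send(self, command):
        tag = next(self._tags)
        self.pending[tag] = command
        return {"tag": tag, "command": command}  # frame put on the wire

    def retransmit(self, tag):
        # Assumption for this sketch: resend the same command under the same tag.
        return {"tag": tag, "command": self.pending[tag]}

    def receive(self, response):
        # A response whose tag is not pending (e.g. a duplicate) is ignored.
        return self.pending.pop(response["tag"], None)


def target_reply(frame):
    """Toy target: copies the request's tag into the response."""
    return {"tag": frame["tag"], "status": "ok"}


ini = Initiator()
frame = ini.send("read lba 0")
# Pretend the first frame was dropped by the network, so the command is resent.
again = ini.retransmit(frame["tag"])
matched = ini.receive(target_reply(again))    # matches the waiting request
duplicate = ini.receive(target_reply(again))  # nothing is waiting any more
print(matched, duplicate)
```

The second response finds no waiting request, so the toy initiator drops it.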

> 2) If the server restarts and remains unavailable for say 5 seconds,
> what happens to the clients?

That's up to the initiator.  For example, the aoe driver in Linux has a 
module parameter, aoe_deadsecs, that controls this behavior.  If it's 
set to 4, then it will fail all the AoE commands (reads and writes) that 
have been waiting for a response for 4 seconds.

But the default is much higher: 180 seconds (three minutes).

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/block/aoe/aoecmd.c#n29
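
As a rough illustration of that timeout, here is a small model of "fail
whatever has waited at least aoe_deadsecs".  It is a sketch, not the
driver's logic: the function name, the dictionary layout, and the exact
comparison are assumptions made for the example.

```python
AOE_DEADSECS_DEFAULT = 60 * 3  # three minutes, the default mentioned above


def split_expired(pending, now, deadsecs=AOE_DEADSECS_DEFAULT):
    """Split pending commands into (failed, still_waiting).

    pending maps a tag to the time (in seconds) the command was first sent.
    Hypothetical helper for illustration only.
    """
    failed = {t: s for t, s in pending.items() if now - s >= deadsecs}
    waiting = {t: s for t, s in pending.items() if now - s < deadsecs}
    return failed, waiting


pending = {1: 0.0, 2: 3.0}

# With a 4-second limit, the command sent at t=0 has waited too long by t=5.
failed, waiting = split_expired(pending, now=5.0, deadsecs=4)

# With the default limit, a 5-second outage fails nothing.
failed_default, waiting_default = split_expired(pending, now=5.0)
print(sorted(failed), sorted(waiting), sorted(failed_default))
```

Under the default setting, a target that is gone for 5 seconds just looks
like a slow response to this model.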

...
>      - It would be good to understand if there is an option where the
> client won't need to do a mount again once the server becomes available
> and if the transactions could continue just as if there was a network
> connectivity issue for 5 seconds.

From the initiator's perspective, that's already the case: Some people 
plug the initiator and target into a new network switch without even 
unmounting.  That's not recommended, but it has happened a lot.

Now there's also the issue of the target.  Nothing I've said addresses 
the possibility of write caching.  Without write caching, there is no 
problem with the target power cycling or, as you mentioned, the server 
restarting.  But if the server is running a vblade using regular I/O, 
you can't just restart that system.  You'd have to make sure that 
incoming AoE commands stop, all write operations finish, buffers are 
synced from RAM to disk, and *then* the server restarts.

When AoE resumes, some of the commands may be performed a second time: 
the ones that were carried out without the response reaching the 
initiator.  Usually that will mean the same data gets written to the 
same place as before.
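
A tiny sketch of why that repeat is usually harmless, using a dictionary
as a stand-in for a disk (the names are hypothetical, and this covers only
the simple case where nothing else wrote to that block in between):

```python
def apply_write(disk, lba, data):
    """Toy target-side write: store data at a logical block address."""
    disk[lba] = data
    return disk


disk = {}
apply_write(disk, 7, b"new data")   # performed, but the response was lost
after_first = dict(disk)

apply_write(disk, 7, b"new data")   # the same command performed again
after_replay = dict(disk)

print(after_first == after_replay)
```

Writing the same data to the same place twice leaves the stand-in disk in
the same state as writing it once.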

There are ways to avoid write caching on the storage target, but they 
often entail a significant performance hit.  There are ways to make the 
whole write cache path keep the data safe in case of power interruption, 
but the ones I'm aware of use special hardware.

-- 
   Ed

_______________________________________________
Aoetools-discuss mailing list
Aoetools-discuss@xxxxxxxxxxxxxxxxxxxxx
https://lists.sourceforge.net/lists/listinfo/aoetools-discuss