Hi. I have some comments between quotes below.
On 08/24/2017 01:18 AM, Nivin Lawrence wrote:
...
The system can be considered as one with dual CPU complexes, but a single
SSD. There are 2 operating systems, OS-1 and OS-2, one running on each
CPU complex. OS-1 boots first, takes control of the SSD and starts
the target. OS-2, which boots up later, uses the initiator to connect to
the target over an interface connecting the 2 CPU complexes. There is no
switch in between; it's a back-to-back Ethernet network over which AoE is
being used by OS-2 for all disk accesses.
Now, at some point in time, OS-1 has to be shut down, which means the
target is not available anymore. OS-2 will now start the target on the
loopback interface using "vbladed 0 0 lo /dev/sda".
In general, when you have two different hosts using a shared read-write
resource like this, you have to prevent a split-brain situation by using
fencing. I think that if you're sure that OS-1 only ever uses the SSD
at the request of OS-2 and you have fully shut down the whole path from
the user on OS-2 through the storage layers on OS-1 before OS-2 begins
using the SSD directly, it's OK, but you'd have to use two-phase commit
or something to be sure it's always going to be 100% safe. (Otherwise
you run the risk of one of them booting during a communication failure
and using the SSD inappropriately while the other is using it.)
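
To make the ordering concrete, here is a minimal Python sketch of a
two-step handoff in the spirit of the paragraph above. It is not a full
two-phase commit and it is not real fencing: the Owner class, its method
names and the start_local_target callback are all hypothetical stand-ins
for whatever control channel exists between OS-1 and OS-2. The only point
is that OS-2 starts its own target after OS-1 has confirmed both steps,
and treats any failure (including a communication failure) as "do not
take over".

```python
from enum import Enum, auto


class State(Enum):
    SERVING = auto()    # owner is exporting the SSD
    QUIESCED = auto()   # owner has stopped its target and flushed
    RELEASED = auto()   # owner has promised not to touch the SSD again


class Owner:
    """Hypothetical stand-in for the side that currently owns the SSD (OS-1)."""

    def __init__(self):
        self.state = State.SERVING

    def prepare_release(self):
        # Step 1: stop accepting I/O and flush outstanding writes.
        if self.state is not State.SERVING:
            return False
        self.state = State.QUIESCED
        return True

    def commit_release(self):
        # Step 2: give up the device for good.
        if self.state is not State.QUIESCED:
            return False
        self.state = State.RELEASED
        return True


def take_over(owner, start_local_target):
    """Start the local target only after both steps were acknowledged."""
    try:
        if not owner.prepare_release():
            return False
        if not owner.commit_release():
            return False
    except ConnectionError:
        # Assumption: the control channel reports failures this way.
        # Without a positive answer we must not touch the SSD.
        return False
    start_local_target()
    return True
```

In a real system start_local_target would be whatever launches the
loopback target on OS-2, and the two calls on Owner would be messages
over the link between the CPU complexes.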
More below.
As this system is going to
run for a long time, I want to understand the overhead in
disk access from the POV of applications running on OS-2. I want to
compare the AoE-based access mechanism in the above scenario with the
case where OS-2 applications are directly accessing the SSD.
Now, all the queries below are for this case where OS-2 has the
initiator which tries to connect to the target which is also running in
OS-2, no network in between. Not sure if your previous responses would
still be valid for this specific case as for example MTU might not be a
consideration from the actual data transfer POV.
a. Wanted to understand if the AoE model has specific optimizations
done to make this as close as possible to a regular local disk access.
A lot of design decisions were made with performance in mind,
but the loopback-only network scenario was never the primary
deployment environment. You should try it and see, though.
Performance is always best measured and analyzed. Sometimes there will
be one or two bottlenecks that are easy to identify and eliminate.
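
As a starting point for that kind of measurement, here is a minimal
Python sketch that times sequential reads from a path. The function name
and its defaults are my own choices for illustration. Note that the page
cache will flatter repeated runs, so treat the numbers as a rough
comparison only, not as a proper benchmark.

```python
import os
import time


def read_throughput(path, block_size=64 * 1024, total_bytes=16 * 1024 * 1024):
    """Read up to total_bytes sequentially; return (bytes_read, seconds)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        done = 0
        start = time.perf_counter()
        while done < total_bytes:
            want = min(block_size, total_bytes - done)
            chunk = os.pread(fd, want, done)
            if not chunk:  # end of file or device
                break
            done += len(chunk)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return done, elapsed
```

You could run it once against the AoE device (for shelf 0, slot 0 the
Linux aoe driver names it /dev/etherd/e0.0) and once against /dev/sda,
varying block_size, and compare bytes_read / seconds for the two.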
The Linux kernel already has optimizations for loopback networking.
You might find further optimizations for your situation if you think
carefully about your I/O operations, their sizes and offsets. That
topic comes up periodically here in the list archives.
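
To illustrate why sizes and offsets matter, here is a small Python
sketch that estimates how many AoE frames a request needs for a given
MTU. It assumes 512-byte sectors, a 10-byte AoE header and a 12-byte ATA
header inside the Ethernet payload, which is my reading of the protocol
layout. The per-frame payload the driver and target actually settle on
may be smaller, so take the results as an upper-bound estimate. The
helper names are made up for this example.

```python
SECTOR = 512      # bytes per sector (assumed)
AOE_HEADER = 10   # AoE header bytes after the Ethernet header (assumed)
ATA_HEADER = 12   # AoE ATA command header bytes (assumed)


def sectors_per_frame(mtu):
    """Whole sectors that fit in one frame's payload for this MTU."""
    return (mtu - AOE_HEADER - ATA_HEADER) // SECTOR


def frames_for_request(offset, length, mtu):
    """Frames needed to cover the byte range [offset, offset + length)."""
    first = offset // SECTOR
    last = (offset + length - 1) // SECTOR
    sectors = last - first + 1
    per_frame = sectors_per_frame(mtu)
    return -(-sectors // per_frame)  # ceiling division


def is_aligned(offset, length, unit=4096):
    """True if both the offset and the length are multiples of unit."""
    return offset % unit == 0 and length % unit == 0
```

For example, with an MTU of 1500 a 4096-byte read at offset 0 covers 8
sectors, while the same read at offset 256 touches 9 sectors and so
needs an extra frame. Keeping requests aligned avoids that kind of
spill-over.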
--
Ed