Bob,
Thank you very much for reviewing
the draft and providing in-depth
comments. I am very sorry for the
delayed response due to traveling.
Replies to your comments are
inserted below marked by [Linda]:
Reviewer: Bob Briscoe
Review result: Not Ready
I have been selected as the
Transport Directorate reviewer for
this draft. The Transport Directorate
seeks to review all transport or
transport-related drafts as they pass
through IETF last call and IESG
review, and sometimes on special
request. The purpose of the review is
to provide assistance to the Transport
ADs. For more information about the
Transport Directorate Reviews and the
Transport Area Review Team, please see
https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews
In this case, very very few of the
review comments relate to transport
issues, although the greatest issue
concerns a desire that the network
could pause or stop connections during
L3 VM Mobility, which is certainly a
transport issue.
[Linda] There is “Hot Migration”, in
which the transport service continues,
and there is “Cold Migration”, a
common practice in many data centers,
which stops the task running at the
old location and moves it to the new
location before restarting it, as
described under Task Migration.
Would it be helpful to add this
description to the draft?
==Summary==
The technical aspects of the draft
concerning L2 VM mobility (within a
subnet) seem sound. However, this is
only part of the draft, which has the
following
issues:
#. The introduction does not say
what the purpose of publishing this
draft is.
It seems that, rather than
describing a specific protocol or
protocols, it intends to describe the
overall system procedure that would
typically be used in DCs for VM
mobility. It is tagged as a BCP, but
it does not say who needs this BCP,
why it is useful for the IETF to
publish this BCP, how wide the
authors' knowledge is of current
practice (given DCs are private), or
why this is a BCP rather than a
protocol spec.
[Linda] The first paragraph on page 3
describes why VM mobility is needed.
Would it help to move this paragraph
to the beginning of the Introduction
section?
“Virtualization which is being used in
almost all of today’s data centers
enables many virtual machines to run
on a single physical computer or
compute server. Virtual machines (VM)
need hypervisor running on the
physical compute server to provide
them shared processor/memory/storage.
Network connectivity is provided by
the network virtualization edge (NVE)
[RFC8014]. Being able to move VMs
dynamically, or live migration, from
one server to another allows for
dynamic load balancing or work
distribution and thus it is a highly
desirable feature [RFC7364].”
The draft starts out (S.3) as if it
intends to say what a good VM Mobility
protocol should or shouldn't do, but
the rest of the document doesn't give
any reasoning for these
recommendations, it just asserts what
appears to be one view of how a whole
VM Mobility system works, sometimes
referring to one example protocol RFC
for a component part, but more often
with no references or details.
[Linda] Would it help to move the
paragraph above to the beginning of
the Introduction section, so that the
audience is aware of why VM mobility
is needed, and then follow it with
what a good VM mobility protocol
should or shouldn't do?
#. It does not seem as if the NVO
WG has discussed the purpose of using
normative text in this draft. See
detailed comments.
[Linda] The “Intended status” of
the draft is “Best Current Practice”,
so none of the text is “normative”.
Is that okay?
#. The draft silently slips back
and forth between VM mobility and VM
redundancy, without recognizing the
differences. See detailed comments.
[Linda] The word “redundancy” is used
only once in the entire document, in
the context of the “Hot standby
option”, to indicate that the
“redundancy” of “the VMs in both
primary and secondary domains have
identical information and can provide
services simultaneously as in
load-share mode of operation” is
expensive.
#. Please adopt different
terminology than "source NVE" and
"destination NVE", which are really
poor choices of terms for an
intermediate node. See detailed
comments. Why not use "old NVE" and
"new NVE", which is what you mean?
[Linda] Thanks for the suggestion.
We will change to “old NVE” and “new
NVE”.
#. Applicability is fairly clearly
outlined, but it is not clear whether
hosts corresponding with the mobile
VMs are part of the same controlled
environment or on the uncontrolled
public Internet. See detailed
comments.
[Linda] “Hosts” are the applications
running on the VMs. They are in the
same controlled environment, not on
the uncontrolled public Internet.
#. Section 4.2.1 on L3 VM mobility
reads like some potential
half-thought-through ideas on how to
solve L3 mobility, rather than current
practice, let alone best current
practice. Either current practice
should be described instead, or the
scope of the draft should be narrowed
solely to L2 VM mobility. See detailed
comments.
[Linda] This is referring to “Cold
Migration”, which is a common practice
in many data centers.
#. The VM's file system is described
as state that moves with the VM (S.6),
but VM mobility solutions often move
the VM but stitch it back to its
(unmoved) storage. Conversely, the
storage can also move independent of
the VM.
[Linda] It depends. When a VM moves
to a different zone, the storage/files
can become inaccessible.
#. The draft omits some of the
security, transport and management
aspects of VM mobility. See detailed
comments.
[Linda] Can you provide some text?
#. The draft reads as if different
sections have been written by
different authors and no-one has
edited the whole to give it a coherent
structure, or to ensure consistency
(both technical and editorial) between
the parts. See detailed comments.
[Linda] We can improve this.
#. The quality of the English
grammar does not allow a reviewer to
concentrate on the technical aspects
rather than the English. It would have
been useful if one of the
English-speaking co-authors had
improved the English before submission
for review. See detailed comments.
[Linda] Can you help, perhaps by
becoming a co-author to improve it?
==Detailed Comments==
===#. Normative statements===
In the body of the document, there
is just one occurrence of normative
text (actually two "MUST"s, but both
state a common requirement - just
written separately for IPv4 and IPv6).
This merely serves to imply that
everything else the document says is
less important or optional, which was
probably not the intention.
[Linda] The goal is to indicate that
any solution for moving the VM “MUST”
follow this rule. They make sense,
don’t they?
At the start there is a
requirements section, which states
what a VM Mobility protocol "SHOULD"
or "SHOULD NOT" do. I think this is
intended as a set of goals for the
rest of the document. If so, these
"SHOULDs" are not intended to apply to
implementations, so they ought not to
be capitalized.
[Linda] Okay, will change.
The first requirement, "Data center
network SHOULD support virtual machine
mobility in IPv6", is written as a
requirement on all DC networks, not on
implementations. I assume this was
intended to read as "Data center
network virtual machine mobility
protocols SHOULD support IPv6". Even
then, it doesn't really add anything
to say VM mobility should support v6
and it should support v4. An L2
solution won't, while an L3 solution
will undoubtedly support at least one
of them.
[Linda] Agree. Will change it to
“Data centers that support IPv6
addresses should …”
I'm not sure that 'protocol' is the
right word anyway; I think 'VM
Mobility procedure' would be a better
phrase, because it includes steps such
as suspending the VM, which is more
than a protocol.
[Linda] Yes. Will change to
“Procedure”.
The requirement "Virtual machine
mobility protocol MAY support host
routes to accomplish virtualization",
is not followed up at all in the rest
of the draft.
Even if this requirement stays, the
last 3 words should be deleted.
[Linda] Will change to “Host Route
can be used to support the Virtual
Machine Mobility Procedure.”
By the end of the draft, the
solution falls far short of the most
relevant "Requirements" anyway, so one
assumes the title of the section ought
to have been "Goals". Specifically,
even in the simpler case of L2 VM
mobility, S.4.1 says that triangular
routing and tunnelling persist "until
a neighbour cache entry times out". A
cache timeout is about 10 orders of
magnitude longer than the requirement
to only persist "while handling
packets in flight", which would be a
few milliseconds at most (the time for
packets to clear the network that were
already launched into flight when the
old VM stopped).
Whatever, it would be preferable
for the draft to give rationale for
these requirements, rather than just
assert them. This would help to shed
light on the merits of the different
trade-offs that solutions choose.
[Linda] Agree, will add.
===#. Mobility vs. Redundancy===
Redundancy and mobility have a lot
of similarities, but they have
different goals. With mobility, it is
necessary to know the exact instant
when one set of state is identical to
the other so it can hand over. With
redundancy, the aim is to keep two (or
more) sets of state evolving through
the same sequence of changes, but
there is no need to know the point at
which one is the same as the other was
at a certain point.
[Linda] I agree with what you said.
The word “redundancy” is used only
once in the entire document, in the
context of the “Hot standby option”,
to indicate that the “redundancy” of
“the VMs in both primary and
secondary domains have identical
information and can provide services
simultaneously as in load-share mode
of operation” is expensive.
The draft slips from mobility to
resilience in the following places:
* S.2. Terminology: Warm VM
  Mobility is defined without any
  ending, as if it is permanent
  replication.
* S.7. "Handling of Hot, Warm and
  Cold Virtual Machine Mobility" is
  actually all about redundancy, and
  doesn't address mobility explicitly.
[Linda] Will add definitions of
“hot migration”, “cold migration”, and
“warm migration”.
===#. Terminology===
Packets run from the source at A to
the destination at B via NVE1, then
via NVE2. Please don't call NVE1 and
NVE2 the source NVE and the
destination NVE.
In future, no-one will thank you
for the apparent contradictions when
they continually stumble over phrases
like this one in S.4.1: "...send their
packets to the source NVE".
The term "packets in flight" is
used incorrectly to refer to all the
packets sent to the old NVE after the
VM has moved, even if they were
launched into flight long after the
old VM stopped receiving packets.
[Linda] Thanks for the comments.
Will change.
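The distinction drawn above between
packets genuinely in flight and
packets sent later from a stale cache
can be made concrete with a minimal
sketch. It assumes each packet has a
known send time and arrival time at
the old NVE; the `Packet` type and
the `was_in_flight` function are
hypothetical names used only for
illustration and do not come from the
draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    sent_at: float     # time the sender launched the packet (seconds)
    arrived_at: float  # time it reached the old NVE (seconds)


def was_in_flight(pkt: Packet, old_vm_stopped_at: float) -> bool:
    """Return True only if the packet had already been launched when the
    old VM stopped and reached the old NVE at or after that instant."""
    return pkt.sent_at <= old_vm_stopped_at <= pkt.arrived_at


# Hypothetical timeline: the old VM stops at t = 10.0 s.
launched_before_stop = Packet(sent_at=9.999, arrived_at=10.001)
sent_from_stale_cache = Packet(sent_at=25.0, arrived_at=25.002)

print(was_in_flight(launched_before_stop, 10.0))   # True
print(was_in_flight(sent_from_stale_cache, 10.0))  # False
```

Under this reading, only the first
packet is "in flight"; the second
reaches the old NVE because of a
stale ARP or neighbor cache entry.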
BTW, I think s/before/after/ in:
"that have old ARP or neighbor cache
entry before VM or task migration".
I think: s/IP-based VM mobility/L3
VM mobility/ throughout, because
"based" sounds (to me) like the
mobility control protocol is over
(i.e. based on) IP.
===#. Applicability===
In section 4.2 it says that the
protocol mostly used as the IP based
task migration protocol is ILA. This
implies that all hosts corresponding
with the mobile VMs are either part of
the same controlled environment, or
they are proxied via nodes that are
part of the same controlled
environment (I only have passing
knowledge of ILA, but I understand
that it depends on ILA routers on the
path). If I am correct, this aspect of
scope needs to be made clear from the
start.
Also under the heading of
applicability, the sentence "Since
migrations should be relatively rare
events" appears very late in the
document (S.4.2.1). The assumed level
of churn ought to be stated nearer the
start.
[Linda] Yes, under the same
controlled environment.
===#. L3 Mobility===
L2 VM mobility is independent of
the application, because resolution of
L2 mappings is delegated to the stack.
In contrast, L3 VM mobility is only
feasible under certain conditions,
because an application needs an IP
address to open a socket (resolution
of DNS names is not delegated to the
stack, and apps can use IP addresses
directly anyway).
Examples of the 'certain
conditions':
a) /All/ applications used in the
   whole DC load balancing scheme
   contain IP address migration logic
   for /all/ their connections;
b) VMs running solely applications
   that support IP address migration
   register this fact with the NVA,
   and it only selects such VMs for
   mobility.
c) An abstraction is layered over
   /all/ the IP addresses exposed to
   applications (at both ends) so
   that the IP addresses that
   applications use are solely
   identifiers (e.g. ILA, LISP, HIP),
   not also locators.
The introduction says the draft is
about VM mobility in a multi-tenant
DC, so the DC admin will not know the
range of applications being used. This
excludes condition (a) above. When the
draft says "...if all applications
running are known to handle this
gracefully...", it doesn't quantify
just how restrictive this condition
is, and it gives no explanation of how
this knowledge might be 'known' or
which function within the system
'knows' it.
S.4.2.1 contains what seems like
plenty of arm-waving.
* "TCP connections could be
  automatically closed in the network
  stack during a migration event."
  o There is no TCP connection state
    in the network stack.
  o Even if the network starts to
    drop every packet, the TCP
    connection state persists in the
    end-points for a duration of the
    order of 30-90 minutes
    (OS-dependent) before TCP deems
    the connection is broken.
  o Other transport protocols have
    similar designs (including the
    app-layer of protocols over UDP).
* "More involved approach to
  connection migration":
  o pausing the connection [does this
    refer to an actual feature of any
    L4 protocol?]
  o packaging connection state and
    sending to target [does this
    assume logic written into the
    application, or is this assuming
    the stack handles this and the
    app is restricted to using some
    form of separate
    identifier/locator addresses?]
  o instantiating connection state in
    the peer stack [ditto?].
There's some arm-waving in S.7 too:
"Cold Virtual Machine mobility is
facilitated by the VM initially
sending an ARP or Neighbor Discovery
message at the destination NVE but
the source NVE not receiving any
packets inflight."
[How is it arranged for the source
NVE not to receive any packets in
flight?]
And in S.7:
"In hot standby option, regarding TCP
connections, one option is to start
with and maintain TCP connections to
two different VMs at the same time."
[This sounds like resilience logic
has been written into the
application, which would be a special
case but not something VM mobility
infrastructure could depend on.]
[Linda] Will add.
===#. Gaps===
#. Security Considerations: repeats
issues in other drafts that are not
specific to mobility, but it does not
mention any security issues
specifically due to VM mobility. It
says that address spoofing may arise
in a DC (sort-of implying it is worse
than in non-DC environments, but not
saying why). The handshake at the
start of a connection (e.g. TCP, SCTP,
QUIC) checks for source address
spoofing. So L3 VM mobility would be
more vulnerable to source address
spoofing in cases where the mobile VM
was the connection initiator and there
was not a new handshake after the
move. However, this draft does not
contain any detailed mobility
protocols, so it is not possible to
identify any specific security flaws.
#. Transport Issues: Effect of
delay on the transport: Cold mobility
introduces significant delay, and
other forms less, but still some
delay. It should be pointed out that
some applications (e.g. real-time)
will therefore not be useful if
subjected to VM mobility. Similarly,
even a short period of delay will
drive most congestion controls to
severely reduce throughput. These
points might be self-evident, but
perhaps they should be stated
explicitly.
BTW, in the L3 VM mobility case,
the draft often refers to TCP
connections, but the address bindings
of any transport protocols would have
to be migrated due to VM mobility
(e.g. SCTP; sequences of datagrams
over UDP; streams over UDP such as
with RTP, QUIC).
#. Management Issues: perhaps the
draft ought to recommend statistics
gathering (e.g. time taken, amount of
duplicate data) to aid a DC's future
decisions on the cost-benefit of
moving a VM. The OPSDIR review says a
BCP does not /have/ to describe
management issues, but this document
seems to describe a whole system
procedure, not just a protocol, which
then surely includes the management
plane.
[Linda] Can you become a co-author
and add those in?
===#. Incoherent Structure===
S.4.1. happens to talk about VMs
moving, while S.4.2. happens to talk
about tasks moving, but this is not
the distinguishing aspect of these two
sections (anyway, S.2. says "the draft
uses task and VM interchangeably"):
* "4.1 VM Migration" is about "L2 VM
  Mobility" so this ought to be the
  section heading,
* "4.2 Task Migration" is about "L3
  VM Mobility" so this ought to be the
  section heading.
It would also help not to switch from
VM to task across these sections -
it's just a distraction.
S.4.1 needs better signposting of
where each sub-case ends (Subsections
might be useful to solve this):
* IPv4
* end-user client
* 2 paras starting "All NVEs
  communicating with this virtual
  machine..." [Not clear that the
  end-user case has ended and we have
  returned to the general IPv4 case?]
* IPv6 [Strictly, it still hasn't
  said whether the end-user client
  case has ended.] [Also, it doesn't
  explain why there is no need for an
  end-user client case under IPv6?]
Sections 5 & 6 seem to be about
either L2 or L3 mobility, whereas
Sections 7 & 8 seem to be restricted
to L2.
The draft vacillates over what to
do with packets arriving at the old
NVE in the L3 case (see also L3
mobility above):
* S4.2 first says packets are
  dropped, possibly with an ICMP
  error message;
  o then later it says they are
    silently dropped;
  o then in the very next sentence it
    says either silently drop them or
    forward them to the new location
* S.5 says they should not be lost,
  but instead delivered to the
  destination hypervisor
  o then it describes how they are
    tunnelled (which is not the same
    as "forwarding").
The order in which all the stages
of mobility are given is jumbled up
across sections that also appear in
arbitrary order:
* S.5 prepares, establishes, uses
  then stops a tunnel, but it doesn't
  say where the other stages fit
  between these steps
  o When tunneling packets, it talks
    about the *migrating* VM not the
    *migrated* VM, which implies
    tunnelling has started before the
    new VM is running. Does this
    imply there is a huge buffer?
  o It says "Stop Tunneling Packets -
    When source NVE stops receiving
    packets destined to..." but it is
    never clear when a source has
    stopped sending packets to a
    destination, unless it explicitly
    closes the connection (e.g. with
    a FIN in the case of TCP). Often
    there are long gaps between
    packets, because many flows are
    'thin' (meaning the application
    frequently has nothing to send).
    These gaps can last for
    milliseconds, hours or even days
    without any implication that the
    connection has ended.
* Then S.6. describes moving state,
  but doesn't say that this is not
  after the previous tunnelling steps
  (or where it fits within those
  steps).
* Then S.7 describes hot, warm and
  cold mobility, but doesn't lay out
  the tunnelling or steps to move
  state in each case.
* Then S.8 says it's about VM
  life-cycle, but just gives the very
  first 3 steps for allocation of
  resources to a VM, then abruptly
  ends, without even starting the VM,
  let alone getting to move it.
S.5 exhibits another inconsistency
by talking about the hypervisor, not
the NVE.
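The concern raised above about "Stop
Tunneling Packets" can be illustrated
with a small sketch. It assumes the
old NVE uses a simple idle timer to
decide that senders have stopped; the
draft does not specify such a timer,
and the function name and threshold
below are hypothetical.

```python
def premature_stops(arrival_times, idle_threshold):
    """Count how often an idle timer would wrongly declare a flow finished.

    arrival_times: sorted packet arrival times at the old NVE, in seconds.
    idle_threshold: hypothetical idle period after which tunnelling stops.
    """
    count = 0
    for earlier, later in zip(arrival_times, arrival_times[1:]):
        # A gap at least as long as the threshold would have stopped the
        # tunnel, yet the later packet shows the flow was still alive.
        if later - earlier >= idle_threshold:
            count += 1
    return count


thin_flow = [0.0, 3600.0, 7200.0]  # one packet per hour
busy_flow = [0.0, 0.01, 0.02]      # back-to-back packets

print(premature_stops(thin_flow, idle_threshold=60.0))  # 2
print(premature_stops(busy_flow, idle_threshold=60.0))  # 0
```

With a 60-second timer, the thin flow
would lose its tunnel twice while the
connection is still open, which is
why silence alone is a poor signal
that a source has stopped sending.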
===#. Nits===
Nits with the English are too
numerous to mention them all. Below
are pointers to general problems as
well as some individual instances.
S.4
"Layer 2 and Layer 3 protocols are
described next. In the following
sections, we examine more advanced
features."
s/following/subsequent/
S.4.1
Expand WSC, MSC and NVA on first
use.
s/the VM moves in the same link/the
VM moves in the same subnet/
"i.e. end-user clients ask for the
same MAC address upon migration. [...]
to ensure that the same IPv4 address
is assigned to the VM." I think
s/IPv4/MAC/ was intended?
" All NVEs communicating with this
virtual machine uses the old ARP
entry. If any VM in those NVEs
need to talk to the new VM in the
destination NVE, it uses the old
ARP entry."
Repetition: these 2 sentences say
the same. (The mistake is also
repeated when these 2 sentences are
repeated for IPv6).
S.4.2.1
s/Push the new mapping to
hosts./Push the new mapping to
communicating hosts./
S.5.
The IPv4/IPv6 pairs of paras for
"tunnel establishment" and "tunneling
packets" only differ in the words
"IPv4"/"IPv6". So in each case a
single para could be given for IP
(irrespective of whether v4 or v6).
Thank you very much.
Linda Dunbar