Bob,
Thank you very much for reviewing the draft
and providing in-depth comments. I am very sorry
for the delayed response, which was due to
travel. My replies to your comments are inserted
below, marked by [Linda]:
Reviewer: Bob Briscoe
Review result: Not Ready
I have been selected as the Transport
Directorate reviewer for this draft. The
Transport Directorate seeks to review all
transport or transport-related drafts as they
pass through IETF last call and IESG review, and
sometimes on special request. The purpose of the
review is to provide assistance to the Transport
ADs. For more information about the Transport
Directorate Reviews and the Transport Area
Review Team, please see
https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews
In this case, very very few of the review
comments relate to transport issues, although
the greatest issue concerns a desire that the
network could pause or stop connections during
L3 VM Mobility, which is certainly a transport
issue.
[Linda] There is “Hot Migration”, in which the
transport service continues, and there is “Cold
Migration”, a common practice in many data
centers, in which the task is stopped at the old
location and moved to the new location before
being restarted, as described in the Task
Migration section.
Would it help to add this description to the
draft?
==Summary==
The technical aspects of the draft concerning
L2 VM mobility (within a subnet) seem sound.
However, this is only part of the draft, which
has the following
issues:
#. The introduction does not say what the
purpose of publishing this draft is.
It seems that, rather than describing a
specific protocol or protocols, it intends to
describe the overall system procedure that would
typically be used in DCs for VM mobility. It is
tagged as a BCP, but it does not say who needs
this BCP, why it is useful for the IETF to
publish this BCP, how wide the authors'
knowledge is of current practice (given DCs are
private), or why this is a BCP rather than a
protocol spec.
[Linda] The first paragraph on Page 3
describes why VM Mobility is needed. Would it
help to move this paragraph to the beginning of
the Introduction section?
“Virtualization which is being used in almost
all of today’s data centers enables many virtual
machines to run on a single physical computer or
compute server. Virtual machines (VM) need
hypervisor running on the physical compute
server to provide them shared
processor/memory/storage. Network connectivity
is provided by the network virtualization edge
(NVE) [RFC8014]. Being able to move VMs
dynamically, or live migration, from one server
to another allows for dynamic load balancing or
work distribution and thus it is a highly
desirable feature [RFC7364].”
The draft starts out (S.3) as if it intends
to say what a good VM Mobility protocol should
or shouldn't do, but the rest of the document
doesn't give any reasoning for these
recommendations, it just asserts what appears to
be one view of how a whole VM Mobility system
works, sometimes referring to one example
protocol RFC for a component part, but more
often with no references or details.
[Linda] Would it help to move the paragraph
above to the beginning of the Introduction
section, so that the audience is aware of why VM
Mobility is needed, and then follow it with what
a good VM Mobility protocol should or shouldn't
do?
#. It does not seem as if the NVO WG has
discussed the purpose of using normative text in
this draft. See detailed comments.
[Linda] The “Intended status” of the draft is
“Best Current Practice”, so none of the text is
“normative”. Is that okay?
#. The draft silently slips back and forth
between VM mobility and VM redundancy, without
recognizing the differences. See detailed
comments.
[Linda] The word “redundancy” is used only
once in the entire document, in the context of
the “Hot standby option”, to indicate that the
“redundancy” of “the VMs in both primary and
secondary domains have identical information and
can provide services simultaneously as in
load-share mode of operation” is expensive.
#. Please adopt different terminology than
"source NVE" and "destination NVE", which are
really poor choices of terms for an intermediate
node. See detailed comments. Why not use "old
NVE" and "new NVE", which is what you mean?
[Linda] Thanks for the suggestion. We will
change to “old NVE” and “new NVE”.
#. Applicability is fairly clearly outlined,
but it is not clear whether hosts corresponding
with the mobile VMs are part of the same
controlled environment or on the uncontrolled
public Internet. See detailed comments.
[Linda] “Hosts” are the applications running
on the VMs. They are under the same controlled
environment, not on the uncontrolled public
Internet.
#. Section 4.2.1 on L3 VM mobility reads like
some potential half-thought-through ideas on how
to solve L3 mobility, rather than current
practice, let alone best current practice.
Either current practice should be described
instead, or the scope of the draft should be
narrowed solely to L2 VM mobility. See detailed
comments.
[Linda] This is referring to “Cold
Migration”, which is a common practice in many
data centers.
#. The VM's file system is described as state
that moves with the VM (S.6), but VM mobility
solutions often move the VM but stitch it back
to its (unmoved) storage. Conversely, the
storage can also move independent of the VM.
[Linda] It depends. When a VM moves to a
different zone, its storage/files can become
inaccessible.
#. The draft omits some of the security,
transport and management aspects of VM mobility.
See detailed comments.
[Linda] Can you provide some text?
#. The draft reads as if different sections
have been written by different authors and
no-one has edited the whole to give it a
coherent structure, or to ensure consistency
(both technical and editorial) between the
parts. See detailed comments.
[Linda] We can improve this.
#. The quality of the English grammar does
not allow a reviewer to concentrate on the
technical aspects rather than the English. It
would have been useful if one of the
English-speaking co-authors had improved the
English before submission for review. See
detailed comments.
[Linda] Can you help, perhaps by becoming a
co-author to improve it?
==Detailed Comments==
===#. Normative statements===
In the body of the document, there is just
one occurrence of normative text (actually two
"MUST"s, but both state a common requirement -
just written separately for IPv4 and IPv6). This
merely serves to imply that everything else the
document says is less important or optional,
which was probably not the intention.
[Linda] The goal is to indicate that any
solution for moving the VM “MUST” follow this
rule. They make sense, don’t they?
At the start there is a requirements section,
which states what a VM Mobility protocol
"SHOULD" or "SHOULD NOT" do. I think this is
intended as a set of goals for the rest of the
document. If so, these "SHOULDs" are not
intended to apply to implementations, so they
ought not to be capitalized.
[Linda] Okay, will change.
The first requirement, "Data center network
SHOULD support virtual machine mobility in
IPv6", is written as a requirement on all DC
networks, not on implementations. I assume this
was intended to read as "Data center network
virtual machine mobility protocols SHOULD
support IPv6". Even then, it doesn't really add
anything to say VM mobility should support v6
and it should support v4. A L2 solution won't.
While undoubtedly, a L3 solution will at least
support one of them.
[Linda] Agree. Will change it to “Data centers
that support IPv6 addresses should …”
I'm not sure that 'protocol' is the right
word anyway; I think 'VM Mobility procedure'
would be a better phrase, because it includes
steps such as suspending the VM, which is more
than a protocol.
[Linda] Yes. Will change to “Procedure”.
The requirement "Virtual machine mobility
protocol MAY support host routes to accomplish
virtualization", is not followed up at all in
the rest of the draft.
Even if this requirement stays, the last 3
words should be deleted.
[Linda] Will change to “Host Route can be used
to support the Virtual Machine Mobility
Procedure.”
By the end of the draft, the solution falls
far short of the most relevant "Requirements"
anyway, so one assumes the title of the section
ought to have been "Goals". Specifically, even
in the simpler case of L2 VM mobility, S.4.1
says that triangular routing and tunnelling
persist "until a neighbour cache entry times
out". A cache timeout is many orders of
magnitude longer than the requirement to only
persist "while handling packets in flight",
which would be a few milliseconds at most (the
time for packets to clear the network that were
already launched into flight when the old VM
stopped).
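As a rough illustration of the gap described above, the sketch below compares a few assumed neighbour cache timeouts against an assumed time for packets in flight to clear the network. The specific values (5 ms, 1 minute, 20 minutes, 4 hours) are illustrative assumptions only; they are not taken from the draft or from any particular implementation.

```python
import math

# Assumed time for packets already in flight to clear the network.
# Illustrative only; the review says "a few milliseconds at most".
IN_FLIGHT_SECONDS = 0.005

# Assumed neighbour cache timeouts, chosen purely for illustration.
ASSUMED_CACHE_TIMEOUTS = {
    "1 minute": 60.0,
    "20 minutes": 20 * 60.0,
    "4 hours": 4 * 3600.0,
}


def orders_of_magnitude(cache_timeout_s: float, in_flight_s: float) -> float:
    """Return log10 of the ratio between the two durations."""
    return math.log10(cache_timeout_s / in_flight_s)


if __name__ == "__main__":
    for label, timeout in ASSUMED_CACHE_TIMEOUTS.items():
        gap = orders_of_magnitude(timeout, IN_FLIGHT_SECONDS)
        print(f"{label:>10}: about {gap:.1f} orders of magnitude longer")
```

Under these assumed values the gap is roughly four to seven orders of magnitude. The exact figure depends on the deployment, but the mismatch with the stated requirement is large either way.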
Whatever, it would be preferable for the
draft to give rationale for these requirements,
rather than just assert them. This would help to
shed light on the merits of the different trade
offs that solutions choose.
[Linda] Agree, will add.
===#. Mobility vs. Redundancy===
Redundancy and mobility have a lot of
similarities, but they have different goals.
With mobility, it is necessary to know the exact
instant when one set of state is identical to
the other so it can hand over. With redundancy,
the aim is to keep two (or more) sets of state
evolving through the same sequence of changes,
but there is no need to know the point at which
one is the same as the other was at a certain
point.
[Linda] Agree with what you said. The word
“redundancy” is used only once in the entire
document, in the context of the “Hot standby
option”, to indicate that the “redundancy” of
“the VMs in both primary and secondary domains
have identical information and can provide
services simultaneously as in load-share mode of
operation” is expensive.
The draft slips from mobility to resilience
in the following places:
* S.2. Terminology: Warm VM Mobility is
  defined without any ending, as if it is
  permanent replication.
* S.7. "Handling of Hot, Warm and Cold Virtual
  Machine Mobility" is actually all about
  redundancy, and doesn't address mobility
  explicitly.
[Linda] Will add definitions of “hot
migration”, “cold migration”, and “warm
migration”.
===#. Terminology===
Packets run from the source at A to the
destination at B via NVE1, then via NVE2. Please
don't call NVE1 and NVE2 the source NVE and the
destination NVE.
In future, no-one will thank you for the
apparent contradictions when they continually
stumble over phrases like this one in S.4.1:
"...send their packets to the source NVE".
The term "packets in flight" is used
incorrectly to refer to all the packets sent to
the old NVE after the VM has moved, even if they
were launched into flight long after the old VM
stopped receiving packets.
[Linda] Thanks for the comments. Will change.
BTW, I think s/before/after/ in: "that have
old ARP or neighbor cache entry before VM or
task migration".
I think: s/IP-based VM mobility/L3 VM
mobility/ throughout, because "based"
sounds (to me) like the mobility control
protocol is over (i.e. based on) IP.
===#. Applicability===
In section 4.2 it says that the protocol
mostly used as the IP based task migration
protocol is ILA. This implies that all hosts
corresponding with the mobile VMs are either
part of the same controlled environment, or they
are proxied via nodes that are part of the same
controlled environment (I only have passing
knowledge of ILA, but I understand that it
depends on ILA routers on the path). If I am
correct, this aspect of scope needs to be made
clear from the start.
Also under the heading of applicability, the
sentence "Since migrations should be relatively
rare events" appears very late in the document
(S.4.2.1). The assumed level of churn ought to
be stated nearer the start.
[Linda] Yes, under the same controlled
environment.
===#. L3 Mobility===
L2 VM mobility is independent of the
application, because resolution of L2 mappings
is delegated to the stack. In contrast, L3 VM
mobility is only feasible under certain
conditions, because an application needs an IP
address to open a socket (resolution of DNS
names is not delegated to the stack, and apps
can use IP addresses directly anyway).
Examples of the 'certain conditions':
a) /All/ applications used in the whole DC
   load balancing scheme contain IP address
   migration logic for /all/ their connections;
b) VMs running solely applications that support
   IP address migration register this fact with
   the NVA, and it only selects such VMs for
   mobility.
c) An abstraction is layered over /all/ the IP
   addresses exposed to applications (at both
   ends) so that the IP addresses that
   applications use are solely identifiers (e.g.
   ILA, LISP, HIP), not also locators.
The introduction says the draft is about VM
mobility in a multi-tenant DC, so the DC admin
will not know the range of applications being
used. This excludes condition (a) above. When
the draft says "...if all applications running
are known to handle this gracefully...", it
doesn't quantify just how restrictive this
condition is, and it gives no explanation of how
this knowledge might be 'known' or which
function within the system 'knows' it.
S.4.2.1 contains what seems like plenty of
arm-waving.
* "TCP connections could be automatically
  closed in the network stack during a migration
  event."
    o There is no TCP connection state in
      the network stack.
    o Even if the network starts to drop
      every packet, the TCP connection
      state persists in the end-points for
      a duration of the order of 30-90
      minutes (OS-dependent) before TCP
      deems the connection is broken.
    o Other transport protocols have
      similar designs (including the
      app-layer of protocols over UDP).
* "More involved approach to connection
  migration":
    o pausing the connection [does this
      refer to an actual feature of any
      L4 protocol?]
    o packaging connection state and
      sending to target [does this assume
      logic written into the application,
      or is this assuming the stack
      handles this and the app is
      restricted to using some form of
      separate identifier/locator
      addresses?]
    o instantiating connection state in
      the peer stack [ditto?].
There's some arm-waving in S.7 too:
"Cold Virtual Machine mobility is
facilitated by the VM initially
sending an ARP or Neighbor Discovery
message at the destination NVE
but the source NVE not receiving any
packets inflight."
[How is it arranged for the source NVE not
to receive any packets in flight?]
And in S.7:
"In hot
standby option, regarding TCP connections,
one option is to start
with and maintain TCP connections to two
different VMs at the same
time."
[This sounds like resilience logic has
been written into the application,
which would be a special case but not
something VM mobility infrastructure
could depend on.]
[Linda] Will add.
===#. Gaps===
#. Security Considerations: repeats issues in
other drafts that are not specific to mobility,
but it does not mention any security issues
specifically due to VM mobility. It says that
address spoofing may arise in a DC (sort-of
implying it is worse than in non-DC
environments, but not saying why). The handshake
at the start of a connection (e.g. TCP, SCTP,
QUIC) checks for source address spoofing. So L3
VM mobility would be more vulnerable to source
address spoofing in cases where the mobile VM
was the connection initiator and there was not a
new handshake after the move. However, this
draft does not contain any detailed mobility
protocols, so it is not possible to identify any
specific security flaws.
#. Transport Issues: Effect of delay on the
transport: Cold mobility introduces significant
delay, and other forms less, but still some
delay. It should be pointed out that some
applications (e.g. real-time) will therefore not
be useful if subjected to VM mobility.
Similarly, even a short period of delay will
drive most congestion controls to severely
reduce throughput. These points might be
self-evident, but perhaps they should be stated
explicitly.
BTW, in the L3 VM mobility case, the draft
often refers to TCP connections, but the address
bindings of any transport protocols would have
to be migrated due to VM mobility (e.g. SCTP;
sequences of datagrams over UDP; streams over
UDP such as with RTP, QUIC).
#. Management Issues: perhaps the draft ought
to recommend statistics gathering (e.g. time
taken, amount of duplicate data) to aid a DC's
future decisions on the cost-benefit of moving a
VM. The OPSDIR review says a BCP does not /have/
to describe management issues, but this document
seems to describe a whole system procedure, not
just a protocol, which then surely includes the
management plane.
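As one possible reading of the statistics-gathering suggestion above, the sketch below records the time taken and the amount of duplicate data for each VM move and summarizes them. The record layout, field names and `summarize` helper are hypothetical; neither the draft nor the review defines any such structure.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class MigrationRecord:
    """One observed VM move. All field names are hypothetical."""
    vm_id: str
    duration_s: float        # time taken for the move
    bytes_transferred: int   # total state copied to the new server
    bytes_duplicated: int    # data sent more than once during the move


def summarize(records: List[MigrationRecord]) -> dict:
    """Aggregate per-move records into figures for a cost-benefit review."""
    if not records:
        return {"moves": 0, "mean_duration_s": 0.0, "duplicate_fraction": 0.0}
    total = sum(r.bytes_transferred for r in records)
    duplicated = sum(r.bytes_duplicated for r in records)
    return {
        "moves": len(records),
        "mean_duration_s": mean(r.duration_s for r in records),
        # Guard against a zero total so the ratio is always defined.
        "duplicate_fraction": duplicated / total if total else 0.0,
    }
```

A DC operator could feed such summaries into later decisions on whether moving a given class of VM is worth the cost.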
[Linda] Can you become a co-author and add
those in?
===#. Incoherent Structure===
S.4.1. happens to talk about VMs moving,
while S.4.2. happens to talk about tasks moving,
but this is not the distinguishing aspect of
these two sections (anyway, S.2. says "the draft
uses task and VM interchangeably"):
* "4.1 VM Migration" is about "L2 VM Mobility"
  so this ought to be the section heading,
* "4.2 Task Migration" is about "L3 VM
  Mobility" so this ought to be the section
  heading.
It would also help not to switch from VM to
task across these sections - it's just a
distraction.
S.4.1 needs better signposting of where each
sub-case ends (Subsections might be useful to
solve this):
* IPv4
* end-user client
* 2 paras starting "All NVEs communicating
  with this virtual machine..." [Not clear that
  the end-user case has ended and we have
  returned to the general IPv4 case?]
* IPv6 [Strictly, it still hasn't said whether
  the end-user client case has ended.] [Also, it
  doesn't explain why there is no need for an
  end-user client case under IPv6?]
Sections 5 & 6 seem to be about either L2 or
L3 mobility, whereas Sections 7 &
8 seem to be restricted to L2.
The draft vacillates over what to do with
packets arriving at the old NVE in the L3 case
(see also L3 mobility above):
* S.4.2 first says packets are dropped,
  possibly with an ICMP error message;
    o then later it says they are silently
      dropped;
    o then in the very next sentence it says
      either silently drop them or forward
      them to the new location
* S.5 says they should not be lost, but
  instead delivered to the destination hypervisor
    o then it describes how they are tunnelled
      (which is not the same as
      "forwarding").
The order in which all the stages of mobility
are given is jumbled up across sections that
also appear in arbitrary order:
* S.5 prepares, establishes, uses then stops a
  tunnel, but it doesn't say where the other
  stages fit between these steps
    o When tunneling packets, it talks
      about the *migrating* VM not the
      *migrated* VM, which implies
      tunnelling has started before the
      new VM is running. Does this imply
      there is a huge buffer?
    o It says "Stop Tunneling Packets -
      When source NVE stops receiving
      packets destined to..." but it is
      never clear when a source has
      stopped sending packets to a
      destination, unless it explicitly
      closes the connection (e.g. with a
      FIN in the case of TCP). Often there
      are long gaps between packets,
      because many flows are 'thin'
      (meaning the application frequently
      has nothing to send). These gaps can
      last for milliseconds, hours or even
      days without any implication that
      the connection has ended.
* Then S.6. describes moving state, but
  doesn't say that this is not after the previous
  tunnelling steps (or where it fits within those
  steps).
* Then S.7 describes hot, warm and cold
  mobility, but doesn't lay out the tunnelling or
  steps to move state in each case.
* Then S.8 says it's about VM life-cycle, but
  just gives the very first 3 steps for
  allocation of resources to a VM, then abruptly
  ends, without even starting the VM, let alone
  getting to move it.
S.5 exhibits another inconsistency by talking
about the hypervisor, not the NVE.
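The difficulty with the "Stop Tunneling Packets" step noted above can be sketched as follows. The code assumes, hypothetically, that the old NVE tears the tunnel down after seeing no packets for a fixed idle timeout; the draft does not specify such a rule, and the function and parameter names are invented for illustration.

```python
from typing import List


def tunnel_teardown_time(arrivals_s: List[float], idle_timeout_s: float) -> float:
    """Return when an idle-timeout rule would tear the tunnel down.

    arrivals_s: sorted arrival times (seconds) of packets at the old NVE.
    The tunnel is removed once idle_timeout_s passes with no packet.
    """
    last_seen = 0.0  # the tunnel is assumed to be established at t = 0
    for t in arrivals_s:
        if t - last_seen > idle_timeout_s:
            break  # the gap exceeded the timeout before this packet arrived
        last_seen = t
    return last_seen + idle_timeout_s


def packets_lost(arrivals_s: List[float], idle_timeout_s: float) -> int:
    """Count packets that reach the old NVE after the tunnel has gone."""
    teardown = tunnel_teardown_time(arrivals_s, idle_timeout_s)
    return sum(1 for t in arrivals_s if t > teardown)
```

For example, a thin flow with packets at 1 s and 2 s and then again about an hour later (`[1.0, 2.0, 3700.0, 3701.0]`) loses both later packets under a 60 s timeout, even though the connection never ended.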
===#. Nits===
Nits with the English are too numerous to
mention them all. Below are pointers to general
problems as well as some individual instances.
S.4
"Layer 2 and Layer 3 protocols are
described next. In the following
sections, we examine more advanced
features."
s/following/subsequent/
S.4.1
Expand WSC, MSC and NVA on first use.
s/the VM moves in the same link/the VM moves
in the same subnet/
"i.e. end-user clients ask for the same MAC
address upon migration. [...] to ensure that the
same IPv4 address is assigned to the VM." I
think s/IPv4/MAC/ was intended?
" All NVEs communicating with this virtual
machine uses the old ARP
entry. If any VM in those NVEs need to
talk to the new VM in the
destination NVE, it uses the old ARP
entry."
Repetition: these 2 sentences say the same.
(The mistake is also repeated when these 2
sentences are repeated for IPv6).
S.4.2.1
s/Push the new mapping to hosts./Push the new
mapping to communicating hosts./
S.5.
The IPv4/IPv6 pairs of paras for "tunnel
establishment" and "tunneling packets" only
differ in the words "IPv4"/"IPv6". So in each
case a single para could be given for IP
(irrespective of whether v4 or v6).
Thank you very much.
Linda Dunbar