Re: [6tisch] [IoT-DIR] Iotdir early review of draft-ietf-6tisch-6top-protocol-09

Xavi Vilajosana Guillen <xvilajosana@xxxxxxx> · Thu, 22 Mar 2018 12:59:25 +0100

Dear Alex,
thanks so much for going through the document again. After the discussion at the WG meeting we can proceed to resolve the issues you raised. Our answers are inline with your questions. We will publish v11 of the draft in the coming days, including the resolutions presented here.

2018-03-06 18:38 GMT+01:00 Alexander Pelov <a@xxxxxxx>:
Dear Xavi,
Thanks a lot for the prompt reply and the detailed address of the points I’ve outlined. 

For completeness, see inline.

Le 1 mars 2018 à 12:50, Xavi Vilajosana Guillen <xvilajosana@xxxxxxx> a écrit :

Dear Alex,
Thanks so much for your constructive review. Let us answer inline your comments (XV:). We are taking them into account in the new draft version that will be published before the cut-off date.

regards
Xavi
----
Reviewer: Alexander Pelov
Review result: Ready with Issues

Hello all,

This is the review for the IoT Directorate.

Document: draft-ietf-6tisch-6top-protocol-09
Reviewer: Alexander Pelov
Date: 22 February 2018

The general feeling of the reviewer is that the document is a solid work.
Multiple examples are given and the document is easy to understand as a whole.
There are some nits, and some text that needs to be clarified.

The general feeling of the reviewer is that the document relies heavily on the
definition of an external Scheduling Function (SF). The recommended values seem
very reasonable to the reviewer and it is not clear what is the benefit of
anticipating that an SF can override the semantics of most of the fields. For
example, most of the fields are opaque to the 6P sublayer and only make sense
to the SF: CellOptions, Metadata, CellList. For one, in Wireshark, there will
be a need for a separate dissector for each SF.

XV: We aimed to support the particular needs of an SF. For example, the Metadata field can be used to indicate which Slotframe Handle the 6P operation should be applied to. However, we also think that a large set of SFs will use the fields as defined by 6P (the CellList, for example).

Ok. Thanks for the example with the Metadata field; 

A final point here is that there seem to be no readily available polished SF
that would help in the understanding of the concepts beyond what is already on
the 6P draft. 

XV: We think the MSF (https://tools.ietf.org/html/draft-chang-6tisch-msf-00) clearly maps to the requirements from 6P.

Thanks for clearing this up. If MSF becomes a WG item I would find it useful to add it to the text as a reference (that’s a very minor edit and shouldn’t interfere with the closing of the WGLC).

XV: The MSF has been added as a reference in the Figure/table 38. Basically with SFID = 0

 For example, the SF0 draft (draft-ietf-6tisch-6top-sfx-00)
redefines quite heavily the behavior of CellList introducing notion of
WhiteList and BlackList of cells. The reviewer is aware that these are distinct
works, but it feels that there should be a minimum level of interoperability,
where an upper layer does not completely redefine what is happening on lower
layers. Having extension mechanisms may seem like a better way to solve
richness of proposals if this is necessary.

XV: We agree on that. However, we think that most SFs can be developed without redefining the 6P fields. Note also that the SIGNAL command is designed for that purpose: an SF issues SIGNAL messages, which are opaque to 6P internals, in order to transmit information to the SF of the other node.

Ok.

One point which remained unclear is how do the Minimal 6TiSCH and 6P interact?
It could be useful to provide a description on the bootstrap of 6P interaction
(how does a sender A initiate the first 6P Request - over Minimal
6TiSCH-managed cell?).

XV: This is detailed in the MSF draft, for example. 6P defines the messaging structure and protocol interaction but does not define a particular behaviour.
The SF is responsible for defining the behaviour at boot, which cells are used, and how new cells are added. 6P provides the L2 transport semantics for the
SF to operate.

Ok, thanks. One more reason to add reference to MSF.

XV: thanks. We agree with you.  

How do they come into play in case of de-synchronisation?
(e.g. A rescheduling all 6P cells, but B not getting the final L2 ACK, which
puts A's 6P cells on a completely different schedule than B's, so B can't
signal back transaction rollback / CLEAR). Is this solved by 6TiSCH minimal or
through a different mechanism?

I think that, independently of MSF, this is a situation that could apply to any SF. It may then be of interest to describe in the text of 6P how this situation is handled (even if MSF handles this gracefully, other SFs may benefit from guidance on how to handle such a situation).

The Security section could be enriched. A notable example is the handling of
resource reservation, which could lead to DOS attacks.

XV: this has been clarified thanks to another reviewer comment. 

Ok, thanks.

- - - - - - - - - - - - - - -

Section 3.2.3:
6P CellOptions - ends with the statement that it is an "Opaque set of bits",
which MAY be redefined by the SF (format, meaning). As pointed out earlier, if
there is a need to redefine this for each SF, maybe there are other ways of
defining such flexibility (e.g. TLVs).

XV: We enable an SF to redefine that field, but we do not expect most SFs to redefine it.

If you feel this could be a blocking issue down the road, then I’m OK with leaving it as is. 

The table in Figure 7 provides the recommended meaning of the bitmap for 6P
COUNT and 6P LIST. What is the recommended meaning for 6P ADD/DELETE/RELOCATE?

XV: Thanks, we clarified this in the text. We added Figure 8 with a table describing the
behaviour of 6P when the different CellOptions are present in ADD/DELETE/RELOCATE requests.

Ok, thanks.

Nits: there seem to be errors in Figure 7: examples of "all cells are marked
as RX" and "all cells are marked as TX" seem inverted (same for TX=1,RX=0,S=1
and TX=0,RX=1,S=1).

XV: no, this is correct :). The request is issued by node A, saying for example COUNT TX cells to B. B responds with the list of cells marked as RX in its schedule for neighbor A. (as in A they are marked as TX).

Oh, I see. Adding a sentence to refresh the memory of the people like me would help here ;)

XV: Figures 7 and 8 have been swapped. First we detail the meaning of the CellOptions field for ADD/DELETE etc., as those commands are described first in the draft, and then we add the figure for COUNT and LIST. In this way we follow the same order as the commands. The text describing the meaning of the options reads as follows in the table header:

(Assuming node A issues the 6P command to node B.)

Text in the header: the type of cells B adds/deletes/relocates to its schedule when receiving a 6P ADD/DELETE/RELOCATE Request from A.

We think this clarifies the meaning of the fields.
The same applies to Figure 8.
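
The change of perspective discussed above (cells that A sees as TX are the ones B has marked as RX for neighbor A) can be sketched as a small bit swap. This is only an illustration: the bit positions of TX, RX and SHARED below are assumptions chosen for the example and should be checked against the CellOptions figure in the draft, and `mirror_cell_options` is a hypothetical helper name.

```python
# Assumed bit layout, for illustration only; verify against the draft.
TX = 0x01
RX = 0x02
SHARED = 0x04


def mirror_cell_options(opts: int) -> int:
    """Translate CellOptions from the requester's view (node A)
    to the responder's view (node B): TX and RX swap, the rest is kept."""
    mirrored = opts & ~(TX | RX)  # keep SHARED and any other bits untouched
    if opts & TX:
        mirrored |= RX
    if opts & RX:
        mirrored |= TX
    return mirrored
```

With this helper, a request from A for its TX cells is matched at B against cells carrying the RX bit, which is why the Figure 7 examples look inverted at first glance.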

- - - - - - - - - - - - - - -

Section 3.3.1:
How does the sender/receiver know the size of CellList? (infer from packet
size?)

XV: The IE header contains the size of the 6top IE. The sizes of the header fields are known, and hence the CellList length can be determined from that.

Can you please add a clarification to the text for this? 

XV: We added a clarification in section 3.2.4
" As a clarification, the length of the CellList field is determined by
   the IE Length field, present in the Payload IE header as defined by
   the IEEE 802.15.4 standard [IEEE802154]. "

and in the commands description:
"
CellList:  A list of 0, 1 or multiple candidate cells.  Its length is
         determined by the Length field of the Payload IE header. 
"
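
As a rough sketch of the clarification above, a receiver can treat whatever remains of the IE content after the fixed-size fields as the CellList. The details here are assumptions made for the example, not taken from the draft text quoted in this thread: each cell is taken to be 4 bytes (a 2-byte slotOffset followed by a 2-byte channelOffset, little-endian), and `fixed_len` is a hypothetical parameter giving the size of the fields that precede the CellList for the message in question.

```python
import struct

CELL_SIZE = 4  # assumed: 2-byte slotOffset + 2-byte channelOffset


def cell_list_from_ie(ie_content: bytes, fixed_len: int) -> list[tuple[int, int]]:
    """Parse the CellList occupying the bytes that follow the fixed fields.

    len(ie_content) plays the role of the Length field of the Payload IE header.
    """
    remaining = len(ie_content) - fixed_len
    if remaining < 0 or remaining % CELL_SIZE != 0:
        raise ValueError("IE length inconsistent with the fixed fields")
    cells = []
    for pos in range(fixed_len, len(ie_content), CELL_SIZE):
        slot_offset, channel_offset = struct.unpack_from("<HH", ie_content, pos)
        cells.append((slot_offset, channel_offset))
    return cells
```

An empty CellList simply corresponds to an IE whose length equals the size of the fixed fields.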

The candidate cells (a total of NumCandidate) are presumably provided by the
SF. However, it is up to 6P to handle the case when they do not fit in the
packet size. The text specifies that this should be handled in more than one 6P
ADD requests - which is OK on the conceptual level, but seems underspecified
for an implementation. 

What if NumCells is smaller than the number of candidate
cells that can fit in a single transaction - should they be also split in two
transactions?

XV: NumCells tells how many cells need to be added/deleted/relocated. I think you are referring to the 6P LIST command instead.
In a 6P LIST command we use MaxNumCells, which indicates an upper bound on the number of cells to be listed (like LIMIT in SQL).
If in a 6P LIST the number of returned cells is smaller than MaxNumCells, then the issuer may send another LIST with a specific
offset (e.g. the number of cells received) in order to get the remaining cells.
This is indeed how "pagination" works.
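
A minimal sketch of the pagination loop described here, under stated assumptions: `send_list` is a hypothetical callable standing in for one 6P LIST transaction, taking an offset and MaxNumCells and returning the cells of that page plus a flag telling whether the end of the list was reached. How the end of the list is actually signalled is left to the draft.

```python
from typing import Callable, List, Tuple

Cell = Tuple[int, int]  # (slotOffset, channelOffset)


def list_all_cells(
    send_list: Callable[[int, int], Tuple[List[Cell], bool]],
    max_num_cells: int,
) -> List[Cell]:
    """Retrieve the whole schedule page by page, like OFFSET/LIMIT in SQL."""
    cells: List[Cell] = []
    offset = 0
    while True:
        page, end_of_list = send_list(offset, max_num_cells)
        cells.extend(page)
        offset += len(page)  # next request starts after the cells received
        if end_of_list or not page:
            return cells
```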

No, I meant the 6P ADD case, but I was thinking about the 3-step 6P transaction. When B returns the list of candidate cells, how is that handled in case it cannot fit the 6P packet? (again, it can be trivial)

XV: Hi Alex, we clarified this part in the 6P ADD command section (3.3.1. Adding Cells):

"In case of a 3-step transaction, the SF is responsible for ensuring that the returned candidate CellList fits in the 6P Response packet."

 What happens if the first 6P ADD is successful, but the second
one fails? Should the sender 6P DELETE the successfully added first batch of
cells?

XV: If you are referring to an ADD Request where NumCells is larger than the number of cells that fit in a packet, then this should be handled by the SF, which splits the request
into multiple ADD operations. If one fails, a node can retry later.
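
One way an SF might split a large request is sketched below. This is not behaviour specified by 6P; `send_add` is a hypothetical callable standing in for one ADD transaction, returning the list of cells actually scheduled, or `None` if the transaction failed, and `max_cells_per_packet` is an assumed limit supplied by the caller.

```python
def add_in_batches(send_add, num_cells, candidates, max_cells_per_packet):
    """Issue several ADD transactions until num_cells cells are scheduled.

    Stops at the first failed transaction; the cells already scheduled are
    kept and returned, so the caller can retry later for the remainder.
    """
    scheduled = []
    for start in range(0, len(candidates), max_cells_per_packet):
        still_needed = num_cells - len(scheduled)
        if still_needed <= 0:
            break
        batch = candidates[start:start + max_cells_per_packet]
        result = send_add(min(still_needed, len(batch)), batch)
        if result is None:  # transaction failed; leave the retry to the caller
            break
        scheduled.extend(result)
    return scheduled
```

In this sketch a failed batch does not trigger a DELETE of earlier batches, matching the answer above that the node can simply retry later.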

Ok, thanks.

Can allocation of 0 cells be considered as partial success?

NOALLOC return code is not defined.

XV: Why would someone want to do a 0-cell ADD request? I think that the response in this case can be RC_SUCCESS, indicating that 0 cells have been scheduled.

"The returned list can contain NumCells elements (succeeded) or
   between 0 and NumCells elements (partially succeeded)."

I read this as an inclusive range, i.e. 0 and NumCells included. To me, this meant that an addition of 0 elements was considered a partial success.

XV: We have reviewed this part and made it clearer:

"The verification can
                       succeed           (NumCells cells from the CellList can be used),
                       fail              (none of the cells from the CellList can be used), or
                       partially succeed (less than NumCells cells from the CellList can be used).
                    In all cases, node B MUST send a 6P Response with return code set to RC_SUCCESS, and which specifies the list of cells that were scheduled following the CellOptions field.
                    That can contain
                        NumCells elements               (succeed),
                        0 elements                      (fail), or
                        between 0 and NumCells elements (partially succeed). 
"
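
The three outcomes in the revised text can be illustrated with a short sketch of what node B does with the candidate CellList. The names are hypothetical: `is_free` stands in for whatever check B's SF applies to a candidate cell, and the return code is represented as a string to avoid assuming its numeric value.

```python
RC_SUCCESS = "RC_SUCCESS"


def verify_add(num_cells, cell_list, is_free):
    """Pick up to num_cells usable cells from cell_list.

    The return code is RC_SUCCESS in all three cases; the length of the
    returned list tells the requester which outcome occurred.
    """
    usable = [cell for cell in cell_list if is_free(cell)][:num_cells]
    if len(usable) == num_cells:
        outcome = "succeed"
    elif not usable:
        outcome = "fail"
    else:
        outcome = "partially succeed"
    return RC_SUCCESS, usable, outcome
```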

- - - - - - - - - - - - - - -

Section 3.3.3.
Figure 17 - it seems counterintuitive to have RC_SUCCESS on failed relocation.
Could NOALLOC be used in this case?

XV: We had long discussions about this while we wrote the specification. Our consensus was that the return code indicates that the 6P transaction worked (RC_SUCCESS)
and that the CellList length tells us the result in terms of cells relocated. So there are no strong arguments on either side, I guess, but we agreed to take that approach.

If you’ve already had a long discussion on this on the ML it works for me. Thanks.

In both Relocation and Allocation 3-step 6P transaction there is the risk of a
security attack. If a malicious node constantly renews 3-step requests and
never acknowledges, the neighboring node will be keeping the proposed cells as
"reserved" and not allocate them to other nodes, thus provoking a DOS attack.
Probably a way to limit repeated requests could be useful for this case.

XV: We know that any 3-step transaction/protocol can be subject to a DoS attack if one of the messages is
never replied to (the same happens with TCP handshake attacks). We do not want to introduce policies to handle that for a particular
situation, but we think that this needs to be clarified in the security considerations section.
To this aim we added the following consideration in the security section:

The 6P protocol does not provide protection against DoS attacks. This is relevant in 3-step transactions when a confirmation message could be withheld
on purpose by the attacker. Such situations SHOULD be handled by an appropriate policy, such as blacklisting the attacker after several attempts.
Other DoS attacks are possible by sending meaningless requests to nodes. The effect on the overall network can be minimal, as communication between the attacked node
and the attacker happens in dedicated cells. The DoS then only affects those cells. Yet, this can be avoided by blacklisting the node after several attempts. When to blacklist
is policy specific and SHOULD be addressed by the SF.
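
A blacklisting policy of the kind mentioned above could look like the following sketch. Everything here is an assumption for illustration: the class name, the threshold of three unconfirmed attempts, and the choice to reset the count once a confirmation arrives are not taken from the draft, which leaves the policy to the SF.

```python
from collections import defaultdict


class UnconfirmedGuard:
    """Blacklist a neighbor after several unconfirmed 3-step transactions."""

    def __init__(self, max_unconfirmed: int = 3):
        self.max_unconfirmed = max_unconfirmed
        self._strikes = defaultdict(int)
        self._blacklist = set()

    def on_confirmation_timeout(self, neighbor) -> None:
        # The neighbor started a 3-step transaction but never confirmed it.
        self._strikes[neighbor] += 1
        if self._strikes[neighbor] >= self.max_unconfirmed:
            self._blacklist.add(neighbor)

    def on_confirmation_received(self, neighbor) -> None:
        # A completed transaction clears the neighbor's strike count.
        self._strikes.pop(neighbor, None)

    def accepts(self, neighbor) -> bool:
        return neighbor not in self._blacklist
```

A node would consult `accepts()` before reserving candidate cells for a new request, so that a misbehaving neighbor cannot keep cells locked indefinitely.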

Perfect, thanks !

- - - - - - - - - - - - - - -

Section 3.3.5.
"To retrieve the list of scheduled cells at B" - all cells scheduled at B? Or
the cells scheduled for A? (could be clarified)

XV: We rephrased like this:
 "To retrieve a list of scheduled cells at B, node A issues a 6P LIST command."

Ok, sounds good. Can you add a sentence of the sort: "This list is only limited to the cells scheduled for A" (if this is true) ?

XV: We clarified further the text as follows 
 "To retrieve a list of scheduled cells node A has with node B, node A issues a 6P LIST command."

Nits: Node B MAY returns -> Node B MAY return
XV: Thanks. we fixed that.

Ok thanks. 

- - - - - - - - - - - - - - -

Section 3.3.6.
There may be two parallel transactions: 1) A->B and 2) B->A. If a 6P CLEAR is
issued on one, how does this affect the other? (presumably clear both?)

XV: A transaction is not applied until it is committed, that is, until the Confirmation message is received on one side and the L2 ACK is received on the other side.
In a particular slot a node can only be receiving or sending one packet at a time, and hence these simultaneous B->A and A->B transactions cannot happen unless the nodes use 2 radios.
With 2 radios this may lead to an inconsistency in the schedules, which will be resolved in the next message thanks to the SeqNum being set to 0 on the side that cleared.

If both A and B start 3-step transaction, wouldn’t that allow for parallel transactions to exist, even in single radio? Also, is the use of 2 radios excluded? 
The way I understand your answer, there is no explicit influence from one transaction to the other - it’s up to the SeqNum to settle the situation.

Works fine for me and is simple, thanks for clarifying.

How does this affect separate SF? If there is a state kept by each SF, are all
SFs cleared? Are statistics also cleared for SFs? (probably SF-dependent, out
of the scope)

XV: the commands use an SFID that maps the action to a particular SF. Hence if a clear happens in the cells scheduled by one SF other cells scheduled by another SF won't be affected.

Oh, OK, so now I have another question - how are the SeqNum handled across SF? Is there a SeqNum counter PER SF? 
By reading the text, "That is, a node stores as many SeqNum values as it has neighbors." - that would imply that the SeqNum counter is shared by the SF (e.g. it is on the 6P level, which seems consistent to me).

XV: good point. We clarified that in the text, Section 3.4.6. There is a SeqNum per neighbor per SF. This means that if there are multiple SFs running concurrently each SF will use a SeqNum for each of the neighbor nodes.

the text now reads as:
"In case of supporting multiple SFs at a time, a SeqNum value is maintained per SF and per neighbor." 
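
The bookkeeping implied by the new sentence can be sketched as a table keyed by the (neighbor, SFID) pair. The sketch makes assumptions that are not stated in this thread: SeqNum is treated as an 8-bit counter that skips 0 when it wraps, so that 0 remains reserved for the state after a CLEAR or device reset, and `SeqNumTable` is a hypothetical name.

```python
class SeqNumTable:
    """One SeqNum per neighbor and per Scheduling Function (SFID)."""

    def __init__(self):
        self._seq = {}

    def get(self, neighbor, sfid) -> int:
        return self._seq.get((neighbor, sfid), 0)

    def advance(self, neighbor, sfid) -> int:
        current = self.get(neighbor, sfid)
        # Assumed wrap-around: skip 0, which marks a cleared or reset state.
        nxt = 1 if current == 0xFF else current + 1
        self._seq[(neighbor, sfid)] = nxt
        return nxt

    def clear(self, neighbor, sfid) -> None:
        # A CLEAR for one SF resets only that SF's counter for that neighbor.
        self._seq[(neighbor, sfid)] = 0
```

Keying on both values means a CLEAR issued by one SF leaves the counters, and hence the transactions, of other SFs with the same neighbor untouched.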

- - - - - - - - - - - - - - -

Section 3.4.6.
Figure 27: "Clear or Reset" - Reset could be ambiguous (device has restarted vs
transaction failed, RC_RESET)

XV: We clarified with:
Clear or After device Reset

Ok, thanks.

- - - - - - - - - - - - - - -

Section 6.2.5.
Consider having Specification required for the range SFID 128-255.
XV: Expert review is a well understood term.

Yes, so is Specification required. Has this policy been discussed on the ML? 

XV: We talked about it informally at some IETF meeting (maybe in Berlin), but as far as I know it was never discussed on the ML. Do you see any problem with it?

- - - - - - - - - - - - - - -

Section 6.2.4.
It would seem more readable to have an RC_ERR_ prefix for errors. It may not
be outright evident that RC_CELLLIST or RC_VERSION is an error.

XV: Thanks for this comment. We renamed them as indicated.

Ok, thanks.

- - - - - - - - - - - - - - -

Overall a rich document, with probably some minor changes to be made.

Best,
Alexander

Thanks again, Xavi, for the prompt reply and for making the necessary changes!

XV: Thanks to you for this detailed and constructive review. We really appreciate it.  
Alex

2018-02-23 1:35 GMT+01:00 Alexander Pelov <a@xxxxxxx>:

_______________________________________________

6tisch mailing list

6tisch@xxxxxxxx

https://www.ietf.org/mailman/listinfo/6tisch

-- 
Dr. Xavier Vilajosana
Wireless Networks Lab
Internet Interdisciplinary Institute (IN3)
Professor
(+34) 646 633 681
xvilajosana@uoc.edu
http://xvilajosana.org
http://wine.rdi.uoc.edu
Parc Mediterrani de la Tecnologia 
Av Carl Friedrich Gauss 5, B3 Building
08860 Castelldefels (Barcelona). Catalonia. Spain

_______________________________________________
IoT-DIR mailing list
IoT-DIR@xxxxxxxx
https://www.ietf.org/mailman/listinfo/iot-dir
