Hi again,

[this time with all comments, including chapters 7 and 10, no changes
in the first part so you can jump to [**CONT**] if you are looking for
the new stuff]

Sorry for the late comments, but I re-read the whole draft, which took
some time. Some clarifications are still necessary and some
inconsistencies remain. Unfortunately, my comments from June 19th have
not been addressed yet:
https://www.ietf.org/mail-archive/web/p2psip/current/msg06229.html
Polina Goltsman also found many of the issues listed below while
working on an implementation. The list is long, but the good news is
that it is mainly clarifications that are missing.

Major issues:

1) sec. 6.3.2: length field in Forwarding Header covers what exactly?
   It is not really clear whether the length field counts the whole
   message or, in case of fragmentation, only the fragment size. When
   reading Sec. 6.7 it seems that the former is meant, but the
   definition could and should be clearer. Similarly, sec. 6.7 should
   be clear about this, e.g., by describing that all Forwarding Headers
   are identical for fragments of the same message with the exception
   of the fragment field.

2) sec. 7.4.1.1: replica number handling is unclear
   It is unclear how replica numbers are incremented (e.g., per peer)
   and what receiving peers should actually do with this number. Is it
   important to store the value, or would a boolean be sufficient so
   that the peer knows that it is a replica?

3) sec. 6.5.2: transport protocol set for AppAttach
   Applications may require the use of transport protocols other than
   those defined in OverlayLinkType (TLS/DTLS; what about plain UDP,
   SCTP, DCCP, etc.?), but currently this does not seem to be possible
   (do the considerations of sec. 6.5.1.6 apply here?).

4) sec. 6.3.4: How can a different signature algorithm be used if not
   all implementations support it?
   There is no way to provide feedback that the signature algorithm is
   not acceptable at a particular node.

5) sec.
10.5: handling of parallel JOIN requests and use of peer_ready
   It is not clear what should happen if two JNs try to join at the
   same time at the same AP (can they be processed in parallel or
   should they be processed in sequence?). Furthermore, when
   MUST/SHOULD a JN send the peer_ready Update - in step 9?

Minor issues:
citations are put first, followed by comments starting with #

sec. 1.1
========

old:
   storage rather than for bulk storage of large objects. Records are
   stored under numeric addresses which occupy the same space as node
new:
   storage rather than for bulk storage of large objects. Records are
   stored under numeric addresses, called Resource-IDs, which occupy
   the same space as node

# Resource-IDs are used in 1.2.2 but not introduced

sec. 1.2
========

   Message Transport: Handles end-to-end reliability, manages request ...

# What are the interactions with Forwarding and Link Management and
# the Topology Plugin?

old:
   Forwarding and Link Management Layer: Stores and implements the
   routing table by providing packet forwarding services between
   nodes. It also handles establishing new links between nodes,
   including setting up connections across NATs using ICE.

# It may be confusing to say "routing table" here, since this
# is usually associated with the topology plugin. So IMHO
# it is the Connection Table, not the routing table.
# Furthermore, I propose the following change:

old:
   including setting up connections across NATs using ICE.
new:
   including setting up connections for overlay links across NATs
   using ICE.

old:
   directly between nodes.
   TLS [RFC5246] and DTLS [RFC6347] are the
   currently defined "link layer" protocols used by RELOAD for hop-
new:
   currently defined "overlay link layer" protocols used by RELOAD
   for hop-

# avoids confusion with the classic ISO/OSI link layer (layer 2)

old:
   In addition to the above components, nodes communicate with a
   central
new:
   In addition to the above components, nodes may communicate with a
   central

# while it may be the default case, it is not strictly required

sec. 1.3
========

old:
   RELOAD also provides an optional shared secret based admission
   control feature using shared secrets and TLS-PSK. In order to form a
new:
   RELOAD also provides an optional shared secret based admission
   control feature using shared secrets and TLS-PSK/TLS-SRP. In order to

# TLS-SRP should be mentioned here, too

sec. 2
======

old:
   Terms used in this document are defined inline when used and are also
new:
   Terms in this document are defined inline when used and are also

# avoid double "used"

old:
   Bootstrap Node: A network node used by Joining Nodes to help locate
   the Admitting Peer.
new:
   Bootstrap Node: A network node used by Joining Nodes to help access
   the overlay by forwarding messages to peers.

# The bootstrap node does not locate the Admitting Peer, but the JN
# locates the AP by routing a message to its own Resource-ID

old:
   Connection Table: The set of nodes to which a node is directly
   connected, which include nodes that are not yet available for
   routing.
new:
   Connection Table: Contains connection information for the set of
   nodes to which a node is directly connected, which include nodes
   that are not yet available for routing.

# it is a data structure ...

   Node-ID: A value of fixed but configurable length that uniquely
   identifies a node. Node-IDs of all 0s and all 1s are reserved and
   are invalid Node-IDs. A value of zero is not used in the wire
   protocol but can be used to indicate an invalid node in
   implementations and APIs.
   The Node-ID of all 1s is used on the wire protocol as a wildcard.

# What does "invalid" mean exactly? A wildcard can be used at least on
# the wire, so it is not invalid when used as a destination Node-ID.
# So the above definition is slightly contradictory.

old:
   Responsible Peer: The peer that is responsible for a specific
   resource, as defined by the plugin algorithm.
new:
   Responsible Peer: The peer that is responsible for a specific
   resource, as defined by the topology plugin algorithm.

old:
   Usage: An usage is the definition of a set of data structures (data
   Kinds) that an application wants to store in the overlay. An usage
   may also define a set of network protocols (application IDs)
new:
   Usage: A usage is the definition of a set of data structures (data
   Kinds) that an application wants to store in the overlay. A usage
   may also define a set of network protocols (application IDs)

# Typo: 2x An usage -> A usage

sec. 3.1
========

old:
   o To determine its position in the overlay topology (if the overlay
     is structured; topology plugins do not need to be structured).
new:
   o To determine its position in the overlay topology (if the overlay
     is structured; overlays do not need to be structured).

# a structured topology plugin is not the same as a structured overlay

   o To determine the set of resources for which the node is
     responsible.

# this isn't necessarily true for unstructured overlays?

old:
   The general principle here is that the security mechanisms (TLS at
new:
   The general principle here is that the security mechanisms ((D)TLS at

sec. 3.2
========

   entity. From the perspective of a peer, a client is a node that has
   connected to the overlay, but has not yet taken steps to insert

# an additional reference to the Connection Table would be nice

sec. 3.2.1
==========

   clients that choose this option need to process Update messages

# Update messages are not introduced yet, so a forward reference to
# section 6.4.2.3 would be helpful

   performing an Attach.
   A client wishing to connect using this mechanism
   with a certificate with multiple Node-IDs can use a Ping (Section
   6.5.3) to probe the Node-ID of the node to which it is connected
   before doing the Attach (Section 6.5.1).

# At this point it is not really clear why this step is necessary.
# Certificates with multiple Node-IDs were not explained yet.
# Furthermore, the reference to Section 6.5.1 should be moved
# to the previous sentence.

sec. 3.3
========

old:
   This section will discuss the capabilities of RELOAD's routing layer,
new:
   This section discusses the capabilities of RELOAD's routing layer,

old:
   Resource-based routing: RELOAD supports routing messages based
   solely on the name of the resource. Such messages are delivered
new:
   Resource-based routing: RELOAD supports routing messages based
   solely on the name of the resource or Resource-ID. Such messages

old:
   Clients: RELOAD supports requests from and to clients that do not
   participate in overlay routing, located via either of the
   mechanisms described above.
new:
   Clients: RELOAD supports requests from and to clients that do not
   participate in overlay routing.

# the addition is not really necessary and may be confusing

old:
   Destination Lists: While in principle it is possible to just
   inject a message into the overlay with a single Node-ID as the
new:
   Destination Lists: While in principle it is possible to just
   inject a message into the overlay with a single Node-ID or a
   single Resource-ID as the

# Resource-ID is also a possible destination

old:
   The basic routing mechanism used by RELOAD is Symmetric Recursive.
new:
   The basic routing mode used by RELOAD is Symmetric Recursive
   Routing (SRR, cf. Section 6.2)

# I would prefer to use the term "mode" (as on p. 28) and it should be
# consistent with the text in section 6.2

old:
   opaque ID X1 which maps internally to [A, B] (perhaps by being an
   encryption of [A, B] and forwards to Z with only X1 as the via list.
new:
   opaque ID X1 which maps internally to [A, B] (perhaps by being an
   encryption of [A, B]) and forwards to Z with only X1 as the via list.

# simple typo

old:
   RELOAD also supports a basic Iterative "routing" mode (where the
new:
   RELOAD also supports a basic Iterative routing mode (where the

old:
   Iterative "routing" is implemented using the RouteQuery method, which
new:
   Iterative routing is implemented using the RouteQuery method, which

old:
   requests this behavior. Note that iterative "routing" is selected
new:
   requests this behavior. Note that Iterative routing is selected

sec. 3.4
========

   pairs. The result is a connection between A and B. At this point, A
   and B MAY send messages directly between themselves without going
   through other overlay peers. In other words, A and B are on each
   other's connection tables. They MAY then execute an Update process,

# MAY is RFC 2119 terminology, which is only defined in section 5
# Besides the Update process, a Join process is also possible.

   order to support this case, some small number of "bootstrap nodes"
   typically need to be publicly accessible so that new peers can

# what "publicly accessible" means exactly is not defined, e.g.:
# typically need to be publicly accessible (i.e., not behind a NAT or
# firewall) ...

old:
   The second case is when a client connects to a peer at an arbitrary
   IP address, rather than to its responsible peer, as described in the
new:
   The second case is when a client connects to a peer at an arbitrary
   Node-ID, rather than to its responsible peer, as described in the

# since the responsible peer may depend on the overlay topology,
# Node-ID seems to be a better fit here

sec.
3.5.2
==========

old:
   When a new peer wishes to join the Overlay Instance, it will need a
   Node-ID that it is allowed to use and a set of credentials which
new:
   When a new peer wishes to join the Overlay Instance, it needs a
   Node-ID that it is allowed to use and a set of credentials which

# not really sure about this change as a non-native speaker

   match that Node-ID. When an enrollment server is used, the Node-ID
   used is the Node-ID found in the certificate received from the

# the mode with self-signed certificates is missing and should
# also be mentioned here

   "bootstrap node". Because this is the first connection the peer
   makes, these nodes will need public IP addresses so that they can be

# may also work if the bootstrap node is directly reachable, e.g.,
# in the same domain
# in this paragraph and the following paragraph "Once a peer"
# is used three times

   past adjacencies which have public IP address and attempt to use them

# inconsistent use of the term "adjacencies"/"adjacent" within the
# document, with different meanings as follows:
# 1.) all directly connected nodes (i.e., all nodes in the Connection
#     Table), e.g., sec. 3.5.2 and 6.4.2.3
# 2.) all peers in the routing table
# 3.) adjacent according to the overlay topology, e.g., sec. 6.4.2.1

sec. 4
======

   limits on size, on the values which may be stored. For many Kinds,
   the set may be restricted to a single value; some sets may be allowed

# what set?

sec. 4.1.2.
===========

old:
   responsibility if the responsible peer fail [Chord].
new:
   responsibility if the responsible peer fails [Chord].

sec. 6
======

old:
   messages. We then describe the symmetric recursive routing model,
   which is RELOAD's default routing algorithm. We then define the
new:
   messages. We then describe the symmetric recursive routing mode,
   which is RELOAD's default routing mode. We then define the

# IMHO the term "mode" fits best; the routing algorithm is defined
# within the topology plugin

sec. 6.1.
=========

old:
   peer SHOULD generate an appropriate error but local policy can
new:
   peer SHOULD generate an appropriate error message but local policy can

   Once the peer has determined that the message is correctly formatted

# what does "correctly formatted" mean exactly?

sec. 6.1.1.
===========

   this node so it MUST verify the signature as described in
   Section 7.1 and MUST pass it up to the upper layers. "Upper
   layers" is used here to mean the components above the "Overlay
   Link Service Boundary" line in the figure in Section 1.2.

# this is somewhat confusing. The text describes what the
# Forwarding and Link Management component does, but what
# other components are meant here?

   state, e.g., by unpacking any opaque IDS.

# I think "any" is incorrect, since other IDs are
# not inserted by this node and so it cannot "unpack" those,
# but only its own opaque IDs

sec. 6.1.2.
===========

   the first entry on the destination list is in the peer's connection
   table, then it MUST forward the message to that peer directly.

# This is probably motivated by clients. A hint to this fact
# may help.

old:
   destination list, it would detect that I is a opaque ID, recover the
new:
   destination list, it would detect that I is an opaque ID, recover the

# Typo fix

old:
   called List Compression. Possibilities for a opaque ID include a
new:
   called List Compression. Possibilities for an opaque ID include a

# Typo fix

   An intermediate node receiving a request from another node MUST
   return a response to this request with a destination list equal to
   the concatenation of the Node-ID of the node that sent the request
   with the via list in the request. The intermediate node normally

# 1.) unclear why an _intermediate_ peer should return a response
#     (if it is not the destination node),
# 2.) the via list must be reversed before concatenating the Node-ID
# so:

old:
   with the via list in the request. The intermediate node normally
new:
   with the reversed via list in the request.
   The intermediate node normally

sec. 6.1.3.
===========

old:
   compressed via list), the peer MUST replace that entry with the
   original via list that it replaced and then re-examine the
new:
   compressed via list), the peer MUST replace that entry with the
   reversed original via list that it replaced and then re-examine the

# the via list must be reversed for responses...

sec. 6.2.
=========

old:
   This Section defines RELOAD's Symmetric Recursive Routing (SRR)
   algorithm, which is the default algorithm used by nodes to route
   messages through the overlay. All implementations MUST implement
   this routing algorithm. An overlay MAY be configured to use
   alternative routing algorithms, and alternative routing algorithms
   MAY be selected on a per-message basis. I.e., a node in an overlay
   which supports SRR and some other routing algorithm called XXX
   might use SRR some of the time and XXX some of the time.
new:
   This Section defines RELOAD's Symmetric Recursive Routing (SRR)
   mode, which is the default mode used by nodes to route
   messages through the overlay. All implementations MUST implement
   this routing mode. An overlay MAY be configured to use
   alternative routing modes, and alternative routing modes
   MAY be selected on a per-message basis. I.e., a node in an overlay
   which supports SRR and some other routing mode called XXX
   might use SRR some of the time and XXX some of the time.

# better use "mode" for consistency (the algorithm is contained in the
# topology plugin)

sec. 6.2.1.
===========

old:
   node MAY also construct a more complicated destination list for
   source routing.
new:
   node MAY also construct a more complicated destination list for
   (loose) source routing.

   Once the message is constructed, the node sends the message to some
   adjacent peer. If the first entry on the destination list is
   directly connected, then the message MUST be routed down that
   connection. Otherwise, the topology plugin MUST be consulted to
   determine the appropriate next hop.
# "adjacent peer" should be "adjacent node", since it may be a client
# "directly connected" means: a valid entry in the Connection Table
# exists? this should be mentioned here...

sec. 6.2.2.
===========

# a hint that the same Transaction-ID as in the request MUST be used
# could be added.

sec. 6.3.1.1.
=============

   Unless a given structure that uses a select explicitly allows for
   unknown types in the select, any unknown type SHOULD be treated as an

# How is that allowance specified in the specification? Is it by the
# comment
# /* This structure can be extended */ ?

sec. 6.3.2.
===========

   receive message with a TTL greater than the current value of
   initial-ttl (or the 100 default) MUST discard the message and send
   an "Error_TTL_Exceeded" error.

# what if the initial-ttl is larger than 100 and the TTL is > 100 but
# < initial-ttl? The condition "or the 100 default" holds

old:
   used to indicate the fragment offset; see Section 6.7.
new:
   used to indicate the fragment offset in bytes; see Section 6.7.

old:
   length: The count in bytes of the size of the message, including
   the header.
new:
   length: The count in bytes of the size of the whole unfragmented
   message, including the header.

# as already mentioned in the beginning this should be more precise

   destinations which the message should pass through. The
   destination list is constructed by the message originator. The

# is it allowed that intermediate peers add destinations? if not,
# please state so

old:
   next. The list shrinks as the message traverses each listed peer.
new:
   next. The list may shrink as the message traverses each listed peer.

# the list need not always shrink with each traversed peer

sec. 6.3.2.2.
=============

old:
   structure with a DestinationType of opaque_id_type and a opaque_id
new:
   structure with a DestinationType of opaque_id_type and an opaque_id

# typo fix

old:
   opaque
      A compressed list of Node-IDs and an eventual Resource-ID.
      Because this value was compressed by one of the peers, it is only
      meaningful to that peer and cannot be decoded by other peers.
      Thus, it is represented as an opaque string.

   resource
      The Resource-ID of the resource which is desired. This type MUST
      only appear in the final location of a destination list and MUST
      NOT appear in a via list. It is meaningless to try to route
      through a resource.
new:
   resource
      The Resource-ID of the resource which is desired. This type MUST
      only appear in the final location of a destination list and MUST
      NOT appear in a via list. It is meaningless to try to route
      through a resource.

   opaque_id_type
      A compressed list of Node-IDs and an eventual Resource-ID.
      Because this value was compressed by one of the peers, it is only
      meaningful to that peer and cannot be decoded by other peers.
      Thus, it is represented as an opaque string.

# 1.) match the order in the select
# 2.) it must be opaque_id_type, not opaque

sec. 6.3.2.3
============

   flags
      Three flags are defined FORWARD_CRITICAL(0x01),
      DESTINATION_CRITICAL(0x02), and RESPONSE_COPY(0x04). These flags
      MUST NOT be set in a response. If the FORWARD_CRITICAL flag is

# What is the correct reaction if these flags are set in a response?
# (returning an Error_Invalid_Message or ignore?)

sec. 6.3.3.1
============

old:
   A node processing a request MUST return its status in the
   message_code field. If the request was a success, then the message
new:
   A node processing a request MUST return its status in the
   message_code field of a response. If the request was a success,
   then the message

# clarification?

   Error_Request_Timeout: A response to the request has not been
   received in a suitable amount of time. The requesting node MAY
   resend the request at a later time.

# not clear when this will ever be used. Which node should send
# this error message?

   All RELOAD messages MUST be signed. Intermediate nodes do not verify
   signatures.
   Upon receipt (and fragment reassembly if needed) the
   destination node MUST verify the signature and the authorizing
   certificate. If the signature fails, the implementation SHOULD
   simply drop the message and MUST NOT process it. This check provides

# What happens if none {0,0} is given? Then no Node-ID is present...

sec. 6.4.2.
===========

# What happens if these messages (like Join, Leave) are accidentally
# sent to a Client? Do they send an Invalid Message back?

   A new peer (but one that already has credentials) uses the JoinReq
   message to join the overlay. The JoinReq is sent to the responsible
   peer depending on the routing mechanism described in the topology

# Is the destination address now the Resource-ID or the Node-ID of the
# "responsible peer"?

   Because joins may only be executed between nodes which are directly
   adjacent, receiving peers MUST verify that any JoinReq they receive

# 1.) should be peers rather than nodes
# 2.) directly adjacent means here: directly adjacent in the overlay
#     topology (could otherwise be misunderstood as being directly
#     connected)
# 3.) what must happen if the verification fails?

   adjacent, receiving peers MUST verify that any LeaveReq they receive
   arrives from a transport channel that is bound to the Node-ID to be

# what happens if that verification fails?

old:
   assumed by the leaving peer.) This also prevents replay attacks
new:
   assumed by the leaving peer. This also prevents replay attacks

# Typo fix

sec. 6.4.2.3
============

   the state change. In general, peers send Update messages to all
   their adjacencies whenever they detect a topology shift.

# A hint to the Connection Table and Clients would clarify

sec. 6.4.2.4.
=============

old:
   X. A RouteQuery can also request that the receiving peer initiate an
new:
   X. A RouteQuery can also request that the receiving peer initiates an

old:
   One important use of the RouteQuery request is to support iterative
   routing.
   The sender selects one of the peers in its routing table

# add reference to Section 3.3.

sec. 6.4.2.4.1
==============

   destination
      The destination which the requester is interested in. This may
      be any valid destination object, including a Node-ID, opaque ID,
      or Resource-ID.

# Does opaque ID make sense here?

sec. 6.5.1
==========

   A node sends an Attach request when it wishes to establish a direct
   TCP or UDP connection to another node for the purpose of sending

# TCP/TLS or DTLS?

old:
   node A has Attached to node B, but not received any Updates from B,
new:
   node A has attached to node B, but not received any Updates from B,

# Typo

   channel but MUST NOT route messages through B to other peers via that
   channel. The process of Attaching is separate from the process of

# Is that also true for clients?

sec. 6.5.1.1
============

old:
   } AttachReqAns;

   The values contained in AttachReqAns are:
new:
   } AttachReq;

   The values contained in AttachReq are:

old:
   A single AttachReqAns MUST NOT include both candidates whose
new:
   A single AttachReq MUST NOT include both candidates whose

# consistency!

sec. 6.5.1.2.
=============

old:
   6.5.1.2. Response Definition
new:
   6.5.1.2. Response Definition

   The AttachAns message has the same format as the AttachReq message.

# s/AttachReqAns/AttachAns/ in the whole paragraph.

sec. 6.5.1.3.
=============

   An agent follows the ICE specification as described in [RFC5245] with

# agent was not defined in the RELOAD context so far.

sec. 6.5.4.2.
=============

old:
   o The configuration document is correctly digitally signed (see
     Section 11 for details on signatures.
new:
   o The configuration document is correctly digitally signed (see
     Section 11 for details on signatures).

old:
   one listed in the current configuration file). Details on kind-
   signer field in the configuration file is described in Section 11.1.
new:
   one listed in the current configuration file). Details on kind-
   signer field in the configuration file are described in Section 11.1.

# typos

sec.
6.6.2
==========

old:
   Each connection has it own sequence number space. Initially the
new:
   Each connection and direction has its own sequence number space.
   Initially the

sec. 6.6.3.1
============

   A node MUST NOT have more than one unacknowledged message on the DTLS
   connection at a time. Note that because retransmissions of the same
   message are given new sequence numbers, there may be multiple
   unacknowledged sequence numbers in use.

# Since retransmissions violate the first sentence, it may be better
# to use:

old:
   A node MUST NOT have more than one unacknowledged message on the DTLS
   connection at a time. Note that because retransmissions of the same
new:
   A node MUST NOT have more than one unacknowledged message on the DTLS
   connection at a time, except for retransmissions. Note that because

   from the routing table. The link MAY be restored to the routing
   table if ACKs resume before the connection is closed, as described

# this should read Connection Table twice?

sec. 6.7
=========

   and fragments it, each fragment has a full copy of the Forwarding

# 1.) a clarification would be good that the length field covers the
#     total msg length, and which fields are different in the "copies"
#     (at least the fragment field)
# 2.) what happens if overlapping fragments are received?

[**CONT**]

sec. 7
======

   data may be stored in a single transaction, rather than querying for
   the value of a counter before the actual store.

# a reader may wonder which counter is meant here - slightly confusing.
# I assume that the text assumes a hypothetical different design that
# uses counters as an alternative to timestamps?

   If a node attempting to store new data in response to a user request
   (rather than as an overlay maintenance operation such as occurs when
   healing the overlay from a partition) is rejected with an
   Error_Data_Too_Old error, the node MAY elect to perform its store
   using a storage_time that increments the value used with the previous
   store.
# I don't understand what "using a storage_time that increments the
# value used with the previous store" actually means. In this case,
# is it assumed that the requesting node already has the storage_time
# of the previous store available, or must it send a StatReq first?
# In the former case it could simply check if localtime > storage_time
# before sending the request. By what amount should the value be
# incremented?

sec. 7.4
========

old:
   o Store values in the overlay
   o Fetch values from the overlay
   o Stat: get metadata about values in the overlay
   o Find the values stored at an individual peer
new:
   o Store: store values in the overlay
   o Fetch: get values from the overlay
   o Stat: get metadata about values in the overlay
   o Find: get values stored at an individual peer

# just consistency

sec. 7.4.1.1
============

   which represents a sequence of stored values for a given Kind. The
   same Kind-ID MUST NOT be used twice in a given store request. Each

# what is the proper reaction if this condition is violated by a
# sending node?

   replica_number
      The number of this replica. When a storing peer saves replicas to
      other peers each peer is assigned a replica number starting from 1
      and sent in the Store message. This field is set to 0 when a node
      is storing its own data. This allows peers to distinguish replica
      writes from original writes.

# must it be "When a storing (responsible) peer saves replicas to"?
# IMHO this description is too vague, since,
# as stated in 2) at the beginning of this mail, it is not clear
# how to assign the replica numbers when storing replicas.
# Let's assume responsible peer X stores replicas at neighbors
# A and C: does A get replica number 1 and C replica number 2?
# If node B replaces neighbor C, does B get replica number 3 or
# number 2?
# If the responsible peer X is replaced by a new node Y, Y will
# get the data from peer X, but how should X change its replica number?
# Can Y simply start sending replicas beginning at 1?
# I couldn't find any place in the document where it would make a
# difference between replica number 2 and replica number 42

old:
   kind
      The Kind-ID. Implementations MUST reject requests corresponding
      to unknown Kinds.
new:
   kind
      The Kind-ID. Implementations MUST reject (Error_Unknown_Kind)
      requests corresponding to unknown Kinds.

# hint on the correct error message for this check

old:
   values
      The value or values to be stored. This may contain one or more
      stored_data values depending on the data model associated with
new:
   values
      The value or values to be stored. This may contain one or more
      StoredData values depending on the data model associated with

# consistency

old:
   o The signatures over each individual data element (if any) are
new:
   o The signatures over each individual StoredData element (if any) are

# consistency

   o For original (non-replica) stores, the peer MUST check that if the
     generation counter is non-zero, it equals the current value of the
     generation counter for this Kind. This feature allows the
     generation counter to be used in a way similar to the HTTP Etag
     feature.

# what is the proper reaction in case the check fails? A reference to
# the mentioned HTTP Etag feature would also be nice.

   o The storage time values are greater than that of any value which
     would be replaced by this Store.

   o The size and number of the stored values is consistent with the
     limits specified in the overlay configuration.

# what is the reaction in case the checks fail?

   If all these checks succeed, the peer MUST attempt to store the data

# this seems to suggest that the list is complete, though more checks
# are described in the following section 7.4.1.2. Maybe it's more
# consistent to have them described in one place.

sec.7.4.1.2.
=============

   replicas
      The list of other peers at which the data was/will be replicated.
      In overlays and applications where the responsible peer is
      intended to store redundant copies, this allows the storing node
      to independently verify that the replicas have in fact been
      stored. It does this verification by using the Stat method (see
      Section 7.4.3). Note that the storing node is not required to
      perform this verification.

# what is meant by the term "storing node"? Is this the responsible
# node that actually stores the data, or is it the node that originated
# the StoreReq? This may be inconsistent with the rest of the document.
# Moreover, it is not clear whether this implies a certain order. Must
# the responsible node have successfully finished storing the replicas
# before returning the replicas?

   If any type of request tries to access a data Kind that the peer does
   not know about, an Error_Unknown_Kind MUST be generated. The

# The reaction is defined differently for StoreReq, FetchReq, StatReq
# and FindReq:
# - StoreReq: send back Error_Unknown_Kind
# - FetchReq: "Implementations SHOULD reject requests corresponding to
#   unknown Kinds unless specifically configured otherwise."
#   -> what does "reject" mean exactly?
# - StatReq: no hint given how to behave
# - FindReq: "If a Kind-ID is not known, then the corresponding
#   Resource-ID MUST be 0." -> Cannot be distinguished from the case
#   where a Resource-ID is not known:
#   "The closest Resource-ID to the specified Resource-ID. This is 0
#   if no Resource-ID is known."
#
# Is the intention to trigger Configuration update requests?
# cf. StoreReq: A node which
# receives this error MUST generate a ConfigUpdate message which
# contains the appropriate Kind definition (assuming that in fact a
# Kind was used which was defined in the configuration document).

sec. 7.4.1.3
=============

   remove a value, the owner stores a new DataValue with "exists" set to
   False:

      exists = False
      value = {} (0 length)

# what happens if a value is given?
# Ignore the value, or reject with an error (invalid message)?

sec. 7.4.2.2.
=============
old:
   uint64 generation;
new:
   uint64 generation_counter;
and
old:
generation
   the generation counter for this Kind.
new:
generation_counter
   the generation counter for this Kind.
# consistency

sec. 7.4.4.1:
=============
kinds
   The desired Kind-IDs.  Each value MUST only appear once, and if
   not the request MUST be rejected with an error.
# What is the error code?

sec. 7.4.4.2:
=============
   The response is simply a series of FindKindData elements, one per
   Kind, concatenated end-to-end.  The contents of each element are:
# it is unclear what "end-to-end" means here

sec. 10
=======
   protocol.  A short list of differences:
# A further difference is the fact that CHORD uses hashes of IP addresses
# while in RELOAD nodes get assigned Node-IDs and the routing table
# does not contain "underlay addresses"

sec. 10.1
=========
old:
   and n+1, with all arithmetic being done modulo 2^{k}, where k is the
   length of the Node-ID in bits, so that node 2^{k} - 1 is directly
new:
   and n+1, with all arithmetic being done modulo 2^(k), where k is the
   length of the Node-ID in bits, so that node 2^(k) - 1 is directly
# consistency with other formulas...

sec. 10.3
=========
   If a peer is not responsible for a Resource-ID k, but is directly
   connected to a node with Node-ID k, then it MUST route the message to
# what does "directly connected" mean here? Is it an entry in the
# Connection Table or the Routing Table? I guess the former in order to
# cover routing to Clients

sec. 10.5
=========
   1.  JN MUST connect to its chosen bootstrap node.
# it is unclear what "connect" exactly means. This is actually
# described in Sec. 11.4:
# "When contacting a bootstrap node, the joining node MUST first form
# the DTLS or TLS connection to the bootstrap node and then sends an
# Attach request over this connection with the destination Node-ID set
# to the joining node's Node-ID."
# So my suggestion is to explain it a little bit more or to put a forward
# reference to section 11.4.

   2.
       JN SHOULD send an Attach request to the admitting peer (AP) for
       Node-ID n.  The "send_update" flag can be used to acquire the
       routing table for AP.
# This is inconsistent with Section 12 stating
# "JN then sends an Attach through that peer to a resource ID of itself (JN)."
# (Node ID n vs. Resource-ID n)
#
# Moreover, it is not clear how the JN finds the AP. This is explained in
# Section 12:
#   to one of the bootstrap nodes.  JN then sends an Attach through that
#   peer to a resource ID of itself (JN).  It gets routed to the
#   admitting peer (AP) because JN is not yet part of the overlay.  When
# I suggest changing it to:
   2.  JN SHOULD send an Attach request to the Resource-ID of itself
       (JN) in order to contact the admitting peer (AP) for Node-ID n.
       The "send_update" flag can be used to acquire the routing table
       from AP.
# Question: should it be "acquire the routing table of AP" or "acquire the
# routing table from AP"?

   3.  JN SHOULD send Attach requests to initiate connections to each
       of the peers in the neighbor table as well as to the desired
       finger table entries.  Note that this does not populate their
       routing tables, but only their connection tables, so JN will
       not get messages that it is expected to route to other nodes.
# here it is unclear that the Attach requests are sent via the AP
# and what "desired" finger table entries means, e.g., in contrast to all?
# why do we have two SHOULDs instead of two MUSTs, since 10.7 states:
#   A peer MUST maintain an association (via Attach) to every member of
#   its neighbor set.  A peer MUST attempt to maintain at least three

   In order to set up its i'th finger table entry, JN MUST send an
   Attach to peer n+2^(128-i).  This will be routed to a peer in
# instead of "peer n+2^(128-i)", wouldn't "Resource-ID n+2^(128-i)" be better?

# How should an AP react if it detects that a JN sent
# the Join to the wrong AP (not responsible at all or not
# anymore)?

sec. 10.7.
==========
   A peer MUST maintain an association (via Attach) to every member of
   its neighbor set.  A peer MUST attempt to maintain at least three
# why only to the neighbor set and not the whole routing table?

Sec. 10.7.1:
============
   Every time a connection to a peer in the neighbor table is lost (as
   determined by connectivity pings or the failure of some request), the
# I think that "connectivity pings" means periodically sent Pings to all
# members of the Connection Table by the Link Management Layer as
# mentioned in sec. 6.5. A reference may help here.
# However, I couldn't find out how often such "connectivity pings" should
# be sent (the chord-ping-interval is only related to pinging finger table
# entries).

# It is unclear to me why (sec. 10.7.1) the spec states:
# "peer MUST remove the entry from its neighbor table and replace it
# with the best match it has from the other peers in its routing table.
# If using reactive recovery, it then sends an immediate Update to all
# nodes in its Neighbor Table."
# so the peer replaces the failed entry with one of its fingers,
# which is maybe a poor choice, and sends this information to
# its neighbors. IMHO it makes much more sense to request routing
# information from the neighbors in order to get a better replacement
# for the lost peer, instead of "confusing" the neighbors with more
# inaccurate routing information.

   If the neighbor failure affects the peer's range of responsible IDs,
   then the Update MUST be sent to all nodes in its Connection Table.
# a short hint that it's the Connection Table instead of the Neighbor Table
# due to Clients would be nice

   If connectivity is lost to all successor peers in the neighbor table,
   then this peer SHOULD behave as if it is joining the network and MUST
   use Pings to find a peer and send it a Join.  If connectivity is lost
# Only Join or Attach and Join?

Sec.
10.7.2.:
=============
   If a finger table entry is found to have failed, all references to
# how is it determined to have "failed"? 10.7.1 is only
# related to neighbor failures...does the same definition
# apply here?

Sec. 10.7.3.:
=============
   If the neighbor failure affects the peer's range of responsible IDs,
   then the Update MUST be sent to all nodes in its Connection Table.
# add hint that it's Connection Table due to Clients

   interval" element (denominated in seconds.)  A peer SHOULD randomly
   offset these Update requests so they do not occur all at once.
# This is a little bit too imprecise.
# How should this be done? A typical mechanism
# is using a randomly chosen offset from an interval [0.5*Tp,1.5*Tp]
# where Tp is the target period, cf.
# S. Floyd, V. Jacobson, "The Synchronization of Periodic Routing
# Messages", IEEE/ACM Transactions on Networking, Vol. 2, No. 2, April
# 1994, http://portal.acm.org/citation.cfm?id=187045

Sec. 10.7.4.2.:
==============
   A peer SHOULD NOT send Ping requests looking for new finger table
   entries more often than the configuration element
   "chord-ping-interval", which defaults to 3600 seconds (one per hour).
# This paragraph should probably be moved some paragraphs down as it
# has not been stated yet that a peer should actually send Ping requests
# at all (this is explained in the subsequent paragraphs).

Sec 12:
========
# throughout the figures: "Update" should be "UpdateReq"
# and Attach should be AttachReq?
# Otherwise it's not consistent with the use of Attach vs. AttachReq/AttachAns

Neighbor Table and Connection Table are written sometimes in lower case
and sometimes in upper case style. I'm sure that the RFC editor
will ask for consistency in writing...

That's all for now...

Regards,
 Roland
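
P.S. To make the comment on sec. 10.7.3 a bit more concrete, here is a
minimal sketch of the randomization mentioned there, assuming the delay
between two periodic Updates is drawn uniformly from [0.5*Tp, 1.5*Tp].
The function name, its parameters and the 600-second period in the usage
lines are made up for illustration; none of them come from the draft.

```python
import random
from typing import Optional


def jittered_delay(target_period: float,
                   rng: Optional[random.Random] = None) -> float:
    """Return the delay (in seconds) until the next periodic Update.

    Assumption (not from the draft): the delay is drawn uniformly from
    [0.5 * Tp, 1.5 * Tp], so its mean stays at the target period Tp.
    """
    if target_period <= 0:
        raise ValueError("target_period must be positive")
    if rng is None:
        rng = random.Random()
    return rng.uniform(0.5 * target_period, 1.5 * target_period)


if __name__ == "__main__":
    # 600.0 is an illustrative period, not a value taken from the draft.
    rng = random.Random(42)
    for _ in range(3):
        print(round(jittered_delay(600.0, rng), 1))
```

Drawing a fresh delay after every Update, rather than one fixed offset
per peer, is what keeps peers from drifting into lockstep over time; a
specification could of course choose a different interval or
distribution.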