Hi Thomas,

Thanks for the careful reading and review. I think we can deal with all your comments without difficulty. Just two possible discussion points in line below.

Regards
Brian

On 07-Dec-21 03:58, Thomas Fossati via Datatracker wrote:
Reviewer: Thomas Fossati
Review result: Ready with Issues

I am the assigned Gen-ART reviewer for this draft. The General Area Review Team (Gen-ART) reviews all IETF documents being processed by the IESG for the IETF Chair. Please treat these comments just like any other last call comments.

For more information, please see the FAQ at <https://trac.ietf.org/trac/gen/wiki/GenArtfaq>.

Document: draft-ietf-anima-asa-guidelines-??
Reviewer: Thomas Fossati
Review Date: 2021-12-06
IETF LC End Date: 2021-12-13
IESG Telechat date: Not scheduled for a telechat

Summary:

The document contains guidance for building ASAs. It discusses different kinds of requirements and their impact on the software architecture. It looks like a useful doc to have.

In general, the document reads very well, with the exception of Section 6 - see "Minor issues" below.

Major issues:

Minor issues:

In Section 6.3, I have followed the reference to draft-peloso-anima-autonomic-function and I noticed that the content of Section 6 has been transplanted nearly as-is from there. So, to avoid redundancy, I wonder whether that content should be given the same treatment as you do in Section 7 WRT draft-ciavaglia-anima-coordination? Or maybe you want to re-think the approach and have Section 7 do similar copy&paste from draft-ciavaglia-anima-coordination? They are both individual and expired drafts after all, so it's probably better doing the latter.

I also wonder whether it is worth spelling out explicitly the fact that, given ASAs may need to co-exist with the actual networking application, they should be built to require a minimal memory footprint and, in general, use system resources with parsimony. A related question is whether ASAs require dedicated system resources in order to continue operating in a busy system?
Generally we expect that ASAs will run at a much lower frequency than any "production" workload in the node, so CPU load should not be a big issue, but memory footprint in a constrained node is certainly a concern. We tend to assume that ASAs will be mainly installed in non-constrained devices, or that if they are in a constrained device, they'll have a subset of functionality. Officially, we punted on that issue - RFC8993 says "At a later stage, the ANIMA Working Group may define a scope for constrained nodes with a reduced ANI and well-defined minimal functionality."
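As a rough illustration of the low duty cycle described above, here is a minimal Python sketch of an ASA main loop that waits between cycles. The names `run_asa`, `do_cycle` and `wait_seconds` are hypothetical, invented for this example; they are not taken from the draft or from any GRASP API.

```python
import time


def run_asa(do_cycle, wait_seconds=5.0, max_cycles=None, sleep=time.sleep):
    """Call do_cycle() repeatedly, waiting between cycles.

    Hypothetical sketch: do_cycle stands in for whatever work the ASA
    performs per iteration. max_cycles=None means "run forever"; the
    sleep parameter is injectable so the loop can be tested quickly.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        do_cycle()
        cycles += 1
        # Yield the CPU so the node's production workload is not starved.
        sleep(wait_seconds)
    return cycles


if __name__ == "__main__":
    # Three short cycles, just to show the call shape.
    run_asa(lambda: print("cycle"), wait_seconds=0.1, max_cycles=3)
```

The wait only addresses CPU use; it does nothing for the memory footprint concern.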
Nits/editorial comments:

Section 2.

* Repeatedly flood an objective to the AN, so that any ASA can

Expand "AN" on first use.

These threads should all either exit after their job is done, or enter a wait state for new work, to avoid blocking others unnecessarily.

"blocking others unnecessarily" is not what would typically happen, maybe "to avoid wasting system resources"?

[...] It should also do whatever is required to avoid unnecessary resource consumption, such as including an arbitrary wait time in each cycle of the main loop.

I am not sure what "arbitrary wait time" refers to? Is it a "sleep(n)" at the end of each iteration of the main loop? I think it's the parsimony principle that you want to highlight here, and the first part of the sentence is sufficient for capturing that without going into concrete examples.

Section 3.3

This API is intended to support the various interactions expected between most ASAs, such as the interactions outlined in Section 2. However, if ASAs require additional communication between themselves, they can do so using any desired protocol, even just a TLS session if that meets their needs. One option is to use GRASP discovery and

What is the meaning of "just" in "just a TLS session"? Also it's not clear what kind of messages would flow through this additional channel and if there are any requirements in terms of their security properties.

[...] As noted above, the ACP can secure such communications, unless there is a good reason to do otherwise.

Maybe s/can/should/ and drop "unless ... otherwise"?

Section 6.1.1.

The typography used here to define inputs is a bit odd. And in general the whole section probably needs some more attention from an editorial point of view.

Section 6.2

the agent piece of code (when this does not start automatically) and

Maybe drop "piece of".
Section 6.2.1

The operator's goal can be summarized in an instruction to the ANIMA ecosystem matching the following format:

[instances of ASAs of a given type] ready to control [Instantiation_target_Infrastructure] with [Instantiation_target_parameters]

Maybe better to move this to the beginning of Section 6.2.2.

Section 6.2.3

As in Section 6.1.1, the typographic style used here is a bit odd / unconventional.

Section 6.3

Note: This section is to be further developed in future revisions of the document, especially the implications on the design of ASAs.

Is this note still valid? (I hope not :-) )

Section 10

of robustness that ASA designers should consider

Maybe stick a colon at the end of the line.

1. If despite all precautions, an ASA does encounter a fatal error, it should in any case restart automatically and try again. To mitigate a hard loop in case of persistent failure, a suitable

Terminology: what do you mean by "hard loop"?

8. On the other hand, the definitions of GRASP objectives are very likely to be extended, using the flexibility of CBOR or JSON. Therefore, ASAs should be able to deal gracefully with unknown components within the values of objectives.

Is this in line with Section 6 of draft-iab-protocol-maintenance? I.e., has GRASP clearly defined extensibility rules, or is this a call for the ASA implementation to apply the robustness principle?

At a slightly more general level, ASAs are not services in themselves, but they automate services. This has a fundamental impact on how to design robust ASAs. In general, when an ASA observes a particular state [1] of operations of the services/

"[1]" looks like a bib reference, please consider using an alternative typography, e.g., "(1)", or "A"

Section 11

ASAs are intended to run in an environment that is protected by the Autonomic Control Plane [RFC8994], admission to which depends on an initial secure bootstrap process such as [RFC8995].
s/such as BRSKI [RFC8995]/

In particular, they must use secure techniques and carefully validate any incoming information.

"secure techniques" could be unpacked a bit, for example: "secure coding practices" (e.g., input validation, least privilege, etc.), "secure configuration practices" (e.g., default deny).

Appendix C

An implementation requirement is that resource pools are kept in stable storage. Otherwise, if a delegator exits for any reason, all the resources it has obtained or delegated are lost. If an origin exits, its entire spare pool is lost. The logic for using stable storage and for crash recovery is not included in the pseudocode below.

Is there a further requirement for the storage to be shared across all ASAs? What I am wondering is whether a shared global map of the current resource allocations exists to help reconstruct a partitioned topology (in case one ASA disappears)? Or is the delegated resource recall, in case the ASA delegator fails, handled by GRASP?
I think the answer depends on the resource. For the one that we fully defined (IP address prefixes, RFC8992) there certainly needs to be a solid logging and recovery mechanism, as there is for traditional IPAM systems. Since GRASP operations are not intrinsically idempotent, that must be done by the ASAs. I don't think it can be a single global map, because it has to survive network partition and reconnection. The global map could be constructed if necessary from the log in each ASA.

On the other hand, if the resource being shared is upstream network capacity from a given router, which is shared among many downstream routers, there is no need for a global map.
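To illustrate the point about constructing a global map from the log in each ASA, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the log format (a list of (action, resource) pairs with "acquire" and "release" actions), the ASA names, and the function names are hypothetical and come from neither RFC8992 nor the draft.

```python
from collections import defaultdict


def holdings_from_log(log):
    """Replay one ASA's log and return the set of resources it now holds.

    Assumed (hypothetical) log format: a list of (action, resource)
    pairs, where action is "acquire" or "release".
    """
    held = set()
    for action, resource in log:
        if action == "acquire":
            held.add(resource)
        elif action == "release":
            held.discard(resource)
        else:
            raise ValueError(f"unknown log action: {action!r}")
    return held


def build_global_map(logs_by_asa):
    """Merge per-ASA logs into a map: resource -> sorted list of holders."""
    holders = defaultdict(list)
    for asa in sorted(logs_by_asa):
        for resource in holdings_from_log(logs_by_asa[asa]):
            holders[resource].append(asa)
    return dict(holders)


def conflicts(global_map):
    """Resources claimed by more than one ASA, e.g. after a partition."""
    return {r: h for r, h in global_map.items() if len(h) > 1}


if __name__ == "__main__":
    # Example logs using IPv6 documentation prefixes.
    logs = {
        "asa-a": [("acquire", "2001:db8:1::/48"),
                  ("acquire", "2001:db8:2::/48"),
                  ("release", "2001:db8:2::/48")],
        "asa-b": [("acquire", "2001:db8:2::/48")],
    }
    global_map = build_global_map(logs)
    print(global_map)
    print(conflicts(global_map))
```

Because the map is derived on demand from logs that each ASA keeps locally, it does not itself have to survive partition and reconnection; the conflict check is one way inconsistencies could be surfaced after the logs are merged.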