Hi Ted,
Please see inline.
Cheers,
Med

From: Ted Hardie [mailto:ted.ietf@xxxxxxxxx]
wing are missing from the document:
Sorry, I thought you were asking what wgs or protocols planned to reference this. For that, I don't know.

[Med] OK. IMHO, lacking such considerations, there is a high risk that the advice will be lost or that it can be used as a permanent DISCUSS point in later stages of preparing documents. I’d prefer actionable points that WGs and document authors can consider in early stages.
The intent is that it is information useful to those considering whether restoring metadata lost to encryption in mid-network is the right way to go.

[Med] This is another assumption in the document that I disagree with: you seem to assume that an on-path device that inserts metadata is necessarily RESTORING that information. This is not true for many efforts:

· A Forward-For header inserted by a proxy does not restore any data; it only reveals data that is already present in the packet issued by the client itself.

· An address sharing device, for example under DS-Lite (RFC6333), that inserts the source IPv6 prefix in the TCP HOST_ID option (RFC7974) is not RESTORING any data. The content of that TCP option is already visible in the packet sent by the host.

· The Service Function Chaining WG (https://datatracker.ietf.org/wg/sfc/about/) is defining an architecture to communicate metadata by on-path devices; that metadata is inserted at the network side. Border nodes will make sure that data is stripped before forwarding packets to the ultimate destinations. The metadata can be a subscriber-id, a policy-id, etc.

So when draft-hardie-* says:

“Do not add metadata to flows at intermediary devices unless a positive affirmation of approval for restoration has been received from the actor whose data will be added.”

(1) Do you assume that the examples I listed above fall under your advice?
(2) How will an on-path device know that the data it intends to insert is a “restoration”?
(3) Does it mean that for new data (i.e., data that is not a restoration), on-path devices are free to do whatever they want? For me, this is undesirable. There is a void there. A statement requiring those networks to avoid leaking privacy information must be included.
Another assumption is made here:
Instead, design the protocol so that the actor can add such metadata themselves so that it flows end-to-end, rather than requiring the action of other parties. In addition to improving privacy, this approach ensures consistent availability between the communicating parties, no matter what path is taken.

This text claims that having the endpoint provide the data ensures “consistent availability” of that information. This breaks for a multi-homed host that uses, for example, the Forward-For header: obviously, the content of the header, if injected by the endpoint, will depend on the path. One way to ensure “consistent availability” is to insert many Forward-For headers, each enclosing the content that is specific to a given network attachment. But doing that raises a privacy concern because the remote server can track clients.
Sorry, did you mean "do not think we need more"?

[Med] I meant we need more than only highlighting the issue. We need something which is actionable. Requiring a Privacy Section in every RFC may be a direction to consider.

If so, I obviously disagree. This design pattern is used uncritically enough that a brief document describing why it isn't safe still seems to me useful. Were it incorporated into a more general document (as noted before), that would also work. If it later is, that more general work could obsolete this (though that's a bid for an informational document).
There are certainly some protocol designers that have internalized this, but my experience has been that this is not always the case. In a fair few cases, folks deploy methods like this because they see encryption of metadata in data integrity terms or see aggregation only in terms of data usage minimization. They restore the metadata mid-network because it is the quickest solution for them to get back to the status quo ante for their understanding of the system.

[Med] I hear you. What would be the harm if those solutions stripped that information before sending it to the server? If they don’t strip it, this means that either the information can be parsed and used by the server, or at least its presence does not lead to session failures. If the server parses and uses that information, then its presence is important for the service to be delivered. In that case, the question is why the client does not supply that information in the first place.
You are certainly correct that many deployed protocols would find it hard to retrofit this consent model into their existing flows. This is, however, advice for folks at the design phase. If RFC 6788 were being written after the publication of this document, its authors might well have looked at the protocol mechanics in section 5.2:

The AN intercepts and then tunnels the received Router Solicitation in a newly created IPv6 datagram with the Line-Identification Option (LIO). The AN forms a new IPv6 datagram whose payload is the received Router Solicitation message as described in [RFC2473], except that the Hop Limit field of the Router Solicitation message MUST NOT be decremented.

and asked whether the circuit identifier corresponding to the logical
I don't know, frankly, which choice is right in this case, but I would prefer that
Okay, how about the following text being added to section 5:

There are also tensions with latency of operation. For example, where the end system does not initially know the information which would be added by on-path devices, it must engage the protocol mechanisms to determine it. Determining a public IP address to include in a locally supplied header might require a STUN exchange, and the additional latency of this exchange discourages deployment of host-based solutions. To minimize this latency, engaging those mechanisms may need to be done in parallel with or in advance of the core protocol exchanges with which this metadata would be supplied.

[Med] Looks good to me. Thanks.
I don't think that emergency service recipients shifting to an example works here, because it broadens the carve out. In the emergency services case, the resources consumed are fire trucks, ambulances, and swat teams. For other servers, resources consumed could simply be CPU cycles or disk; that's really not the same. Balancing location consent requirements against one was agreed; balancing it against the other was not.

[Med] Resources are not necessarily restricted to CPU or disk; they may also include granting access to the service (e.g., downloading a file when a quota per source address is enforced). It can be whatever the servers consider to be critical for them; it is up to the service design to characterize it. The NEW wording proposed above is technically correct. Please reconsider adding it to the draft.
I agree that it has its own privacy risks, but I don't think this is the document that should explore them.
[Med] You don’t need to explore them, but please add one or two sentences as a reminder that privacy leaks are still a valid concern even if only clients are supplying data without the help of an on-path network device.
Broadening this a bit, you're looking at two cases: one in which the data the host has is wrong and one in which there is an adversarial relationship. For the first case, we can add text saying that when an end system supplies data it is the end system's responsibility to ensure that it is correct; don't use a STUN result from last week as fresh, for example.

[Med] OK.

For the second case, in which the server treats user supplied data as potentially misleading because the user may wish to circumvent restrictions, I'll point out the Wikimedia example demonstrates that simply shifting the trust to a mid-point entity doesn't work; it has to be shifted to an entity within the trust domain of the server.

So the question isn't really "end-user system supplied data can be trusted or not", the same question applies to whomever supplies the data.

[Med] Fully agree. It would be good to have some text recording that the concern applies, including to client-supplied data.
[Med] Please add some text about this point. Thank you.
The deployment considerations text is meant to point out the engineering balance. I'm happy to add the text noted above (on latency, the end user responsibility for correct data, the PSAP carve out, and the
explicit note that the document does not treat how to obtain consent from a user so that an end system can supply data).
[Med] Ok, thanks.

I'm less happy to add language on adversarial treatment of client-supplied data. This is partly because many of the systems which use network-supplied data are based on a misunderstanding of the properties of the data being added.

[Med] I agree this may be the case for some of them, but not all.
It is partly because the adversarial relationship can extend to network-supplied data.
It is also because a fair few of them are simply security theater. If you have a specific edit you would like to propose, though, I will consider it.

Thanks again,
Ted