I am using the side-by-side feature to review the differences between draft-13 and draft-15. I am commenting here on Kathleen's message, to check whether various points are addressed.

On 1/5/2018 6:07 AM, Kathleen Moriarty wrote:
OK, waiting for the next draft then.

Christian, Thanks for your review. I'll respond inline to make sure we hit each point raised. The next version posted may not address your points, but the subsequent update should and I expect to have that out soon.

On Fri, Dec 8, 2017 at 11:25 AM, Kathleen Moriarty <kathleen.moriarty.ietf@xxxxxxxxx> wrote:

Thank you, Christian and others for your reviews. Our running draft has addressed most comments received by mid-IETF week. We are working on the others. We also received some comments from Kyle Rose and Brandon Williams and are working to address those as well. We'll respond on list when we get to this set of comments.

Best regards,
Kathleen

On Mon, Dec 4, 2017 at 11:23 AM, Christian Huitema <huitema@xxxxxxxxxxx> wrote:

The high level summary is that draft-mm-wg-effect-encrypt version 13 is significantly improved from previous versions, but that the document would benefit a lot from additional work. I am not convinced that the sections on data center and enterprises belong in this specific document - they seem too high level to bring serious information to readers. Maybe stage a separate document to survey enterprise issues in some depth? I also feel that section 7 is way too speculative for a survey document.

Section 7 was previously an appendix and maybe belongs back as an appendix; we need to figure that out. There is one particular case where a commenter saw the value of access (where he didn't in most places), and we may need to figure out how to keep that in the body as helpful text. Since the comments on data center and enterprises are opinion, I'll leave the text in for now. Thanks....

2. Network Service Provider Monitoring

Section 2, "Network Service Provider Monitoring", has been reorganized to focus on management goals rather than simply provide a list of existing management tools. The description of the troubleshooting tasks in section 2.2 is useful.
It makes the point that "application server operators using increased encryption should expect to be called upon more frequently to assist with debugging and troubleshooting", and that could lead to some interesting work in the IETF.

There is a paragraph at the end of section 2.1.2, Troubleshooting, that states that "the push for encryption by application providers has been motivated by the application of the described techniques." I think that paragraph is misplaced. As far as I can tell, the application providers are first concerned with "content management" techniques that modify the data stream. Any change of content has the potential to generate bugs that are difficult for the application provider to fix. The second concern is "ossification", when traffic characterization based on inferred features of the application traffic leads to adverse consequences when the application or transport protocols evolve. Neither of those is directly relevant to the "troubleshooting" task. Maybe move that paragraph higher in the document, e.g. in the introduction of section 2? If not that, then maybe move it to section 2.2.2, since one purpose of application encryption is indeed to defeat differential treatment in the network.

I see your point, but am hesitant to move the text up since there is no other general description text in that section heading and this just covers applications. I want to comb through all of section 2 again after this sweep of comments, so we may do a bit more to improve this as well within a wider scope of organizational changes.

A new paragraph got inserted at the end of section 2.1: "Vendors must be aware that in order for operators to better troubleshoot and manage networks with increasing amounts of encrypted traffic, ...". That paragraph appears to use normative text, "must be enhanced", and loose keywords like "reveal cleartext network parameters".
I don't much like the idea of a "diagnostic" document making specific recommendations about the actual remedy. Also, "vendors" is a strange word there, as it normally implies a customer-vendor relation. AFAIK, there is no such relation between application service providers like Netflix or Google and the managers of specific networks. I get the idea that troubleshooting is a distributed task, that application providers could help network troubleshooting by providing better tools, and that doing so would be in everybody's best interest. But that can be said in a much simpler way than this additional paragraph, without implying non-existing contractual relations, and without mandating any specific outcome to the future discussions.

I am reading draft-15, and this section is still very confusing. For example, the text says that "Mobile operators deploy integrated load balancers to assist with maintaining connection state as devices migrate. With the proliferation of mobile connected devices, there is an acute need for connection-oriented protocols that maintain connections after a network migration by an endpoint." Yet the definition at the beginning of the section mentions "integrated load balancers" as "integral part of the service provided by the server pool behind that load balancer". To me, "integral part of the service" implies that the load balancer and the server pool are managed by the same entity, which is providing the application service. Yet, the discussion after that seems to imply something rather different -- a middlebox deployed by the network provider without coordination with the application service provider.

I find the discussion of load balancers in section 2.2.1 somewhat confusing. It seems to cover three functions: load balancers in data centers, load balancers integrated with the network, and a network management function that tries to maintain proper connectivity to anycast-addressed services in the presence of mobility.
It might be useful to move the discussion of "classic" load balancers to section 3, and to discuss the problem of anycast continuity in a separate subsection. The anycast discussion seems to assume that the network operator alone has to deal with the supposed inadequacies of the application providers. It seems obvious that this problem would be much better solved by improved handling of mobility in content distribution networks, rather than by some complex machinery in the network itself. This might need to be stated.

I would really like to see that text rewritten. And I stand by my original comment regarding use of anycast by CDN providers. The current text assumes some unilateral action by the network provider. It would be much more productive to explain why anycast can be problematic in high mobility environments, and suggest a future discussion on how to resolve the problem. For example, TCPM and QUIC provide support for connection migration, which would probably yield much better results than guesswork in the middle of the network.

And now it does. Thanks.

Section 2.2.2 on Deep Packet Inspection could state that it is very often possible to classify traffic based on analysis of the encrypted data. Audio streams, video streams and web traffic have very different signatures, even when encrypted. At the same time, it should also note that many application providers are actively working to defeat the "unilateral" traffic classification enabled by these techniques, complementing encryption with various techniques like multiplexing or padding. We could well observe an arms race between more powerful network based analysis and smarter application hiding.

The discussion of performance enhancing proxies in section 2.2.3 states that "This optimization at network edges measurably improves real-time transmission over long delay Internet paths or networks with large capacity-variation (such as mobile/cellular networks)." This is not a consensus statement.
Operators do indeed hope that deploying such proxies will improve performance, but independent measurements have shown that such proxies often in fact degrade performance. The studies that show improvement tend to be based on old network technologies, or on ancient TCP stacks. If the authors want to keep a statement like that, they should add references to actual measurements. At a minimum, the text should note that many application providers disagree with the assessment presented here, and that the development of encrypted transports such as QUIC is largely motivated by the desire to mitigate the negative effects of such "performance-dehancing" proxies.

Do you have references to the studies you are referring to? That would be helpful. Thanks.

I am searching for references. But I can propose text:

Not everybody agrees that "performance enhancing proxies" actually enhance performance in the long term. Proxies are typically measured against a nominal version of the transport protocol, and may fix issues encountered in that protocol version in a specific environment. On the other hand, by "taking responsibility" for the transport protocol, proxies eschew the future benefits of transport protocol improvement in the endpoints. For example, there is active research on better congestion protocols, or better error recovery algorithms, including algorithms better adapted to wireless networks. The resulting improvements can be quickly deployed by means of system updates, or even application updates in the case of transport protocols like QUIC. Experience shows that network devices are not updated with the same frequency. An up-to-date endpoint would have benefited from the protocol updates, but would be stuck with the legacy performance if it must still use the legacy "optimizer".

The discussion of caching in section 2.2.5 correctly states the tension between network usage and application control.
It could also state the inherent privacy risk associated with network based caches: they will provide a log of which users accessed what cached content. There is a reference to draft-thomson-http-bc-01, but as far as I know the authors have abandoned it, in part because they could not solve the related privacy issues. In any case, that draft expired several months ago, and the reference is probably not appropriate.

Other comments led to that text being updated for the next version. The comments received took this as one of many examples; perhaps there is a better example that could be used instead of this draft?

Alternate approaches such as blind caches <xref target="I-D.thomson-http-bc"/> are being explored to allow caching of encrypted content; however, they still require cooperation between the content owners or CDNs and blind caches and fall outside the scope of what is covered in this document. Content delegation solves a data visibility problem with the delegated cache, the impact remains for the use case where HTTPS encryption limits visibility to offload from congested links.

This was addressed in a separate thread, thanks. But now, we get additional text in section 2.2.6 on content compression, and a new section 2.2.7 on service function chaining. The text in 2.2.6 basically says that when the network sees the same segments used on multiple connections, it can compress them efficiently. That's plausible, but just like in section 2.2.5 we have to note that there is an explicit trade-off between compression and privacy. The "segments" that could be compressed in 2.2.6 are pretty much the same as the segments that could be cached in 2.2.5. Exposing them to the network for compression has pretty much the same effect on privacy as exposing them to the network for caching. The same caveats should apply.

The new section 2.2.7 describes "service function chaining", which is an architecture for distributed implementation of services.
We should separate there the management function proper, which would distribute functions already provided by the network provider, and a future expansion of the network that would provide services currently offered by application providers. Encryption actually marks a nice delimitation between these two notions. Network providers may well convince some application providers to make greater use of their "service functions", but that's a business negotiation, not an architecture discussion. If and when application providers decide to subscribe to the function provided by the network, these application providers would devise ways to expose the required data. Or then, maybe not.

I would suggest writing the section 2.2.7 in a rather different way. Section 2.2 already discusses various functions that could be provided by the network, or by the application providers, or by third parties like CDNs. We can note the trend to organize these functions under the "service function chaining" architecture, and the general limitation that if the functions are impacted by encryption in the current architecture, they will remain similarly impacted after the network provider adopts a "service function chaining" architecture. And leave it at that.

"The deployment of IPv6 may well reduce the need for NAT, and the corresponding requirement for Application Layer Gateways."

In section 2.3.3, Application Layer Gateways, I was wishing it would say something about IPv6. But then of course most IPv6 deployments today involve a form of NAT64...

If you have text to suggest, we'd be happy to incorporate it.

s/sometimes/often/?

Section 2.3.4 documents the "HTTP Header Insertion" technique. The relation between that technique and "Network Service Provider Monitoring" is unclear -- header insertion is certainly not a network monitoring tool. It is also a highly controversial tool, as documented for example in https://www.theverge.com/2016/3/7/11173010/verizon-supercookie-fine-1-3-million-fcc.
I wonder whether it is appropriate to describe this at all in a document dedicated to network management, and my simple suggestion would be to just remove that section altogether. Failing that, the text needs to be modified to note the controversial nature of the process, and its impact on privacy. The authors could also note that the function could be trivially implemented in the client's browsers if it was really needed and approved by the users. There is no technical need to have anything like that "in the network".

I see your point, but operators wanted this text in, so how about a modification to the last sentence to try to balance it out more? As I read the text, it describes invasive uses, so I thought that it was clearly something that was not without controversy.

When HTTP connections are encrypted to protect users' privacy, mobile network service providers cannot insert headers to accomplish the, sometimes considered controversial, functions above.

I think these sections are weak, and do not bring much to the readers. But I said that already.

3. Encryption in Hosting SP Environments

After examining network monitoring in section 2, the draft continues with an analysis of "Hosting SP Environments" in section 3, and section 4 describes "Encryption for Enterprises". I assume that the initials SP stand for "Service Provider" -- spelling it out would not hurt. I really wonder whether these sections belong in the document at all, rather than being published in separate documents. "Hosting Service Provider Environments" appears to be a subset of the general "Data Center" problem. It is true that some network providers also provide data center services for their customers, but these network providers represent only a small fraction of the service hosting industry. Similarly, some network providers provide services to enterprises, but there is a wide variety of enterprises.
It is hard to believe that the authors of an individual draft have authority to speak at the same time about network services, data centers, and enterprises. In my opinion, it would be simpler to just excise sections 3 and 4 from this draft, and use the content as input for specific drafts describing issues in data centers and enterprises.

I'd prefer to keep these sections separate and think others feel the same way. Hosted data centers can occur in several layers as well. I work for a very large company that provides hosted infrastructure as a service, where others offer hybrid cloud options and other outsourcing options (application service providers, etc.) on top of this layer. It seems you may be thinking of this as a higher level of data center offerings than what is deployed in industry. I'd prefer not to get into listing them out as we would certainly miss some and it could be a confusing list that doesn't help the point of the draft. 4 may not be just for enterprise outsourcing options as in the IAAS example provided.

In any case, I am puzzled by the reference to Data Loss Prevention (DLP) in the introduction of section 3.1. Data exfiltration is indeed a security issue, but I knew it primarily as an issue in enterprise networks. It does indeed become an issue in data centers when an enterprise application is hosted outside the data center, but it is a bit strange to see the first reference there. I already suggested moving sections 3 and 4 out to a different document. Failing that, I would suggest reversing the order of sections 3 and 4, i.e., discuss enterprise issues first and data center issues next.

I think it was contributed there as a result of offerings to customers. Before removing it, we'd have to get confirmation that it isn't a service provided to many enterprises with outsourced solutions. I am pretty sure I had confirmed this point previously as RSA offers multiple DLP solutions at several layers.
With a quick glance, RSA has cloud and data center buzzwords in their product offering descriptions in addition to network and endpoint.

The new text is somewhat better. I wonder whether DLP would merit a separate subsection. Of course the really bad opponents started using encryption long ago, well before the IETF's push for privacy. This is not exactly a new problem. Also, I would expect them to use a variety of techniques to disguise and hide their data streams. Hacking routers comes to mind.

The discussion of Customer Access Monitoring in section 3.1.1 is a bit strange. Most applications control customer access based on the customer identity, not based on the IP addresses of the customer -- the whole point of the "cloud" is that applications can be accessed from anywhere. Some applications do perform additional checks, mainly as a defense against stolen credentials, and would attempt to block access if the network location does not look plausible for this specific user. These are useful techniques, but the relation with encryption of data is somewhat thin. It seems to reinforce my point that data center issues would best be discussed in a separate document.

There are certainly access restrictions based on IP and protocol information, for which a 5-tuple and a 2-tuple are often adequate. If you are an administrator from a company that outsourced for infrastructure and some application support, management access of your customer application might be a simple example where this is still used. Sure, users are mobile, but they could VPN to their company network and connect from an approved IP. I think some of the use cases described are examples that are fine and encryption is a very small shift. In some of the examples, they serve as possible ways forward for the other use cases that haven't adapted yet.
Boiling down the details and showing that log improvements and transaction monitoring improvements in the application or changes in protocols could ease this transition is important IMO. It's been a helpful process for some of the participants and I hope it helps for future protocol development to engage in these tough discussions understanding the perspectives on either end of this debate better.

The remainder of section 3 appears to be a high level tutorial on the operation of data centers. It is not clear that there is a particular problem with encryption there. Indeed, I note that a lot of operators of big data centers, such as for example AWS, Azure or Google, have voluntarily pushed for increased use of encryption. I don't learn much by reading these sections, and I question whether they belong in the draft.

I'll read through it again in a final sweep. We haven't gotten complaints about this text, but I'll note your concern in my sweep.

In 3.2.1, you could be a bit more explicit about application logging, and application manageability in general. Also, maybe not say that "Application logging currently lacks detail..." Is that true of every application? In general, I agree that it would be a good thing to delineate the type of information that data center managers would like to find in logs. Of course, logging too much has its own issues, such as exposures to breaches of privacy or to lawsuits. It seems we need to have a robust discussion leading to some kind of best practice document.

4. Encryption for Enterprises

The discussion on encryption in enterprises would probably benefit from input by a variety of enterprise network managers. I found the discussion somewhat hard to read. It seems that the authors want to tackle three issues: the enterprise as a target for security attacks, the enterprise as an application provider, and the enterprise as a network provider. These are discussed in sections 4.1.1, 4.1.2, and 4.1.3.
The description of attacks in 4.1.1 is somewhat high level. It starts from the statement that "A significant portion of malware hides its activity within TLS or other encrypted protocols" to draw a requirement to monitor encrypted traffic, when in practice there are many other monitoring points, from endpoint monitoring to database activity logs to logs at network authentication servers -- as stated in the last paragraph of the section. The monitoring of application performance in enterprises appears strangely focused on the "IPv6 Destination Option Header (DOH) implementation of Performance and Diagnostic Metrics (PDM)". I understand that most big applications solve their monitoring need by implementing some form of telemetry, which is not affected at all by encryption, yet I see no mention of this telemetry approach in the discussion.

Different data centers operate with different architectures and approaches. We'd be happy to get text from other network operators. I'll also do a sweep of this section and see if I can update it a bit more to help along these points. It may come in an update following the next one.

In 4.1.1 there is a discussion relative to BYOD and exfiltration. Seriously? I mean, if I want to exfiltrate data with my phone, I can simply copy the data on the phone, and then wait until I am outside the enterprise network to exfiltrate it. The solutions that I have seen working involve personalized watermarks in downloaded content and post facto retribution against the leakers. That is, solutions designed specifically for the problem, rather than massive fishing expeditions through encrypted data...

I had a hard time reading section 4.1.3, Enterprise Network Diagnostics and Troubleshooting.
It seems to cover a variety of techniques meant to monitor application services without actually instrumenting the application, and as such is not very convincing.

Yes, the problem conveyed to us is that there are way too many applications that do a poor job of providing troubleshooting data, and that it is hard to get this to an improved state. Since this was discussed a few years ago, I started putting comments on drafts to request text on logging to help in the long term. The organizations with these issues have also started reaching out to offending application developers, but it is a long road. It is recognized that it's a better long term option. I'll see what we can do with the text on this front.

OK. Why don't you just say so? "Proper monitoring of application behavior requires proper instrumentation of the application. When applications do not have an adequate instrumentation, managers often resort to network-based monitoring. This is problematic when the traffic is encrypted."

My concern is that if an application needs to be updated to allow monitoring, it is much more beneficial in the long term to do that by better logging, rather than by somehow weakening the encryption. We should find a way to say that.

The meat of section 4 appears to be in section 4.2, which is covering the issue of data loss prevention, and generally detection of data exfiltration. Again, this is an issue that would be worth a specialized draft.

5. Security Monitoring for Specific Attack Types

Looks fine, and this review is already very long.

Ack, thanks.

6. Application-based Flow Information Visible to a Network

Do we need this section at all? It seems that most of the information could be captured by adding a small subsection to 2.1. Passive Monitoring.

IPFIX was added at the request of Benoit to more fully cover network management protocols in the document. Brian Trammell provided that text, so the next version will have an improved section 6.

7. Impact on Mobility Network Optimizations and New Services

This section appears to be a mix of replication of statements already made in section 2, and some speculation on the effect of transport header encryption, such as deployed in WebRTC (SCTP over DTLS) or planned in QUIC. There are active discussions in the QUIC WG to provide alternatives to transport header inspection for RTT monitoring, and possibly also for packet loss monitoring. Contrary to the rest of the document, this section seems speculative in nature. It discusses the possible effects of transport header encryption on the possible deployment of new services, which do not appear to be based on any IETF standard. I think the document would be stronger if some of the content of section 7 was moved to the appropriate part of section 2, and if the speculative statements were published as a separate document.

We need to look at section 7 closer and will do so either in the next version or one that follows shortly thereafter.

It is much shorter now, and that's good.

OK, as long as I am not presented as endorsing the weakening of encryption...

8. Response to Increased Encryption and Looking Forward

Looks reasonable.

Thanks for your review and helpful comments! Would you be okay with an ack for your review and comments in the draft?

-- Christian Huitema