Christian,
Could you please possibly sit with Kathleen to concisely edit
this doc so it can get out the door? You both may not agree on
everything, but perhaps it would be faster than this game of ping
pong we have all been watching. I'd be happy to coordinate a
phone call if it helps.
Eliot
On 01.02.18 02:49, Christian Huitema wrote:
I am using the side by side feature to review the differences
between draft-13 and draft-15.
I am commenting here on Kathleen's message, to check whether
various points are addressed.
On 1/5/2018 6:07 AM, Kathleen Moriarty wrote:
Christian,
Thanks for your review. I'll respond inline to make sure we hit each
point raised. The next version posted may not address your points,
but the subsequent update should and I expect to have that out soon.
On Fri, Dec 8, 2017 at 11:25 AM, Kathleen Moriarty
<kathleen.moriarty.ietf@xxxxxxxxx> wrote:
Thank you, Christian and others for your reviews. Our running draft
has addressed most comments received by mid-IETF week. We are working
on the others. We also received some comments from Kyle Rose and
Brandon Williams and are working to address those as well.
We'll respond on list when we get to this set of comments.
Best regards,
Kathleen
On Mon, Dec 4, 2017 at 11:23 AM, Christian Huitema <huitema@xxxxxxxxxxx> wrote:
The high level summary is that draft-mm-wg-effect-encrypt version 13 is
significantly improved from previous versions, but that the document
would benefit a lot from additional work. I am not convinced that the
sections on data center and enterprises belong in this specific document
- they seem too high level to bring serious information to readers.
Maybe stage a separate document to survey enterprise issues in some
depth? I also feel that section 7 is way too speculative for a survey
document.
Section 7 was previously an appendix and maybe belongs back as an
appendix; we need to figure that out. There is one particular case
where a commenter saw the value of access (where he didn't in most
places), and we may need to figure out how to keep that in the body as
helpful text.
OK, waiting for the next draft then.
Since the comments on data center and enterprises are opinion, I'll
leave the text in for now. Thanks.
...
2. Network Service Provider Monitoring
Section 2, "Network Service Provider Monitoring", has been reorganized
to focus on management goals rather than simply provide a list of
existing management tools. The description of the trouble shooting tasks
in section 2.2 is useful. It makes the point that "application server
operators using increased encryption should expect to be called upon
more frequently to assist with debugging and troubleshooting", and that
could lead to some interesting work in the IETF.
There is a paragraph at the end of section 2.1.2, Troubleshooting, that
states that "the push for encryption by application providers has been
motivated by the application of the described techniques." I think that
paragraph is misplaced. As far as I can tell, the application providers
are first concerned with "content management" techniques that modify
the data stream. Any change of content has the potential to generate
bugs that are difficult for the application provider to fix. The second
concern is "ossification", when traffic characterization based on
inferred features of the application traffic leads to adverse
consequences when the application or transport protocols evolve. Neither
of those is directly relevant to the "troubleshooting" task. Maybe move
that paragraph higher in the document, e.g. in the introduction of
section 2? If not that, then maybe move it to section 2.2.2, since one
purpose of application encryption is indeed to defeat differential
treatment in the network.
I see your point, but am hesitant to move the text up since there is
no other general description text in that section heading and this
just covers applications. I want to comb through all of section 2
again after this sweep of comments, so we may do a bit more to improve
this as well within a wider scope of organizational changes.
A new paragraph got inserted at the end of section 2.1: "Vendors
must be aware that in order for operators to better troubleshoot
and manage networks with increasing amounts of encrypted traffic,
...". That paragraph appears to use normative text, "must be
enhanced", and loose keywords like "reveal cleartext network
parameters". I don't much like the idea of a "diagnostic" document
making specific recommendations about the actual remedy.
Also, "vendors" is a strange word there, as it normally implies a
customer-vendor relation. AFAIK, there is no such relation between
application service providers like Netflix or Google and the
managers of specific networks.
I get the idea that troubleshooting is a distributed task, that
application providers could help network troubleshooting by
providing better tools, and that doing so would be in everybody's
best interest. But that can be said in a much simpler way than
this additional paragraph, without implying non-existing
contractual relations, and without mandating any specific outcome
to the future discussions.
I find the discussion of load balancers in section 2.2.1 somewhat
confusing. It seems to cover three functions: load balancers in data
centers, load balancers integrated with the network, and a network
management function that tries to maintain proper connectivity to
anycast-addressed services in the presence of mobility. It might be
useful to move the discussion of "classic" load balancers to section 3,
and to discuss the problem of anycast continuity in a separate
subsection. The anycast discussion seems to assume that the network
operator alone has to deal with the supposed inadequacies of the
application providers. It seems obvious that this problem would be much
better solved by improved handling of mobility in content distribution
networks, rather than by some complex machinery in the network itself.
This might need to be stated.
I am reading draft-15, and this section is still very confusing.
For example, the text says that "Mobile operators deploy
integrated load balancers to assist with maintaining connection
state as devices migrate. With the proliferation of mobile
connected devices, there is an acute need for connection-oriented
protocols that maintain connections after a network migration by
an endpoint." Yet the definition at the beginning of the section
mentions "integrated load balancers" as "integral part of the
service provided by the server pool behind that load balancer". To
me, "integral part of the service" implies that the load balancer
and the server pool are managed by the same entity, which is
providing the application service. Yet, the discussion after that
seems to imply something rather different -- a middlebox deployed
by the network provider without coordination with the application
service provider.
I would really like to see that text rewritten.
And I stand by my original comment regarding use of anycast by CDN
providers. The current text assumes some unilateral action by the
network provider. It would be much more productive to explain why
anycast can be problematic in high mobility environments, and
suggest a future discussion on how to resolve the problem. For
example, TCPM and QUIC provide support for connection migration,
which would probably yield much better results than guesswork in
the middle of the network.
Section 2.2.2 on Deep Packet Inspection could state that it is very
often possible to classify traffic based on analysis of the encrypted
data.
And now it does. Thanks.
Audio streams, video streams and web traffic have very different
signatures, even when encrypted. At the same time, it should also note
that many application providers are actively working to defeat the
"unilateral" traffic classification enabled by these techniques,
complementing encryption with various techniques like multiplexing or
padding. We could well observe an arms race between more powerful
network based analysis and smarter application hiding.
The discussion of performance enhancing proxies in section 2.2.3 states
that "This optimization at network edges measurably improves real-time
transmission over long delay Internet paths or networks with large
capacity-variation (such as mobile/cellular networks)." This is not a
consensual statement. Operators do indeed hope that deploying such
proxies will improve performance, but independent measurements have
shown that such proxies often in fact degrade performance. The studies
that show improvement tend to be based on old network technologies, or
on ancient TCP stacks. If the authors want to keep a statement like
that, they should add references to actual measurements. At a minimum,
the text should note that many application providers disagree with the
assessment presented here, and that the development of encrypted
transports such as QUIC is largely motivated by the desire to mitigate
the negative effects of such "performance-dehancing" proxies.
Do you have references to the studies you are referring to? That
would be helpful. Thanks.
I am searching for references. But I can propose text:
Not everybody agrees that "performance enhancing proxies" actually
enhance performance in the long term. Proxies are typically
measured against a nominal version of the transport protocol, and
may fix issues encountered in that protocol version in a specific
environment. On the other hand, by "taking responsibility" for the
transport protocol, proxies eschew the future benefits of
transport protocol improvement in the endpoints. For example,
there is active research on better congestion protocols, or better
error recovery algorithms, including algorithms better adapted to
wireless networks. The resulting improvements can be quickly
deployed by means of system updates, or even application updates
in the case of transport protocols like QUIC. Experience shows
that network devices are not updated with the same frequency. An
up-to-date endpoint would have benefited from the protocol
updates, but would be stuck with the legacy performance if it must
still use the legacy "optimizer".
The discussion of caching in section 2.2.5 correctly states the tension
between network usage and application control. It could also state the
inherent privacy risk associated with network based caches: they will
provide a log of which users accessed what cached content. There is a
reference to draft-thomson-http-bc-01, but as far as I know the authors
have abandoned it, in part because they could not solve the related
privacy issues. In any case, that draft expired several months ago, and
the reference is probably not appropriate.
Other comments led to that text being updated for the next version.
The comments received took this as one of many examples, perhaps there
is a better example that could be used instead of this draft?
Alternate approaches such as blind caches <xref
target="I-D.thomson-http-bc"/> are being explored to allow caching
of encrypted content; however, they still require cooperation between
the content owners or CDNs and blind caches and fall outside the scope
of what is covered in this document. Content delegation solves a data
visibility problem with the delegated cache, the impact remains for
the use case where HTTPS encryption limits visibility to offload from
congested links.
This was addressed in a separate thread, thanks.
But now, we get additional text in section 2.2.6 on content
compression, and a new section 2.2.7 on service function chaining.
The text in 2.2.6 basically says that when the network sees the
same segments used on multiple connections, it can compress them
efficiently. That's plausible, but just like in section 2.2.5 we
have to note that there is an explicit trade-off between
compression and privacy. The "segments" that could be compressed
in 2.2.6 are pretty much the same as the segments that could be
cached in 2.2.5. Exposing them to the network for compression has
pretty much the same effect on privacy as exposing them to the
network for caching. The same caveats should apply.
The new section 2.2.7 describes "service function chaining", which
is an architecture for distributed implementation of services. We
should separate there the management function proper, which would
distribute functions already provided by the network provider, and
a future expansion of the network that would provide services
currently offered by application providers. Encryption actually
marks a nice delimitation between these two notions. Network
providers may well convince some application providers to make
greater use of their "service functions", but that's a business
negotiation, not an architecture discussion. If and when
application providers decide to subscribe to the function provided
by the network, these application providers would devise ways to
expose the required data. Or then, maybe not.
I would suggest writing the section 2.2.7 in a rather different
way. Section 2.2 already discusses various functions that could be
provided by the network, or by the application providers, or by
third parties like CDN. We can note the trend to organize these
functions under the "service function chaining" architecture, and
the general limitation that if the functions are impacted by
encryption in the current architecture, they will remain similarly
impacted after the network provider adopts a "service function
chaining" architecture. And leave it at that.
In section 2.3.3, Application Layer Gateways, I was wishing it would say
something about IPv6. But then of course most IPv6 deployments today
involve a form of NAT64...
If you have text to suggest, we'd be happy to incorporate it.
"The deployment of IPv6 may well reduce the need for NAT, and the
corresponding requirement for Application Layer Gateways."
Section 2.3.4 documents the "HTTP Header Insertion" technique. The
relation between that technique and "Network Service Provider
Monitoring" is unclear -- header insertion is certainly not a network
monitoring tool. It is also a highly controversial tool, as documented
for example in
https://www.theverge.com/2016/3/7/11173010/verizon-supercookie-fine-1-3-million-fcc.
I wonder whether it is appropriate to describe this at all in a document
dedicated to network management, and my simple suggestion would be to
just remove that section altogether. Failing that, the text needs to be
modified to note the controversial nature of the process, and its impact
on privacy. The authors could also note that the function could be
trivially implemented in the client's browsers if it was really needed
and approved by the users. There is no technical need to have anything
like that "in the network".
I see your point, but operators wanted this text in, so how about a
modification to the last sentence to try to balance it out more? As I
read the text, it describes invasive uses, so I thought that it was
clearly something that was not without controversy.
When HTTP connections are encrypted to protect users' privacy,
mobile network service providers cannot insert headers to
accomplish the, sometimes considered controversial, functions
above.
s/sometimes/often/?
3. Encryption in Hosting SP Environments
After examining network monitoring in section 2, the draft continues
with an analysis of "Hosting SP Environments" in section 3, and section
4 describes "Encryption for Enterprises". I assume that the initials SP
stand for "Service Provider" -- spelling it out would not hurt. I really
wonder whether these sections belong in the document at all, rather than
being published in separate documents. "Hosting Service Provider
Environments" appears to be a subset of the general "Data Center"
problem. It is true that some network providers also provide data center
services for their customers, but these network providers represent only
a small fraction of the service hosting industry. Similarly, some
network providers provide services to enterprises, but there is a wide
variety of enterprises. It is hard to believe that the authors of an
individual draft have authority to speak at the same time about network
services, data centers, and enterprises. In my opinion, it would be
simpler to just excise sections 3 and 4 from this draft, and use the
content as input for specific drafts describing issues in data centers
and enterprises.
I'd prefer to keep these sections separate and think others feel the
same way. Hosted data centers can occur in several layers as well. I
work for a very large company that provides hosted infrastructure as a
service, where others offer hybrid cloud options and other outsourcing
options (application service providers, etc.) on top of this layer.
It seems you may be thinking of this as a higher level of data center
offerings than what is deployed in industry. I'd prefer not to get
into listing them out as we would certainly miss some and it could be
a confusing list that doesn't help the point of the draft. 4 may not
be just for enterprise outsourcing options as in the IAAS example
provided.
I think these sections are weak, and do not bring much to the
readers. But I said that already.
In any case, I am puzzled by the reference to Data Loss Prevention (DLP)
in the introduction of section 3.1. Data exfiltration is indeed a
security issue, but I knew it primarily as an issue in enterprise
networks. It does indeed become an issue in data centers when an
enterprise application is hosted outside the data center, but it is a
bit strange to see the first reference there. I already suggested to
move sections 3 and 4 out to a different document. Failing that, I would
suggest reversing the order of sections 3 and 4, i.e., discuss enterprise
issues first and data center issues next.
I think it was contributed there as a result of offerings to
customers. Before removing it, we'd have to get confirmation that it
isn't a service provided to many enterprises with outsourced
solutions. I am pretty sure I had confirmed this point previously as
RSA offers multiple DLP solutions at several layers. With a quick
glance, RSA has cloud and data center buzz words in their product
offering descriptions in addition to network and endpoint.
The new text is somewhat better. I wonder whether DLP would merit
a separate subsection.
Of course the really bad opponents started using encryption long
ago, well before the IETF's push for privacy. This is not exactly
a new problem. Also, I would expect them to use a variety of
techniques to disguise and hide their data streams. Hacking
routers comes to mind.
The discussion of Customer Access Monitoring in section 3.1.1 is a bit
strange. Most applications control customer access based on the customer
identity, not based on the IP addresses of the customer -- the whole
point of the "cloud" is that applications can be accessed from anywhere.
Some applications do perform additional checks, mainly as a defense
against stolen credentials, and would attempt to block access if the
network location does not look plausible for this specific user. These
are useful techniques, but the relation with encryption of data is
somewhat thin. It seems to reinforce my point that data center issues
would best be discussed in a separate document.
There are certainly access restrictions based on IP and protocol
information, for which a 5-tuple and a 2-tuple are often adequate. If you
are an administrator from a company who outsourced for infrastructure
and some application support, management access of your customer
application might be a simple example where this is still used. Sure,
users are mobile, but they could VPN to their company network and
connect from an approved IP. I think some of the use cases described
are examples that are fine and encryption is a very small shift. In
some of the examples, they serve as possible ways forward for the
other use cases that haven't adapted yet. Boiling down the details
and showing that log improvements and transaction monitoring
improvements in the application or changes in protocols could ease
this transition is important IMO. It's been a helpful process for
some of the participants and I hope it helps for future protocol
development to engage in these tough discussions understanding the
perspectives on either end of this debate better.
The remainder of section 3 appears to be a high level tutorial on the
operation of data centers. It is not clear that there is a particular
problem with encryption there. Indeed, I note that a lot of operators of
big data centers, such as for example AWS, Azure or Google, have
voluntarily pushed for increased use of encryption. I don't learn much
by reading these sections, and I question whether they belong in the draft.
I'll read through it again in a final sweep. We haven't gotten
complaints about this text, but I'll note your concern in my sweep.
In 3.2.1, you could be a bit more explicit about application
logging, and application manageability in general. Also, maybe not
say that "Application logging currently lacks detail..." Is that true of every
application? In general, I agree that it would be a good thing
to delineate the type of information that data center managers
would like to find in logs. Of course, logging too much has its
own issues, such as exposures to breaches of privacy or to
lawsuits. It seems we need to have a robust discussion leading
to some kind of best practice document.
4. Encryption for Enterprises
The discussion on encryption in enterprises would probably benefit from
input by a variety of enterprise network managers. I found the
discussion somewhat hard to read. It seems that the authors want to
tackle three issues: the enterprise as a target for security attacks,
the enterprise as an application provider, and the enterprise as a
network provider. These are discussed in sections 4.1.1, 4.1.2, and
4.1.3.
The description of attacks in 4.1.1 is somewhat high level. It starts
from the statement that "A significant portion of malware hides its
activity within TLS or other encrypted protocols" to draw a requirement
to monitor encrypted traffic, when in practice there are many other
monitoring points, from endpoint monitoring to data base activity logs
to logs at network authentication servers -- as stated in the last
paragraph of the section.
The monitoring of application performance in enterprises appears
strangely focused on the "IPv6 Destination Option Header (DOH)
implementation of Performance and Diagnostic Metrics (PDM)". I
understand that most big applications solve their monitoring need by
implementing some form of telemetry, which is not affected at all by
encryption, yet I see no mention of this telemetry approach in the
discussion.
Different data centers operate with different architectures and
approaches. We'd be happy to get text from other network operators.
I'll also do a sweep of this section and see if I can update it a bit
more to help along these points. It may come in an update following
the next one.
In 4.1.1 there is a discussion relative to BYOD and exfiltration.
Seriously? I mean, if I want to exfiltrate data with my phone, I
can simply copy the data on the phone, and then wait until I am
outside the enterprise network to exfiltrate it. The solutions
that I have seen working involve personalized watermarks in
downloaded content and post facto retribution against the leakers.
That is, solutions designed specifically for the problem, rather
than massive fishing expeditions through encrypted data...
I had a hard time reading section 4.1.3, Enterprise Network Diagnostics
and Troubleshooting. It seems to cover a variety of techniques meant to
monitor application services without actually instrumenting the
application, and as such is not very convincing.
Yes, the problem conveyed to us is that there are so many
applications that do a poor job of providing troubleshooting data
that it is hard to get this to an improved state. Since this was
discussed a few years ago, I started putting comments on drafts to
request text on logging to help in the long term. The organizations
with these issues have also started reaching out to offending
application developers, but it is a long road. It is recognized that
it's a better long term option. I'll see what we can do with the text
on this front.
OK. Why don't you just say so? "Proper monitoring of application
behavior requires proper instrumentation of the application. When
applications do not have an adequate instrumentation, managers
often resort to network-based monitoring. This is problematic when
the traffic is encrypted."
My concern is that if an application needs to be updated to allow
monitoring, it is much more beneficial in the long term to do that
by better logging, rather than by somehow weakening the
encryption. We should find a way to say that.
The meat of section 4 appears to be in section 4.2, which is covering
the issue of data loss prevention, and generally detection of data
exfiltration. Again, this is an issue that would be worth a specialized
draft.
5. Security Monitoring for Specific Attack Types
Looks fine, and this review is already very long.
Ack, thanks.
6. Application-based Flow Information Visible to a Network
Do we need this section at all? It seems that most of the information
could be captured by adding a small subsection to 2.1. Passive Monitoring.
IPFIX was added at the request of Benoit to more fully cover network
management protocols in the document. Brian Trammell provided that
text, so the next version will have an improved section 6.
7. Impact on Mobility Network Optimizations and New Services
This section appears to be a mix of replication of statements already
made in section 2, and some speculation on the effect of transport
header encryption, such as deployed in Web RTC (SCTP over DTLS) or
planned in QUIC. There are active discussions in the QUIC WG to provide
alternatives to transport header inspection for RTT monitoring, and
possibly also for packet loss monitoring.
Contrary to the rest of the document, this section seems speculative
in nature. It discusses the possible effects of transport header
encryption on the possible deployment of new services, which do not
appear to be based on any IETF standard. I think the document would be
stronger if some of the content of section 7 was moved to the
appropriate part of section 2, and if the speculative statements were
published as a separate document.
We need to look at section 7 closer and will do so either in the next
version or one that follows shortly thereafter.
It is much shorter now, and that's good.
8. Response to Increased Encryption and Looking Forward
Looks reasonable.
Thanks for your review and helpful comments! Would you be okay with
an ack for your review and comments in the draft?
OK, as long as I am not presented as endorsing the weakening of
encryption...
-- Christian Huitema