I am asking here as there seems to be a disagreement in HTTP land and DNS land.
Here are the constraints as I see them:
0) For any discovery mechanism to be viable, it must work in 100% of cases. That includes IPv4, IPv6, and either of them behind NAT.
1) Attempting to introduce new DNS records is a slow process. For practical purposes, any discovery mechanism that requires more than SRV + TXT is not going to be widely used.
2) The Apps area seems to have settled on a combination of SRV+TXT as the basis for discovery. But right now the way these are used is left to individual protocol designers to decide, which is another way of saying 'we don't have a standard'.
3) The DNS query architecture as deployed works best if the server can anticipate further requests. So a system that uses only SRV+TXT allows for a lot more optimization than one using a large number of records.
4) There are not enough TCP ports to support all the services one would want. Further, keeping ports open incurs costs. Pretty much the only functionality from HTTP that Web Services make use of is the use of the URL stem to effectively create more ports: a hundred different Web services can all share port 80.
5) The SRV record does not specify the URL stem, though. This means that it either has to be specified in some other DNS record (URI, or a TXT path attribute) or has to follow a convention (i.e. .well-known).
6) Sometimes SRV records don't get through, and so any robust service has to have a strategy for dealing with that situation.
7) If we are going to get to a robust defense against traffic analysis, it has to be possible to secure the initial TLS handshake, i.e. before SNI is sent. This in turn means that it must be possible to pull information out of that exchange and into the DNS. Right now we don't know what that information is, but this was not a use case considered by DANE.
8) We are probably going to want to transition Web Services to 'something like QUIC' in the near future. Web Services really don't need a lot more than a TCP stream; most of HTTP just gets in the way. But the multiplexing features in QUIC could be very useful.
Right now we have different ideas on how this should work in the HTTP space and the DNS space. And this appears to be fine with the two groups, as they don't need to talk to each other. But it really isn't possible to build real systems unless you offend the purists in at least one camp. I think we should do better and offend both.
So here is my proposal for discovery of a service with IANA protocol label 'fred'.
First, the service description records. This is a TXT record setting policy for all instances of the fred service and a set of SRV service advertisements:
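Something along these lines (the record contents and the TXT attribute syntax are purely illustrative; nothing here is settled):

```
; policy for all instances of the fred service
_fred._tcp.example.com.  TXT  "version=1.0 tls=required"

; SRV service advertisements
_fred._tcp.example.com.  SRV  10 1 80 host1.example.com.
_fred._tcp.example.com.  SRV  20 1 80 host2.example.com.
```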
There is also a set of round robin A records for systems behind legacy NAT. You could do AAAA as well, but these probably aren't needed, as it is unlikely that a router blocking SRV will pass AAAA.
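For example (placeholder names and documentation-range addresses):

```
; round robin fallback for clients that cannot retrieve SRV
fred.example.com.  A  192.0.2.1
fred.example.com.  A  192.0.2.2
```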
And finally, we have the host description entries:
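For instance (again, the naming convention and the attributes are my own illustration, not a defined syntax):

```
; host level description entries
host1.example.com.             A    192.0.2.1
_fred._tcp.host1.example.com.  TXT  "path=/fred/"

host2.example.com.             A    192.0.2.2
_fred._tcp.host2.example.com.  TXT  "tls=optional"
```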
So here we have some host-level service description tags which obviously override the ones specified at the service level, with the proviso that a client might well abort if the service-level description suggests there is no acceptable host. The path descriptor allows the use of the well-known service to be avoided on host1; host2 uses the default.
In the normal run of things, a DNS server would recognize that a request for _fred._tcp.example.com SRV was likely the start of a request chain and send all the records describing the service in a single bundle. This should usually fit in a single UDP response.
This approach gives us two levers for setting policy for the service: we can define policy for all service instances, or provide granular per-host information.
The bit that I have not got nailed down is what the HTTP URLs should be after service discovery is performed. My view is that they should be these:
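That is, URLs that name the hosts the SRV records point to (these exact forms are my reconstruction of the option under discussion, using the hypothetical names from above):

```
http://host1.example.com/fred/
http://host2.example.com/.well-known/fred/
```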
The problem becomes even more apparent if the redirects are to host1.cloudly.com and host2.cloudly.com, where cloudly is a cloud service provider. So the alternative is to do this:
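That is, keep the service's own name in the URL and leave host selection to the discovery layer (again a hypothetical reconstruction):

```
http://example.com/.well-known/fred/
```

with the client expected to connect to the address of host1 or host2 as selected via the SRV records, rather than resolving example.com itself.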
The problem is that this does not work well when trying to use this strategy with existing HTTP clients built into scripting languages. Instead of just writing a module that does the SRV lookup and spits out the URLs and attributes, we now need to rewrite our client so that it will hit the right DNS address.
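A minimal sketch of the kind of module I have in mind, assuming the SRV and TXT data have already been retrieved (the record values, the 'path' attribute, and the default .well-known stem are all placeholders; a real module would perform the DNS queries itself):

```python
def service_urls(srv_records, txt_attrs, scheme="http"):
    """Turn (priority, weight, port, target) SRV tuples plus per-host
    TXT attributes into an ordered list of candidate service URLs."""
    urls = []
    # Sorting the tuples orders candidates by SRV priority first.
    for priority, weight, port, target in sorted(srv_records):
        host = target.rstrip(".")
        # A host-level path attribute overrides the .well-known convention.
        path = txt_attrs.get(host, {}).get("path", "/.well-known/fred/")
        default_port = 80 if scheme == "http" else 443
        netloc = host if port == default_port else f"{host}:{port}"
        urls.append(f"{scheme}://{netloc}{path}")
    return urls

# Hypothetical records, as if fetched for _fred._tcp.example.com.
srv = [(20, 1, 8080, "host2.example.com."), (10, 1, 80, "host1.example.com.")]
txt = {"host1.example.com": {"path": "/fred/"}}
print(service_urls(srv, txt))
# → ['http://host1.example.com/fred/',
#    'http://host2.example.com:8080/.well-known/fred/']
```

The point is that all the discovery logic lives in this one module; the URLs it emits can be handed to any stock HTTP client unchanged, which is what the second approach cannot offer.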
Given that most libraries seem to have hooks to allow a client to make its own TLS certificate path validation choices, I am very strongly in favor of the first approach. But I am willing to be persuaded otherwise.
Comments?