Dear John,

On Fri 2002.11.01, John Stracke wrote:
> V Guruprasad wrote:
>
> >- eliminates sockaddr_t handling in the user space, allowing
> >  application code to become free of IPv4/IPv6 (or for that matter
> >  raw Ethernet or ATM) dependencies;
> >
> Doesn't using a shared library for the resolver give you the same
> benefit?  It's in user space, but it's not in the app.

Yes, sockaddr_t can be eliminated using a shared library, but in order
to do that, we must replace the following steps:

  server side:
    1  hostent_t haddr = gethostbyname (char* namepath);
    2  sockaddr_t sockaddr = { haddr, int port };
    3  fd_t sock = socket (domain, type, protocol);
    4  bind (sock, sockaddr, sizeof (sockaddr));
    5  listen (sock, backlog);    /* optional */
  client side: [1,2,3] and
    7  int addrlen = sizeof (sockaddr);
    8  int rc = connect (sock, sockaddr, addrlen);

with one function call

    9  fd_t sock = sockaddressless_socket_open (
                       char* namepath, int type, int protocol, int port );

This is only one step removed from the INFS version

   10  fd_t sock = open (char* namepath, int flags, [int omode]);

if the port, type and protocol parameters are included in the namepath,
as in e.g. /com/ibm/research/www/rtsptoolkit/tcp:80.

The real motivation for going this extra step was the thesis that
emerged in the incomplete sigcomm02 submission, for which we (my
immediate ex-colleagues and I) had two important ingredients begging
for a top-down architectural story:

A) an alternative namespace-based "addressless networking" algorithm
   presented at ECUMN'00, together with a prototype implementation
   (as an MS project) demo'd at INET'01; and

B) an address/route auto-aggregation algorithm comprising a
   Huffman-like recursive address assignment scheme applied to
   Dijkstra routing trees.

(A) says that a distributed network namespace is sufficient as a
primary addressing mechanism (the proof involves a basic property of an
unshared distributed tree as a self-defining network address space), i.e.
not depending on a manually managed/coordinated numeric address space
like IP.  Thus, (A) facilitates (B) and (B) motivates (A).  However,
this also means that the end-to-end-ness or long-term stationarity of
the numeric (IP) addresses should no longer be taken for granted, and
that names should be used instead as the primary reference.

This makes it imperative to consider an alternative networking API that
uses names as addresses.  Since the ordinary notion of what constitutes
"system" and what "application/user" concerns the operating system
boundary, a system call interface of this form merited consideration.
The fact that open(2) already has almost the desired form, and caters
to hierarchical (filesystem) namespaces as well, made the INFS approach
all the more interesting to try out.

Yes, a shared library is still the way to go on many platforms,
especially on Windows, where the socket implementation itself comes
from DLLs.  The filesystem interface is more restrictive, however, and
provides a stronger test of the sufficiency of the filesystem/file
operations paradigm.

> >- reduces the number of context switches going from application
> >  to resolver and back;
> >
> Do you have data showing these context switches are a problem?  To me, it
> seems like you're optimizing something that doesn't take up that much
> time anyway--what apps spend that much CPU time on DNS lookups?

The context switching reduction was intended only to point out that
performance is likely to improve rather than worsen.  However, yes,
measuring this is one of the things on the to-do list, but I don't know
how soon I can get around to it given my current resources (being out
of a job!).

> >- reduces the overall code footprint - the filesystem name tree
> >  cache is reused, sockaddr_t handling code in applications gone.
> >
> Again, shared libs also reduce duplicate code (though not data; for that
> you do need the kernel, or a daemon).

The code reduction is *slightly* more than with just a shared library:
with an slib, duplication between apps is avoided, but there is still
at least one slib implementation of the parsing and name caching code.
With the INFS approach, even this much of the slib would be eliminated,
as the VFS already contains similar code and would be reused.

I wholeheartedly agree that this much code reduction is not all that
big a difference today, as memory and CPU cycles are quite cheap and
becoming even cheaper by the minute, but if a reduction is possible,
it's always educational to try it out.  However, eliminating sockaddr_t
and integrating with the VFS were the main motivations.

thanks,
-prasad.
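
P.S. To make the namepath idea concrete, here is a minimal user-space
sketch in C of what a call like step 9 might do with a namepath such as
/com/ibm/research/www/rtsptoolkit/tcp:80.  It is not the INFS code.
The names namepath_split and namepath_connect are hypothetical, and the
mapping is an assumption made only for illustration: every component
before the last is treated as a DNS label, most significant first, and
the last component is taken as protocol:port.  The real INFS mapping
may well differ.  The only library calls used are the standard POSIX
ones (getaddrinfo, socket, connect).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <unistd.h>

/* Hypothetical helper (not part of INFS): split a namepath into a dotted
 * host name, a protocol name and a port number.
 * ASSUMPTION: components before the last are DNS labels listed most
 * significant first; the last component has the form "proto:port".
 * Returns 0 on success, -1 on a malformed namepath or a short buffer. */
int namepath_split(const char *namepath, char *host, size_t hostlen,
                   char *proto, size_t protolen, int *port)
{
    assert(host != NULL && proto != NULL && port != NULL);
    if (namepath == NULL || namepath[0] != '/')
        return -1;

    const char *last = strrchr(namepath, '/');    /* '/' before "tcp:80" */
    if (last == namepath)                         /* no host components */
        return -1;
    const char *colon = strchr(last + 1, ':');
    if (colon == NULL || colon == last + 1)
        return -1;

    size_t plen = (size_t)(colon - (last + 1));
    if (plen >= protolen)
        return -1;
    memcpy(proto, last + 1, plen);
    proto[plen] = '\0';

    char *end;
    long p = strtol(colon + 1, &end, 10);
    if (end == colon + 1 || *end != '\0' || p < 1 || p > 65535)
        return -1;
    *port = (int)p;

    /* Walk the label components right to left, joining them with dots. */
    size_t used = 0;
    const char *stop = last;          /* the '/' that follows the label */
    while (stop > namepath) {
        const char *start = stop - 1;
        while (start > namepath && *start != '/')
            start--;                  /* ends on the '/' before the label */
        size_t len = (size_t)(stop - start - 1);
        if (len == 0 || used + len + 2 > hostlen)
            return -1;                /* empty component or buffer full */
        if (used > 0)
            host[used++] = '.';
        memcpy(host + used, start + 1, len);
        used += len;
        stop = start;
    }
    host[used] = '\0';
    return 0;
}

/* Hypothetical one-call open: the caller never sees a sockaddr, so the
 * same code serves IPv4 and IPv6.  Returns a connected fd, or -1. */
int namepath_connect(const char *namepath)
{
    char host[256], proto[16], service[16];
    int port;
    if (namepath_split(namepath, host, sizeof host,
                       proto, sizeof proto, &port) != 0)
        return -1;

    struct addrinfo hints, *res, *ai;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;                  /* IPv4 or IPv6 */
    if (strcmp(proto, "tcp") == 0)
        hints.ai_socktype = SOCK_STREAM;
    else if (strcmp(proto, "udp") == 0)
        hints.ai_socktype = SOCK_DGRAM;
    else
        return -1;

    snprintf(service, sizeof service, "%d", port);
    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    int fd = -1;
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                                /* connected */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}
```

An application would then call namepath_connect() with the namepath and
use the returned descriptor with read/write/close as usual.  This is
the shared-library variant of the idea; it still runs the parsing and
the resolver in user space, which is exactly the part the INFS approach
would move behind open(2).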