Re: [NLM] 2.6.27.14 breakage when grace period expires

Chuck Lever <Chuck.Lever@xxxxxxxxxx> · Thu, 12 Feb 2009 16:43:46 -0500

On Feb 12, 2009, at 3:54 PM, Trond Myklebust wrote:
On Thu, 2009-02-12 at 15:43 -0500, Chuck Lever wrote:
On Feb 12, 2009, at 3:27 PM, Trond Myklebust wrote:
On Thu, 2009-02-12 at 15:11 -0500, Chuck Lever wrote:
On Feb 12, 2009, at 2:43 PM, Trond Myklebust wrote:
On Thu, 2009-02-12 at 14:35 -0500, Chuck Lever wrote:
I wasn't sure exactly where the compared addresses came from.  I had
assumed that they all came through the listener, so we wouldn't need
this kind of translation.  It shouldn't be difficult to map addresses
passed in via nlmclnt_init() to AF_INET6.

But this is the kind of thing that makes "falling back" to an AF_INET
listener a little challenging.  We will have to record what flavor the
listener is and do a translation depending on what listener family was
actually created.

Why? Should we care whether we're receiving IPv4 addresses or IPv6
v4-mapped addresses? They're the same thing...

The problem is the listener family is now decided at run-time.  If an
AF_INET6 listener can't be created, an AF_INET listener is created
instead, even if CONFIG_IPV6 || CONFIG_IPV6_MODULE is enabled.  If an
AF_INET listener is created, we get only IPv4 addresses in
svc_rqst->rq_addr.

You're missing my point. Why should we care if it's one or the other? In
the NFSv4 case, we v4map all IPv4 addresses _unconditionally_ if it
turns out that CONFIG_IPV6 is enabled.

IOW: we always compare IPv6 addresses.

The reason we might care in this case is nlm_cmp_addr() is executed
more frequently than nfs_sockaddr_match_ipaddr().

Mapping the server address in nlmclnt_init() means we translate the
server address once and are done with it.  We never have to map
incoming AF_INET addresses in NLM requests, and we don't have the
extra conditionals every time we go through nlm_cmp_addr().

This keeps nlm_cmp_addr() as simple as it can be: it compares only two
AF_INET addresses or two AF_INET6 addresses.

I don't see how that changes the general principle. All it means is that
you should be caching v4 mapped addresses instead of ipv4 addresses.
That would allow you to simplify nlm_cmp_addr() even further...

Operationally we have to support both AF_INET and AF_INET6 addresses  
in the cache, because we don't know what kind of lockd listener can be  
created until runtime.  So, I can't see how we can eliminate the  
AF_INET arm in nlm_cmp_addr() unless we unconditionally convert all  
incoming AF_INET addresses from putative PF_INET listeners _and_  
convert incoming IPv4 server addresses in NFS mount requests to  
AF_INET6.

Doesn't that add computational overhead to a fairly common case?
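
For illustration only, here is a userspace sketch of a comparator that
keeps both arms and does no mapping at all.  It is not the kernel's
nlm_cmp_addr(); the names cmp_addr_same_family(), make_v4() and
make_v6() are made up for this example.  The point is that each arm is
a plain compare, and an IPv4 address never matches its v4-mapped IPv6
form unless somebody maps it first.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical stand-in for a same-family-only address comparison. */
static bool cmp_addr_same_family(const struct sockaddr *a,
				 const struct sockaddr *b)
{
	if (a->sa_family != b->sa_family)
		return false;	/* no cross-family mapping is attempted */

	switch (a->sa_family) {
	case AF_INET: {
		const struct sockaddr_in *a4 = (const struct sockaddr_in *)a;
		const struct sockaddr_in *b4 = (const struct sockaddr_in *)b;

		return a4->sin_addr.s_addr == b4->sin_addr.s_addr;
	}
	case AF_INET6: {
		const struct sockaddr_in6 *a6 = (const struct sockaddr_in6 *)a;
		const struct sockaddr_in6 *b6 = (const struct sockaddr_in6 *)b;

		return memcmp(&a6->sin6_addr, &b6->sin6_addr,
			      sizeof(a6->sin6_addr)) == 0;
	}
	default:
		return false;
	}
}

/* Test helper (hypothetical): build an AF_INET address from text. */
static struct sockaddr_in make_v4(const char *text)
{
	struct sockaddr_in sin;
	int rc;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	rc = inet_pton(AF_INET, text, &sin.sin_addr);
	assert(rc == 1);
	(void)rc;
	return sin;
}

/* Test helper (hypothetical): build an AF_INET6 address from text. */
static struct sockaddr_in6 make_v6(const char *text)
{
	struct sockaddr_in6 sin6;
	int rc;

	memset(&sin6, 0, sizeof(sin6));
	sin6.sin6_family = AF_INET6;
	rc = inet_pton(AF_INET6, text, &sin6.sin6_addr);
	assert(rc == 1);
	(void)rc;
	return sin6;
}
```

With this shape, 192.0.2.1 and ::ffff:192.0.2.1 compare as different
hosts, which is why the cached server address has to be in the same
family as the addresses the listener hands us.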

This goes away if we ensure that the address family of the server  
address passed to nlmclnt_lookup_host() always matches the protocol  
family of lockd's listener sockets.  Then address mapping overhead is  
entirely removed from the common cases involving PF_INET listeners.

For PF_INET6 listeners, incoming IPv4 addresses are already mapped by  
the underlying network layer.  Nothing can be done about that.  But we  
can make sure the address family of the server address passed to  
nlmclnt_lookup_host() matches the incoming mapped addresses to  
eliminate the need for nlm_cmp_addr() to do the mapping every time it  
wants to compare an address.

It should be fairly simple to record the listener's protocol family,  
check it against incoming server addresses in nlmclnt_init(), then map  
the address as needed.
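
A rough userspace sketch of that step, under stated assumptions: the
variable listener_family and the functions map_v4_to_v6() and
normalize_server_addr() are hypothetical names, not existing lockd
symbols, and the case of an IPv6 server address with an AF_INET
listener is simply rejected here rather than handled.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical: family of the listener that was actually created. */
static sa_family_t listener_family = AF_INET6;

/* Build the v4-mapped IPv6 form (::ffff:a.b.c.d) of an IPv4 address. */
static void map_v4_to_v6(const struct sockaddr_in *sin,
			 struct sockaddr_in6 *sin6)
{
	memset(sin6, 0, sizeof(*sin6));
	sin6->sin6_family = AF_INET6;
	sin6->sin6_port = sin->sin_port;
	sin6->sin6_addr.s6_addr[10] = 0xff;
	sin6->sin6_addr.s6_addr[11] = 0xff;
	memcpy(&sin6->sin6_addr.s6_addr[12], &sin->sin_addr.s_addr, 4);
}

/*
 * Normalize a server address once, at init time, so that its family
 * matches the listener's.  Returns 0 on success, -1 if this sketch
 * does not cover the combination.
 */
static int normalize_server_addr(const struct sockaddr *in,
				 struct sockaddr_storage *out)
{
	assert(listener_family == AF_INET || listener_family == AF_INET6);
	memset(out, 0, sizeof(*out));

	if (in->sa_family == listener_family) {
		/* Already in the listener's family: copy unchanged. */
		size_t len = in->sa_family == AF_INET ?
			sizeof(struct sockaddr_in) :
			sizeof(struct sockaddr_in6);

		memcpy(out, in, len);
		return 0;
	}

	if (in->sa_family == AF_INET && listener_family == AF_INET6) {
		/* IPv4 server, PF_INET6 listener: map exactly once. */
		map_v4_to_v6((const struct sockaddr_in *)in,
			     (struct sockaddr_in6 *)out);
		return 0;
	}

	return -1;	/* e.g. IPv6 server address, AF_INET listener */
}
```

After this runs once per server address, every later comparison sees
two addresses of the same family.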

Having nlm_cmp_addr() do the mapping solves some problems, but at the  
cost of extra CPU time every time it is called; each loop iteration in  
nlm_lookup_host() for example.  All I'm doing is removing a loop  
invariant, essentially.
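
To make the loop-invariant argument concrete, here is a small
userspace sketch that counts mapping operations.  It is not the
nlm_lookup_host() code: the cache is a plain array, and map_v4(),
map_calls, lookup_map_in_loop() and lookup_map_once() are names
invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <netinet/in.h>
#include <arpa/inet.h>

static unsigned long map_calls;	/* instrumentation for this sketch only */

/* Produce the v4-mapped IPv6 form of an IPv4 address. */
static struct in6_addr map_v4(struct in_addr v4)
{
	struct in6_addr v6;

	memset(&v6, 0, sizeof(v6));
	v6.s6_addr[10] = 0xff;
	v6.s6_addr[11] = 0xff;
	memcpy(&v6.s6_addr[12], &v4.s_addr, 4);
	map_calls++;
	return v6;
}

/* Variant A: the key is mapped again on every comparison. */
static int lookup_map_in_loop(const struct in6_addr *cache, size_t n,
			      struct in_addr key)
{
	for (size_t i = 0; i < n; i++) {
		struct in6_addr k6 = map_v4(key);	/* same result each time */

		if (memcmp(&cache[i], &k6, sizeof(k6)) == 0)
			return (int)i;
	}
	return -1;
}

/* Variant B: the key is mapped once, before the loop. */
static int lookup_map_once(const struct in6_addr *cache, size_t n,
			   struct in_addr key)
{
	struct in6_addr k6 = map_v4(key);

	for (size_t i = 0; i < n; i++)
		if (memcmp(&cache[i], &k6, sizeof(k6)) == 0)
			return (int)i;
	return -1;
}
```

Both variants return the same index; variant A performs one mapping
per entry visited, variant B performs one in total.  Mapping in
nlmclnt_init() goes a step further than variant B, since the mapping
is then done once per server address rather than once per lookup.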

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com