Chuck Lever wrote:
Hi Roland-
On 04/02/2010 01:22 PM, Roland Dreier wrote:
> > The write_ports code will fail both the INET4 and INET6 transport
> > creation if the transport returns an error when PF_INET6 is specified.
> > Some transports that do not support INET6 return an error other than
> > EAFNOSUPPORT.
>
> That's the real bug. Any reason the RDMA RPC transport can't return
> EAFNOSUPPORT in this case?
I think Tom's changelog is misleading. The problem is that the RDMA
transport actually does support IPv6, but it doesn't support the
IPV6ONLY option yet. So if NFS/RDMA binds to a port for IPv4, then the
IPv6 bind fails because of the port collision.
IPV6ONLY is a requirement for RPC over IPv6. If the underlying
transport does not support IPV6ONLY, then it cannot properly support
RPC over IPv6. It's easy enough to catch listener creation calls for
IPv6 on such transports, and simply return EAFNOSUPPORT until support
for IPV6ONLY can be provided.
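
For readers who have not run into the collision, here is a minimal user-space sketch of it using ordinary TCP sockets rather than the RDMA CM. The helper names open_listener() and bind_both_families() are made up for illustration and are not part of the NFS or RDMA code. The sketch binds an IPv4 wildcard listener first, then tries an IPv6 wildcard listener on the same port with or without IPV6_V6ONLY.

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a wildcard TCP listener for one address
 * family. Returns the descriptor, or a negative errno on failure. */
int open_listener(int family, unsigned short port, int v6only)
{
	int err;
	int fd = socket(family, SOCK_STREAM, 0);

	if (fd < 0)
		return -errno;

	if (family == AF_INET6) {
		struct sockaddr_in6 a6;

		/* With v6only set, this socket claims only the IPv6 side
		 * of the port and leaves the IPv4 side alone. */
		if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY,
			       &v6only, sizeof(v6only)) < 0)
			goto fail;
		memset(&a6, 0, sizeof(a6));
		a6.sin6_family = AF_INET6;
		a6.sin6_addr = in6addr_any;
		a6.sin6_port = htons(port);
		if (bind(fd, (struct sockaddr *)&a6, sizeof(a6)) < 0)
			goto fail;
	} else {
		struct sockaddr_in a4;

		memset(&a4, 0, sizeof(a4));
		a4.sin_family = AF_INET;
		a4.sin_addr.s_addr = htonl(INADDR_ANY);
		a4.sin_port = htons(port);
		if (bind(fd, (struct sockaddr *)&a4, sizeof(a4)) < 0)
			goto fail;
	}

	if (listen(fd, 8) < 0)
		goto fail;
	return fd;

fail:
	err = errno;
	close(fd);
	return -err;
}

/* Hypothetical helper: bind IPv4 first on an ephemeral port, then try
 * IPv6 on the same port. Returns 0 if both listeners come up, else the
 * negative errno of the step that failed. */
int bind_both_families(int v6only)
{
	struct sockaddr_in a4;
	socklen_t len = sizeof(a4);
	int fd4, fd6;

	fd4 = open_listener(AF_INET, 0, 0);
	if (fd4 < 0)
		return fd4;
	if (getsockname(fd4, (struct sockaddr *)&a4, &len) < 0) {
		int err = errno;

		close(fd4);
		return -err;
	}

	fd6 = open_listener(AF_INET6, ntohs(a4.sin_port), v6only);
	close(fd4);
	if (fd6 < 0)
		return fd6;
	close(fd6);
	return 0;
}
```

On a Linux host with IPv6 available, I would expect bind_both_families(1) to succeed and bind_both_families(0) to fail on the IPv6 bind with EADDRINUSE, which is the same kind of collision described above.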
The __write_ports() interface is specifically designed to silently
fall back to IPv4-only when IPv6 transport creation fails with
EAFNOSUPPORT. I don't see a good reason to change the generic logic
in __write_ports() if there is a problem with implementing RPC over
IPv6 in a specific transport capability. __write_ports() will do the
right thing if the transport returns the correct error code.
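
To make the fallback concrete, here is a rough user-space model of that logic. It is a sketch, not the kernel code: add_listeners() and the three create_*() callbacks are hypothetical stand-ins for __write_ports() and a transport's listener creation hook, and the port numbers in the usage are arbitrary.

```c
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical stand-in for a transport's "create listener" hook.
 * Returns 0 on success or a negative errno. */
typedef int (*create_fn)(int family, unsigned short port);

/* A transport that supports both address families. */
int create_dual_stack(int family, unsigned short port)
{
	(void)family;
	(void)port;
	return 0;
}

/* A transport that declines IPv6 with the expected error code. */
int create_rdma_like(int family, unsigned short port)
{
	(void)port;
	return family == PF_INET6 ? -EAFNOSUPPORT : 0;
}

/* A transport that lets the IPv6 bind collide with the IPv4 one. */
int create_colliding(int family, unsigned short port)
{
	(void)port;
	return family == PF_INET6 ? -EADDRINUSE : 0;
}

/* Model of the generic logic: IPv4 must succeed; an IPv6 failure is
 * ignored only when it is EAFNOSUPPORT. */
int add_listeners(create_fn create, unsigned short port, int *nlisteners)
{
	int err;

	*nlisteners = 0;

	err = create(PF_INET, port);
	if (err < 0)
		return err;
	++*nlisteners;

	err = create(PF_INET6, port);
	if (err == -EAFNOSUPPORT)
		return 0;		/* silently stay IPv4-only */
	if (err < 0) {
		/* Any other error fails the whole request; real code
		 * would also tear down the IPv4 listener here. */
		*nlisteners = 0;
		return err;
	}
	++*nlisteners;
	return 0;
}
```

In this model, add_listeners(create_rdma_like, 20049, &n) returns 0 with n == 1, while add_listeners(create_colliding, 20049, &n) returns -EADDRINUSE and leaves no listeners. The only difference between the two transports is the error code returned for PF_INET6.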
Implementing the IPV6ONLY option for RDMA binding is probably not
feasible for 2.6.34, so the best band-aid for now seems to be Tom's
patch.
My recent experience with similar changes suggests the specific
solution Tom proposed will trigger extra bug reports and e-mails, as
the change appears to affect non-RDMA transports as well. This printk
might fire, for example, for INET transports on systems that are built
without IPv6 support, or where ipv6.ko is blacklisted in user space.
In other words, I agree that there's a bug that should be addressed in
2.6.34, and I don't have any problem with setting up only an IPv4
listener in this case. But I think the addition of a printk that
fires for all transports in this case is problematic.
This makes sense to me.
It would be better to address this in the RPC/RDMA transport
capability, and not in generic upper level logic. We already have
correct behavior in __write_ports, and the RPC/RDMA transport
capability should be changed to use it.
So it seems reasonable to me to fail svc_create_xprt("rdma", PF_INET6)
with EAFNOSUPPORT because the RDMA transport does not support the
IPV6ONLY setsockopt.
I will post a patch that does this.
Thanks,
Tom