On Thu, May 28, 2015 at 12:58:59AM +0300, Or Gerlitz wrote: > On Wed, May 27, 2015 at 8:53 PM, Doug Ledford <dledford@xxxxxxxxxx> wrote: > > > On Wed, 2015-05-27 at 11:11 -0600, Jason Gunthorpe wrote: > > Well, let's just be clear: netlink/iproute2 screwed the pooch on their > > implementation of this stuff. Any solution that doesn't include fixing > > this up in some way is not really a good solution. > > >> For instance iproute2 would need IB specific code to format > >> the 'ip link show' (review print_vfinfo in iproute2) and to length > >> check 'ip link set vf mac' > > >> If we do use 8, then it would be ideal (and my strong preference) to > >> also fix the IFLA_VF_MAC message to have a working length. I think > >> that could be done compatibly with a bit of work. At least that way > >> iproute2 can be kept clean when it learns to do IB, and we could have > >> the option again of using 20 someday if we need. > >> > > > Sounds like a reasonable plan. > > Or, this is your patch set, are you going to pick up these action items? > > Let see > > >> So to be clear, to go with the 8 byte option I suggest: > > >> - Engage netdev/iproute and confirm they are philosophically OK > >> with IFLA_VF_MAC != IFLA_ADDRESS > > the last thing netdev are is having philosophical views, they are very > practical (and non-perfect and happy and lively community), the one > thing before the last netdev are is caring for the rdma subsystem. If > something has to change @ their direct part, we should come with > patches. Yes, you need to talk in patches, I will suggest you start with this (totally untested, quickly thrown together): diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 8de36824018d..715b59e9925a 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -1172,7 +1172,7 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev, nla_nest_cancel(skb, vfinfo); goto nla_put_failure; } - if (nla_put(skb, IFLA_VF_MAC, sizeof(vf_mac), &vf_mac) || + if (nla_put(skb, IFLA_VF_MAC, sizeof(u32) + dev->addr_len, &vf_mac) || nla_put(skb, IFLA_VF_VLAN, sizeof(vf_vlan), &vf_vlan) || nla_put(skb, IFLA_VF_RATE, sizeof(vf_rate), &vf_rate) || @@ -1292,7 +1292,8 @@ static const struct nla_policy ifla_vfinfo_policy[IFLA_VF_INFO_MAX+1] = { }; static const struct nla_policy ifla_vf_policy[IFLA_VF_MAX+1] = { - [IFLA_VF_MAC] = { .len = sizeof(struct ifla_vf_mac) }, + [IFLA_VF_MAC] = { .type = NLA_BINARY, + .len = sizeof(struct ifla_vf_mac) }, [IFLA_VF_VLAN] = { .len = sizeof(struct ifla_vf_vlan) }, [IFLA_VF_TX_RATE] = { .len = sizeof(struct ifla_vf_tx_rate) }, [IFLA_VF_SPOOFCHK] = { .len = sizeof(struct ifla_vf_spoofchk) }, @@ -1447,6 +1448,21 @@ static int do_setvfinfo(struct net_device *dev, struct nlattr *attr) switch (nla_type(vf)) { case IFLA_VF_MAC: { struct ifla_vf_mac *ivm; + int sz; + + /* Legacy compatability, old iproute2 will pass in 32 + * bytes for everything, assume it is a 6 byte MAC + * because that is all old iproute really supported. + */ + sz = nla_len(attr); + if (sz == 36) + sz = 10; + + if (sz != (sizeof(u32) + dev->addr_len)) { + err = -EINVAL; + break; + } + ivm = nla_data(vf); err = -EOPNOTSUPP; if (ops->ndo_set_vf_mac) And make the argument why the above is OK. - Reader side only ever read 6 bytes anyhow, so it doesn't matter that ifla_vf_mac is undersized in the RTA_DATA - Confirm any other users of this (openstack?) you know are similarly OK. - Old write side will always send the full 36 byte struct, so assume 6 bytes. - Justify this as needing to add IB support next which does not use a 6 byte addr_len New code that wants to work with IB will have to use correct lengths. Then based on that iproute2 can be fixed in the two areas I pointed out, doing that at the same time will probably earn some rep. Finally you can make a patch that switches addr_len to something else - that would trigger the philosophical question of 'what actually is a IFLA_VF_MAC', if everyone is OK with changing it then golden. > >> - Make a iproute patch to use the IFLA_VF_MAC size in print_vfinfo > >> instead of hardcoded ETH_ALEN (using len == 32 mean len 6 for compat) > > I was thinking to patch iproute to sense the link type: if eth print > six bytes, if ipoib print 8 bytes, simple. This is against the design ideals of netlink: a tool like iproute should not need link specific knowledge to parse kernel messages. Now is the best time to make this fix because we can use the sz == 36 so sz == 10 assumption. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html