On Fri, 24 Jan 2020 19:43:23 +0100, Oliver Hartkopp wrote:
> Hi Kurt, Dmitry,
>
> On 21/01/2020 21.39, Kurt Van Dijck wrote:
>
> >Maybe move the crosslinking to before the register, then they're
> >inaccessible from userspace.
>
> I think I found the problem:

Well done!

>
> [ 1814.648904] bond5128: (slave vxcan1): Error -22 calling dev_set_mtu
> [ 1814.649124] dev_rcv_lists == NULL! 000000008e41fb06 (bond5128)
>
> The bonding netdev bond5128 enslaved the vxcan1 netdev. As vxcan1 is a CAN
> netdev with ARPHRD_CAN the bonding process executes
>

So you were able to make sense of the syscalls, then?

>     if (slave_dev->type != ARPHRD_ETHER)
>         bond_setup_by_slave(bond_dev, slave_dev);
>
> in bond_enslave() in .../bonding/bond_main.c
>
> Which does this:
>
> static void bond_setup_by_slave(struct net_device *bond_dev,
>                                 struct net_device *slave_dev)
> {
>         bond_dev->header_ops = slave_dev->header_ops;
>
>         bond_dev->type = slave_dev->type;
>         bond_dev->hard_header_len = slave_dev->hard_header_len;
>         bond_dev->addr_len = slave_dev->addr_len;
>
>         memcpy(bond_dev->broadcast, slave_dev->broadcast,
>                 slave_dev->addr_len);
> }
>
> So bond5128 becomes an ARPHRD_CAN interface BUT without having a
> netdev_priv() space which contains our lovely can_ml_priv structure with the
> dev_rcv_lists for the CAN filters.
>
> I was able to confirm the bisected commit but the crashes still were pure
> luck IMO.
>
> can_rx_register() accesses netdev_priv() of the bonding device - but there
> are no CAN filters. BAM!
>
> So we need to make sure that ARPHRD_CAN dev->type can not be enslaved by the
> bonding driver.

This implies modifying bond_main.c, right?

>
> Best regards,
> Oliver
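
To make the idea concrete, here is a minimal userspace sketch of the kind of guard discussed above: refuse the enslave request before the bond device ever takes over the slave's type. This is not the actual bond_main.c code and not a proposed patch. All mock_* names, the reduced struct, and the choice of -EOPNOTSUPP as the error code are assumptions made for illustration; only the two type values mirror ARPHRD_ETHER and ARPHRD_CAN from <linux/if_arp.h>.

```c
#include <errno.h>
#include <string.h>

/* Values mirror ARPHRD_ETHER and ARPHRD_CAN in <linux/if_arp.h>. */
#define MOCK_ARPHRD_ETHER 1
#define MOCK_ARPHRD_CAN   280
#define MOCK_MAX_ADDR_LEN 32

/* Hypothetical, heavily reduced stand-in for struct net_device. */
struct mock_net_device {
    unsigned short type;
    unsigned short hard_header_len;
    unsigned char addr_len;
    unsigned char broadcast[MOCK_MAX_ADDR_LEN];
};

/* Same shape as the quoted bond_setup_by_slave(): the bond device
 * takes over the type-dependent attributes of the slave. */
void mock_setup_by_slave(struct mock_net_device *bond,
                         const struct mock_net_device *slave)
{
    bond->type = slave->type;
    bond->hard_header_len = slave->hard_header_len;
    bond->addr_len = slave->addr_len;
    memcpy(bond->broadcast, slave->broadcast, slave->addr_len);
}

/* Hypothetical guard: reject CAN devices before any attribute is
 * copied, so the bond device can never end up with the CAN type
 * while lacking the CAN private data. */
int mock_enslave(struct mock_net_device *bond,
                 const struct mock_net_device *slave)
{
    if (slave->type == MOCK_ARPHRD_CAN)
        return -EOPNOTSUPP; /* error code is an assumption */

    if (slave->type != MOCK_ARPHRD_ETHER)
        mock_setup_by_slave(bond, slave);

    return 0;
}
```

With this guard, calling mock_enslave() with a slave of type MOCK_ARPHRD_CAN returns an error and leaves the bond device untouched, while other non-Ethernet types still go through the setup-by-slave path as before.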