Re: general protection fault in can_rx_register

Oliver,

I trimmed the CC list a bit, as I'm mostly speculating now:

Since the problem happens only rarely, and with vxcan (I assume not with
vcan), I started to suspect a locking issue.

1. What surprised me a bit is the 'rtnl_dereference()' calls without
rcu_read_lock() around them. Is that supposed to be OK?

2. Is it possible for vxcan_dellink() to be called in between the two
rcu_assign_pointer() calls in vxcan_newlink(), resulting in a dead end,
i.e. one end is still referenced but already deleted?
I'd expect some kind of rcu_write_lock around the cross-linking at least.
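
To make both questions concrete, here is a small userspace model, not
kernel code. It assumes that vxcan_newlink() and vxcan_dellink() are both
called with RTNL held (as far as I know, rtnl_link_ops callbacks are), and
that rtnl_dereference() is rcu_dereference_protected() with a lockdep check
that RTNL is held, i.e. an update-side accessor that needs no
rcu_read_lock(). All model_* names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: every model_* name here is hypothetical. */
struct model_dev {
	struct model_dev *peer;	/* stands in for vxcan_priv->peer */
	bool registered;	/* stands in for netdev registration state */
};

static bool model_rtnl_held;	/* stands in for the RTNL mutex */

static void model_rtnl_lock(void)
{
	assert(!model_rtnl_held);	/* a second caller would block here */
	model_rtnl_held = true;
}

static void model_rtnl_unlock(void)
{
	assert(model_rtnl_held);
	model_rtnl_held = false;
}

/* Models rtnl_dereference(): a plain load, valid only under the lock. */
static struct model_dev *model_rtnl_dereference(struct model_dev *const *slot)
{
	assert(model_rtnl_held);
	return *slot;
}

/* Models the cross-linking step of vxcan_newlink(). */
static void model_newlink(struct model_dev *dev, struct model_dev *peer)
{
	assert(model_rtnl_held);
	dev->registered = true;
	peer->registered = true;
	dev->peer = peer;	/* first rcu_assign_pointer() */
	peer->peer = dev;	/* second rcu_assign_pointer() */
}

/* Models vxcan_dellink(): clear both peer pointers, drop both ends. */
static void model_dellink(struct model_dev *dev)
{
	struct model_dev *peer;

	assert(model_rtnl_held);
	peer = model_rtnl_dereference(&dev->peer);
	dev->peer = NULL;
	dev->registered = false;
	if (peer) {
		peer->peer = NULL;
		peer->registered = false;
	}
}
```

If that assumption holds, dellink cannot run between the two assignments,
and the writer-side exclusion comes from RTNL rather than from RCU itself.
If some path reaches these functions without RTNL, the model does not apply.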

It still puzzles me how this bisected to the CAN_REQUIRED_SIZE() macro
commit.

Kurt

On Mon, 20 Jan 2020 23:35:16 +0100, Oliver Hartkopp wrote:
> 
> Answering myself ...
> 
> On 20/01/2020 23.02, Oliver Hartkopp wrote:
> 
> >
> >Added some code to check whether dev->ml_priv is NULL:
> >
> >~/linux$ git diff
> >diff --git a/net/can/af_can.c b/net/can/af_can.c
> >index 128d37a4c2e0..6fb4ae4c359e 100644
> >--- a/net/can/af_can.c
> >+++ b/net/can/af_can.c
> >@@ -463,6 +463,10 @@ int can_rx_register(struct net *net, struct net_device *dev, canid_t can_id,
> >         spin_lock_bh(&net->can.rcvlists_lock);
> >
> >         dev_rcv_lists = can_dev_rcv_lists_find(net, dev);
> >+       if (!dev_rcv_lists) {
> >+               pr_err("dev_rcv_lists == NULL! %p\n", dev);
> >+               goto out_unlock;
> >+       }
> >         rcv_list = can_rcv_list_find(&can_id, &mask, dev_rcv_lists);
> >
> >         rcv->can_id = can_id;
> >@@ -479,6 +483,7 @@ int can_rx_register(struct net *net, struct net_device *dev, canid_t can_id,
> >         rcv_lists_stats->rcv_entries++;
> >         rcv_lists_stats->rcv_entries_max = max(rcv_lists_stats->rcv_entries_max,
> >                                                rcv_lists_stats->rcv_entries);
> >+out_unlock:
> >         spin_unlock_bh(&net->can.rcvlists_lock);
> >
> >         return err;
> >
> >And the output (after some time) is:
> >
> >[  758.505841] netlink: 'crash': attribute type 1 has an invalid length.
> >[  758.508045] bond7148: (slave vxcan1): The slave device specified does not support setting the MAC address
> >[  758.508057] bond7148: (slave vxcan1): Error -22 calling dev_set_mtu
> >[  758.532025] bond10413: (slave vxcan1): The slave device specified does not support setting the MAC address
> >[  758.532043] bond10413: (slave vxcan1): Error -22 calling dev_set_mtu
> >[  758.532254] dev_rcv_lists == NULL! 000000006b9d257f
> >[  758.547392] netlink: 'crash': attribute type 1 has an invalid length.
> >[  758.549310] bond7145: (slave vxcan1): The slave device specified does not support setting the MAC address
> >[  758.549313] bond7145: (slave vxcan1): Error -22 calling dev_set_mtu
> >[  758.550464] netlink: 'crash': attribute type 1 has an invalid length.
> >[  758.552301] bond7146: (slave vxcan1): The slave device specified does not support setting the MAC address
> >
> >So we can see that we get an ml_priv pointer which is NULL, which should
> >not be possible due to this:
> >
> >https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/can/dev.c#n743
> 
> This reference doesn't point to the right code, as vxcan has its own
> handling to assign ml_priv in vxcan.c.
> 
> >Btw. the variable 'size' is set two times at the top of alloc_candev_mqs()
> >depending on echo_skb_max. This looks wrong.
> 
> No. It looks right; I just did not understand the ALIGN() macro at first sight.
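
For anyone else tripping over the same lines: as I read it, the second
assignment to 'size' builds on the first one instead of overwriting it. A
rough userspace sketch of that two-step computation follows. The model_*
names are made up, the struct sizes are passed in as plain numbers, and the
exact expressions in alloc_candev_mqs() are quoted from memory, so check
them against the tree.

```c
#include <stddef.h>

/* Rounds x up to a multiple of a (a must be a power of two), which is
 * what the kernel's ALIGN() does. MODEL_* / model_* are hypothetical. */
#define MODEL_ALIGN(x, a) (((x) + ((a) - 1)) & ~((size_t)(a) - 1))

#define MODEL_NETDEV_ALIGN 32	/* mirrors the kernel's NETDEV_ALIGN */

static size_t model_candev_size(size_t sizeof_priv, size_t sizeof_ml_priv,
				unsigned int echo_skb_max)
{
	size_t size;

	/* Step 1: driver priv, padded, followed by the mid-layer priv. */
	size = MODEL_ALIGN(sizeof_priv, MODEL_NETDEV_ALIGN) + sizeof_ml_priv;

	/* Step 2: only with echo skbs, pad again and append the pointer
	 * array. This reuses the 'size' computed in step 1. */
	if (echo_skb_max)
		size = MODEL_ALIGN(size, sizeof(void *)) +
		       echo_skb_max * sizeof(void *);

	return size;
}
```

So with echo_skb_max == 0 only the first value is used, and otherwise the
first value is an input to the second.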
> 
> But it is still an open question why dev->ml_priv is not set correctly in
> vxcan.c, as all the settings for .priv_size and in vxcan_setup() look fine.
> 
> Best regards,
> Oliver


