On Tue, 2018-03-13 at 12:35 -0700, Stephen Hemminger wrote: > On Tue, 13 Mar 2018 20:06:50 +0100 > Mohammed Gamal <mgamal@xxxxxxxxxx> wrote: > > > Dring high network traffic changes to network interface parameters > > such as number of channels or MTU can cause a kernel panic with a > > NULL > > pointer dereference. This is due to netvsc_device_remove() being > > called and deallocating the channel ring buffers, which can then be > > accessed by netvsc_send_pkt() before they're allocated on calling > > netvsc_device_add() > > > > The patch fixes this problem by checking the channel state and > > returning > > ENODEV if not yet opened. We also move the call to > > hv_ringbuf_avail_percent() > > which may access the uninitialized ring buffer. > > > > Signed-off-by: Mohammed Gamal <mgamal@xxxxxxxxxx> > > --- > > drivers/net/hyperv/netvsc.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/net/hyperv/netvsc.c > > b/drivers/net/hyperv/netvsc.c > > index 0265d70..44a8358 100644 > > --- a/drivers/net/hyperv/netvsc.c > > +++ b/drivers/net/hyperv/netvsc.c > > @@ -757,7 +757,7 @@ static inline int netvsc_send_pkt( > > struct netdev_queue *txq = netdev_get_tx_queue(ndev, > > packet->q_idx); > > u64 req_id; > > int ret; > > - u32 ring_avail = hv_ringbuf_avail_percent(&out_channel- > > >outbound); > > + u32 ring_avail; > > > > nvmsg.hdr.msg_type = NVSP_MSG1_TYPE_SEND_RNDIS_PKT; > > if (skb) > > @@ -773,7 +773,7 @@ static inline int netvsc_send_pkt( > > > > req_id = (ulong)skb; > > > > - if (out_channel->rescind) > > + if (out_channel->rescind || out_channel->state != > > CHANNEL_OPENED_STATE) > > return -ENODEV; > > > > if (packet->page_buf_cnt) { > > @@ -791,6 +791,7 @@ static inline int netvsc_send_pkt( > > VMBUS_DATA_PACKET_FLAG_COMP > > LETION_REQUESTED); > > } > > > > + ring_avail = hv_ringbuf_avail_percent(&out_channel- > > >outbound); > > if (ret == 0) { > > atomic_inc_return(&nvchan->queue_sends); > > > > Thanks for your patch. Yes there are races with the current update > logic. The root cause goes higher up in the flow; the send queues > should > be stopped before netvsc_device_remove is called. Solving it where > you tried > to is racy and not going to work reliably. > > Network patches should go to netdev@xxxxxxxxxxxxxxx > > You can't move the ring_avail check until after the vmbus_sendpacket > because > that will break the flow control logic. > Why? I don't see ring_avail being used before that point. > Instead, you should just move the avail_read check until just after > the existing rescind > check. > > Also, you shouldn't need to check for OPENED_STATE, just rescind is > enough. That rarely mitigated the race. channel->rescind flag is set on vmbus exit - called on module unload - and when a rescind offer is received from the host, which AFAICT doesn't happen on every call to netvsc_device_remove, so it's quite possible that the ringbuffer is accessed before it's allocated again on channel open and hence the check for OPENED_STAT - which is only set after all vmbus data is initialized. > > > > _______________________________________________ devel mailing list devel@xxxxxxxxxxxxxxxxxxxxxx http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel