Hello,

On Fri, Mar 22, 2024 at 12:19 PM Igor Raits <igor@xxxxxxxxxxxx> wrote:
>
> Hi Jason,
>
> On Fri, Mar 22, 2024 at 9:39 AM Igor Raits <igor@xxxxxxxxxxxx> wrote:
> >
> > Hi Jason,
> >
> > On Fri, Mar 22, 2024 at 6:31 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> > >
> > > On Thu, Mar 21, 2024 at 5:44 PM Igor Raits <igor@xxxxxxxxxxxx> wrote:
> > > >
> > > > Hello Jason & others,
> > > >
> > > > On Wed, Mar 20, 2024 at 10:33 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Tue, Mar 19, 2024 at 9:15 PM Igor Raits <igor@xxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > Hello Stefan,
> > > > > >
> > > > > > On Tue, Mar 19, 2024 at 2:12 PM Stefan Hajnoczi <stefanha@xxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On Tue, Mar 19, 2024 at 10:00:08AM +0100, Igor Raits wrote:
> > > > > > > > Hello,
> > > > > > > >
> > > > > > > > We have started to observe kernel crashes on 6.7.y kernels (so far
> > > > > > > > we have hit the issue 5 times, on 6.7.5 and 6.7.10). 6.6.9, which
> > > > > > > > we run on other nodes of the cluster, looks stable. Please see the
> > > > > > > > stack trace below. If you need more information, please let me know.
> > > > > > > >
> > > > > > > > We do not have a consistent reproducer, but when we put a bigger
> > > > > > > > network load on a VM, the hypervisor's kernel crashes.
> > > > > > > >
> > > > > > > > Help is much appreciated! We are happy to test any patches.
> > > > > > >
> > > > > > > CCing Michael Tsirkin and Jason Wang for vhost_net.
> > > > > > >
> > > > > > > > [62254.167584] stack segment: 0000 [#1] PREEMPT SMP NOPTI
> > > > > > > > [62254.173450] CPU: 63 PID: 11939 Comm: vhost-11890 Tainted: G
> > > > > > > > E 6.7.10-1.gdc.el9.x86_64 #1
> > > > > > >
> > > > > > > Are there any patches in this kernel?
> > > > > >
> > > > > > Only one, unrelated to this part: removal of the pr_err("EEVDF scheduling
> > > > > > fail, picking leftmost\n"); line (reported somewhere a few months ago;
> > > > > > it was the suggested workaround until a proper solution arrives).
> > > > >
> > > > > Btw, a bisection would help as well.
> > > >
> > > > In the end it seems we don't really have a "stable" setup, so a
> > > > bisection looks to be useless, but we did find a few things in the
> > > > meantime:
> > > >
> > > > 1. On 6.6.9 it crashes either with an unexpected GSO type or with
> > > > "usercopy: Kernel memory exposure attempt detected from SLUB object
> > > > 'skbuff_head_cache'"
> > >
> > > Do you have a full calltrace for this?
> >
> > I have shared it in one of the messages in this thread:
> > https://marc.info/?l=linux-virtualization&m=171085443512001&w=2
> >
> > > > 2. On 6.7.5, 6.7.10 and 6.8.1 it crashes with RIP:
> > > > 0010:skb_release_data+0xb8/0x1e0
> > >
> > > And for this?
> >
> > https://marc.info/?l=linux-netdev&m=171083870801761&w=2
> >
> > > > 3. It does NOT crash on 6.8.1 when the VM does not have a multi-queue setup
> > > >
> > > > It looks like the multi-queue setup (we have 2 interfaces × 3 virtio
> > > > queues each) is causing the problem, as the issue is gone if we set
> > > > only one queue per interface.
> > > > Maybe there is some race condition in __pfx_vhost_task_fn+0x10/0x10 or
> > > > somewhere around it?
> > >
> > > I can't tell now, but it seems not, because with 3 queue pairs we
> > > will have 3 vhost threads.
> > >
> > > > We have noticed that there are 3 such functions in the stack trace,
> > > > which gave us hints about what we could try…
> > >
> > > Let's try to enable SLUB_DEBUG and KASAN to see if we can get
> > > something interesting.
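
For anyone trying to reproduce this, what the "slub_debug + kasan" setup
mentioned below boils down to is roughly the following; a sketch only,
exact option names can vary between kernel versions:

    # .config fragment for the debug build
    CONFIG_SLUB_DEBUG=y
    CONFIG_KASAN=y
    CONFIG_KASAN_GENERIC=y

    # SLUB consistency checks, red zoning, poisoning and owner tracking
    # can also be switched on at boot via the kernel command line:
    slub_debug=FZPU
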
> >
> > We were able to reproduce it even with 1 vhost queue... And now we
> > have slub_debug + kasan, so I hopefully have more useful data for you
> > now. I have attached it for better readability.
>
> Looks like we have found a "stable" kernel, and that is 6.1.32. The
> 6.3.y series is broken, and we are testing 6.2.y now.
> My guess is that it is related to "virtio/vsock: replace
> virtio_vsock_pkt with sk_buff", which was done around that time, but
> we are going to test, bisect and let you know more.

So we have been trying to bisect it, but that is basically impossible
for us: the ICE driver was quite broken for most of the release cycle,
so we had no networking on 99% of the builds and could not test such a
setup. More specifically, the bug was introduced between 6.2 and 6.3,
but we could not narrow it down much further. The last good commit we
were able to test was f18f9845f2f10d3d1fc63e4ad16ee52d2d9292fa, and
after 20 commits with no networking we gave up.

If you have some suspicious commit(s) we could revert - happy to test.

Thanks again.
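
P.S. For completeness, the bisect flow we attempted was roughly the
following; a sketch, with the skips standing in for the builds where
networking was dead:

    git bisect start v6.3 v6.2                                # mark v6.3 bad, v6.2 good
    git bisect good f18f9845f2f10d3d1fc63e4ad16ee52d2d9292fa  # last testable good commit
    # for each commit git checks out: build, boot, apply the network load
    git bisect bad     # hypervisor crashed
    git bisect skip    # no networking (broken ICE driver), cannot test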