On 10.02.20 10:40, Eugenio Perez Martin wrote:
> Hi Christian.
>
> I'm not able to reproduce the failure with eccb852f1fe6bede630e2e4f1a121a81e34354ab commit. Could you add more data? Your configuration (libvirt or qemu line), and host's dmesg output if any?

I do the following in the guest:

ping -c 200 -f somevalidip; reboot

sometimes I need to do that multiple times and sometimes I do not get a guest crash
but host dmesg like

Guest moved used index from 0 to 292

xml is pretty simple

<interface type='direct'>
  <mac address='52:54:00:7c:2c:f3'/>
  <source dev='encbd00' mode='bridge'/>
  <model type='virtio'/>
  <driver name='vhost'/>
  <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0001'/>
</interface>

Reverting this patch seems to make both problems go away.

>
> Thanks!
>
> On Fri, Feb 7, 2020 at 9:13 AM Christian Borntraeger <borntraeger@xxxxxxxxxx> wrote:
>
>
>
> On 07.02.20 08:58, Michael S. Tsirkin wrote:
> > On Fri, Feb 07, 2020 at 08:47:14AM +0100, Christian Borntraeger wrote:
> >> Also adding Cornelia.
> >>
> >>
> >> On 06.02.20 23:17, Michael S. Tsirkin wrote:
> >>> On Thu, Feb 06, 2020 at 04:12:21PM +0100, Christian Borntraeger wrote:
> >>>>
> >>>>
> >>>> On 06.02.20 15:22, eperezma@xxxxxxxxxx wrote:
> >>>>> Hi Christian.
> >>>>>
> >>>>> Could you try this patch on top of ("38ced0208491 vhost: use batched version by default")?
> >>>>>
> >>>>> It will not solve your first random crash but it should help with the lost of network connectivity.
> >>>>>
> >>>>> Please let me know how does it goes.
> >>>>
> >>>>
> >>>> 38ced0208491 + this seem to be ok.
> >>>>
> >>>> Not sure if you can make out anything of this (and the previous git bisect log)
> >>>
> >>> Yes it does - that this is just bad split-up of patches, and there's
> >>> still a real bug that caused worse crashes :)
> >>>
> >>> So I just pushed batch-v4.
> >>> I expect that will fail, and bisect to give us
> >>> vhost: batching fetches
> >>> Can you try that please?
> >>>
> >>
> >> yes.
> >>
> >> eccb852f1fe6bede630e2e4f1a121a81e34354ab is the first bad commit
> >> commit eccb852f1fe6bede630e2e4f1a121a81e34354ab
> >> Author: Michael S. Tsirkin <mst@xxxxxxxxxx>
> >> Date: Mon Oct 7 06:11:18 2019 -0400
> >>
> >>     vhost: batching fetches
> >>
> >>     With this patch applied, new and old code perform identically.
> >>
> >>     Lots of extra optimizations are now possible, e.g.
> >>     we can fetch multiple heads with copy_from/to_user now.
> >>     We can get rid of maintaining the log array. Etc etc.
> >>
> >>     Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
> >>
> >> drivers/vhost/test.c  |  2 +-
> >> drivers/vhost/vhost.c | 39 ++++++++++++++++++++++++++++++++++-----
> >> drivers/vhost/vhost.h |  4 +++-
> >> 3 files changed, 38 insertions(+), 7 deletions(-)
> >>
> >
> >
> > And the symptom is still the same - random crashes
> > after a bit of traffic, right?
>
> random guest crashes after a reboot of the guests. As if vhost would still
> write into now stale buffers.
>