On Tue, Jul 28, 2020 at 9:44 AM Johannes Naab
<johannes.naab@xxxxxxxxxxxxxxxx> wrote:
>
> On 2020-07-28 14:49, Jason Dillaman wrote:
> >> VM in libvirt with:
> >>
> >>     <disk type='network' device='disk'>
> >>       <driver name='qemu' type='raw' discard='unmap'/>
> >>       <source protocol='rbd' name='pool/disk' index='4'>
> >>         <!-- omitted -->
> >>       </source>
> >>       <iotune>
> >>         <read_bytes_sec>209715200</read_bytes_sec>
> >>         <write_bytes_sec>209715200</write_bytes_sec>
> >>         <read_iops_sec>5000</read_iops_sec>
> >>         <write_iops_sec>5000</write_iops_sec>
> >>         <read_bytes_sec_max>314572800</read_bytes_sec_max>
> >>         <write_bytes_sec_max>314572800</write_bytes_sec_max>
> >>         <read_iops_sec_max>7500</read_iops_sec_max>
> >>         <write_iops_sec_max>7500</write_iops_sec_max>
> >>         <read_bytes_sec_max_length>60</read_bytes_sec_max_length>
> >>         <write_bytes_sec_max_length>60</write_bytes_sec_max_length>
> >>         <read_iops_sec_max_length>60</read_iops_sec_max_length>
> >>         <write_iops_sec_max_length>60</write_iops_sec_max_length>
> >>       </iotune>
> >>     </disk>
> >>
> >> workload:
> >>
> >>     fio --rw=write --name=test --size=10M
> >>     timeout 30s fio --rw=write --name=test --size=20G
> >>     timeout 3m fio --rw=write --name=test --size=20G --direct=1
> >>     timeout 1m fio --rw=randrw --name=test --size=20G --direct=1
> >>     timeout 10s fio --numjobs=8 --rw=randrw --name=test --size=1G --direct=1
> >>     # the backtraces are then observed while the following command is running
> >>     fio --ioengine=libaio --iodepth=16 --numjobs=8 --rw=randrw --name=test --size=1G --direct=1
> >
> > I'm not sure I understand this workload. Are you running these 6 "fio"
> > processes sequentially or concurrently? Does it only crash on that
> > last one? Do you have "exclusive-lock" enabled on the image, since
> > "--numjobs 8" would cause lots of lock fighting if it were enabled?
>
> The workload is a virtual machine with the above libvirt device
> configuration. Within that virtual machine, the workload is run
> sequentially (as the script crash.sh) on the XFS-formatted device.
>
> I.e. librbd/Ceph should only see the one QEMU process, which is then
> running the workload.
>
> Only the last fio invocation causes the problems.
> When skipping some of the fio invocations (I did not test this
> exhaustively), the crash is no longer reliably triggered.

Hmm, all those crash backtraces are in
"AioCompletion::complete_event_socket", but QEMU does not have any code
that utilizes the event socket notification system. AFAIK, only the fio
librbd engine has integrated with that callback system.

> > Are all the crashes seg faults? They all seem to hint that the
> > internal ImageCtx instance was destroyed somehow while there was still
> > in-flight IO. If the crashes appeared during the "timeout XYZ fio ..."
> > calls, I would think it's highly likely that "fio" is incorrectly
> > closing the RBD image via its signal handler while there was still
> > in-flight IO.
>
> They are all segfaults of the QEMU process, captured on the host system.
> librbd should not see any image open/close during the workload run
> within the VM.
> The `timeout` is used to approximate the initial (manual) workload
> generation, which caused a crash.

-- 
Jason
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
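
For context, a minimal sketch (not from this thread) of the drain-before-close
pattern Jason's hypothesis refers to: a librbd client should wait for all queued
AIO completions and flush before closing the image, since closing while I/O is
still in flight leaves completion callbacks racing against ImageCtx teardown.
The pool and image names are assumed from the libvirt XML above, the default
ceph.conf search path is assumed, and error checks are trimmed for brevity.

    /* Hypothetical sketch: drain in-flight AIO before rbd_close(). */
    #include <rados/librados.h>
    #include <rbd/librbd.h>
    #include <string.h>

    int main(void)
    {
        rados_t cluster;
        rados_ioctx_t ioctx;
        rbd_image_t image;
        rbd_completion_t comp;
        char buf[4096];

        memset(buf, 0, sizeof(buf));

        rados_create(&cluster, NULL);         /* connect as the default client */
        rados_conf_read_file(cluster, NULL);  /* default ceph.conf search path */
        rados_connect(cluster);
        rados_ioctx_create(cluster, "pool", &ioctx);   /* pool name assumed */
        rbd_open(ioctx, "disk", &image, NULL);         /* image name assumed */

        /* queue one asynchronous write */
        rbd_aio_create_completion(NULL, NULL, &comp);
        rbd_aio_write(image, 0, sizeof(buf), buf, comp);

        /* Drain before closing: wait for the completion, release it, flush,
         * and only then close.  Skipping this step (e.g. from a signal
         * handler) is the "close with in-flight IO" case described above. */
        rbd_aio_wait_for_complete(comp);
        rbd_aio_release(comp);
        rbd_flush(image);

        rbd_close(image);
        rados_ioctx_destroy(ioctx);
        rados_shutdown(cluster);
        return 0;
    }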