Hello Jason,

On 12.09.19 at 16:56, Jason Dillaman wrote:
> On Thu, Sep 12, 2019 at 3:31 AM Marc Schöchlin <ms@xxxxxxxxxx> wrote:
>> What's that, have we seen that before? ("Numerical argument out of domain")
> It's the error that rbd-nbd prints when the kernel prematurely closes
> the socket ... and as we have already discussed, it's closing the
> socket due to the IO timeout being hit ... and it's hitting the IO
> timeout due to a deadlock due to memory pressure from rbd-nbd causing
> IO to be pushed from the XFS cache back down into rbd-nbd.

Okay.

>> I can try that, but I am skeptical, I am not sure that we are searching in the right place...
>>
>> Why?
>> - we run hundreds of heavy-use rbd-nbd instances in our xen dom0 systems for 1.5 years now
>> - we never experienced problems like that in xen dom0 systems
>> - as described, these instances run 12.2.5 ceph components with kernel 4.4.0+10
>> - the domU (virtual machines) that interact heavily with that dom0 are using various filesystems
>> -> probably the architecture of the blktap components leads to a different IO scenario: https://wiki.xenproject.org/wiki/Blktap
> Are you running an XFS (or any) file system on top of the NBD block
> device in dom0? I suspect you are just passing raw block devices to
> the VMs and therefore they cannot see the same IO back pressure
> feedback loop.

No, we do not directly use a filesystem in dom0 on that NBD device. Our scenario is:

The Xen dom0 maps the NBD devices and connects them via tapdisk to the blktap/blkback infrastructure
(https://wiki.xenproject.org/wiki/File:Blktap$blktap_diagram_differentSymbols.png, you can ignore the
upper right quadrant of the diagram - tapdisk just maps the NBD device).

The blktap/blkback infrastructure in the Xen dom0 uses the device channel (a shared memory ring) to
communicate with the VM (domU) through the blkfront infrastructure, and vice versa. The device is
exposed as a /dev/xvd<X> device. These devices are used by our virtualized systems as raw devices
for disks (using partitions) or for LVM.

I do not know the Xen internals, but I suppose that this usage scenario leads to homogeneous IO
request sizes, because it seems to be difficult to implement a ring list using shared memory...
Probably a situation which dramatically reduces the probability of rbd-nbd crashes.

>> Nevertheless I will try EXT4 on another system...

I converted the filesystem to an ext4 filesystem. I completely deleted the entire RBD EC image and
its snapshots (3) and recreated it. After mapping and mounting I executed the following command
(a rough sketch of the whole sequence follows below):

sysctl vm.dirty_background_ratio=0

Let's see what we get now...

Regards
Marc
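
P.S.: For reference, here is a rough sketch of the sequence described above. The pool name, image
name, size and mount point are only placeholders, not the actual values of our setup, and the exact
options may differ:

    # remove the old EC-backed image together with its snapshots
    rbd snap purge rbd/nbd-test
    rbd rm rbd/nbd-test

    # recreate the image, placing its data on an erasure-coded pool
    rbd create --size 2T --data-pool ec-pool rbd/nbd-test

    # map the image; rbd-nbd prints the nbd device it attached to
    DEV=$(rbd-nbd map rbd/nbd-test)

    # create an ext4 filesystem this time (instead of XFS) and mount it
    mkfs.ext4 "$DEV"
    mount "$DEV" /mnt/nbd-test

    # start background writeback as early as possible instead of waiting
    # for the default threshold of 10% dirty memory
    sysctl vm.dirty_background_ratio=0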