Re: [ceph-users] Re: rbd-nbd crashes Error: failed to read nbd request header: (33) Numerical argument out of domain

Hi Ilya,

Recently, we found these patches (v2):
http://archive.lwn.net:8080/linux-kernel/YRHa%2FkeJ4pHP3hnL@T590/T/.
Maybe related?

v3: https://lore.kernel.org/linux-block/20210824141227.808340-2-yukuai3@xxxxxxxxxx/
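Regarding the kernel 5.4 cutoff for "--io-timeout 0" mentioned in the quoted reply below, here is a minimal sketch of how one might check whether a host meets it. The helper names (kernel_version_tuple, can_disable_io_timeout) are made up for illustration, and the (5, 4) threshold is taken only from that reply, not verified against the kernel source.

```python
import platform
import re


def kernel_version_tuple(release):
    """Extract (major, minor) from a release string such as '4.19.118-2+deb10u1'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def can_disable_io_timeout(release, minimum=(5, 4)):
    """True if the kernel is at least `minimum` (assumed cutoff from the thread)."""
    return kernel_version_tuple(release) >= minimum


if __name__ == "__main__":
    running = platform.release()
    print(running, can_disable_io_timeout(running))
```

With the kernel reported in this thread, can_disable_io_timeout("4.19.118-2+deb10u1") evaluates to False, so on that host the only option would be raising the timeout rather than disabling it.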

On Mon, Aug 30, 2021 at 6:34 PM Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
>
> On Tue, Aug 24, 2021 at 11:43 AM Yanhu Cao <gmayyyha@xxxxxxxxx> wrote:
> >
> > Any progress on this? We have encountered the same problem while using
> > the rbd-nbd option timeout=120.
> > ceph version: 14.2.13
> > kernel version: 4.19.118-2+deb10u1
>
> Hi Yanhu,
>
> No, we still don't know what is causing this.
>
> If rbd-nbd is being too slow, perhaps disabling the timeout would help?
> Starting with kernel 5.4, "--io-timeout 0" should do it.
>
> In general, the nbd driver is pretty unstable in older kernels.
> Timeout handling is just one example so I would advise upgrading
> to a recent kernel, e.g. 5.10 LTS.
>
> Thanks,
>
>                 Ilya
>
> >
> > On Wed, May 19, 2021 at 10:55 PM Mykola Golub <to.my.trociny@xxxxxxxxx> wrote:
> > >
> > > On Wed, May 19, 2021 at 11:32:04AM +0800, Zhi Zhang wrote:
> > > > On Wed, May 19, 2021 at 11:19 AM Zhi Zhang <zhang.david2011@xxxxxxxxx>
> > > > wrote:
> > > >
> > > > >
> > > > > On Tue, May 18, 2021 at 10:58 PM Mykola Golub <to.my.trociny@xxxxxxxxx>
> > > > > wrote:
> > > > > >
> > > > > > Could you please provide the full rbd-nbd log? If it is too large for
> > > > > > > the attachment then maybe via some public URL?
> > > > >
> > > > >  ceph.rbd-client.log.bz2
> > > > > <https://drive.google.com/file/d/1TuiGOrVAgKIJ3BUmiokG0cU12fnlQ3GR/view?usp=drive_web>
> > > > >
> > > > > > I uploaded it to Google Drive. Please check it out.
> > > >
> > > > > We found that the reader_entry thread got zero bytes when trying to
> > > > > read the nbd request header, after which rbd-nbd exited and closed the
> > > > > socket. But we haven't figured out why the read returned zero bytes.
> > >
> > > Ok. I was hoping to find some hint in the log, why the read from the
> > > kernel could return without data, but I don't see it.
> > >
> > > > From experience it could happen when rbd-nbd got stuck or was too
> > > > slow so the kernel failed after the timeout, but it looked different in
> > > the logs AFAIR. Anyway you can try increasing the timeout using
> > > rbd-nbd --timeout (--io-timeout in newer versions) option. The default
> > > is 30 sec.
> > >
> > > > If it does not help, you will probably find a clue by increasing the
> > > > kernel debug level for nbd (it seems to be possible).
> > >
> > > --
> > > Mykola Golub
> > > _______________________________________________
> > > Dev mailing list -- dev@xxxxxxx
> > > To unsubscribe send an email to dev-leave@xxxxxxx
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx
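
On the zero-byte read reported earlier in the thread: a plausible explanation for the "(33) Numerical argument out of domain" text is that the daemon's read-exactly helper reports a short read, including end of file, as -EDOM. The sketch below imitates that behaviour in Python over a socketpair. It is an illustration under that assumption, not rbd-nbd's code: read_exact and demo are hypothetical names, and the 28-byte header layout follows the published NBD protocol (magic, flags, type, handle, offset, length).

```python
import errno
import socket
import struct

NBD_REQUEST_MAGIC = 0x25609513
# Big-endian: magic (u32), flags (u16), type (u16), handle (8 bytes),
# offset (u64), length (u32) = 28 bytes in total.
NBD_REQUEST = struct.Struct(">IHH8sQI")


def read_exact(sock, count):
    """Read exactly `count` bytes.

    Returns (0, data) on success. Returns (-errno, partial) on a socket
    error, and (-EDOM, partial) if the peer closes before `count` bytes
    arrive (assumed to mirror the daemon's short-read handling).
    """
    buf = bytearray()
    while len(buf) < count:
        try:
            chunk = sock.recv(count - len(buf))
        except OSError as exc:
            return -exc.errno, bytes(buf)
        if not chunk:  # zero-byte read: the other end closed the socket
            return -errno.EDOM, bytes(buf)
        buf += chunk
    return 0, bytes(buf)


def demo():
    kernel_side, daemon_side = socket.socketpair()
    header = NBD_REQUEST.pack(NBD_REQUEST_MAGIC, 0, 0, b"\x00" * 8, 0, 4096)
    kernel_side.sendall(header)
    ok, data = read_exact(daemon_side, NBD_REQUEST.size)
    kernel_side.close()  # simulate the kernel giving up on the device
    eof, _ = read_exact(daemon_side, NBD_REQUEST.size)
    daemon_side.close()
    return ok, NBD_REQUEST.unpack(data), eof


if __name__ == "__main__":
    print(demo())
```

If this is what happens, the error code itself says nothing about why the kernel closed its end of the socket; it only indicates that the header read came back short, which would be consistent with the timeout theory discussed above.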


