Re: rbd-nbd crashes Error: failed to read nbd request header: (33) Numerical argument out of domain

On Tue, Aug 24, 2021 at 11:43 AM Yanhu Cao <gmayyyha@xxxxxxxxx> wrote:
>
> Any progress on this? We have encountered the same problem while using
> the rbd-nbd option timeout=120.
> ceph version: 14.2.13
> kernel version: 4.19.118-2+deb10u1

Hi Yanhu,

No, we still don't know what is causing this.

If rbd-nbd is being too slow, perhaps disabling the timeout would help?
Starting with kernel 5.4, "--io-timeout 0" should do it.
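
A minimal sketch of that choice, assuming the kernel release string
starts with "major.minor" and that the installed rbd-nbd accepts
--io-timeout (older versions use --timeout instead). The helper names,
the image spec "rbd/img" and the 120-second fallback are hypothetical
and only serve the illustration:

```python
import re
from typing import List, Tuple


def parse_kernel_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string like '4.19.118-2+deb10u1'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def build_map_command(
    image: str,
    kernel_release: str,
    fallback_timeout: int = 120,          # hypothetical value for older kernels
    timeout_flag: str = "--io-timeout",   # use "--timeout" with older rbd-nbd
) -> List[str]:
    """Build an 'rbd-nbd map' argument list with a kernel-dependent timeout.

    On kernel 5.4 or newer the timeout is set to 0 (disabled); on older
    kernels a finite fallback timeout is kept instead.
    """
    if parse_kernel_release(kernel_release) >= (5, 4):
        timeout = 0
    else:
        timeout = fallback_timeout
    return ["rbd-nbd", "map", timeout_flag, str(timeout), image]
```

Passing platform.release() as kernel_release would use the running
kernel; the resulting list can be handed to subprocess.run() as is.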

In general, the nbd driver is pretty unstable in older kernels.
Timeout handling is just one example, so I would advise upgrading
to a recent kernel, e.g. 5.10 LTS.

Thanks,

                Ilya

>
> On Wed, May 19, 2021 at 10:55 PM Mykola Golub <to.my.trociny@xxxxxxxxx> wrote:
> >
> > On Wed, May 19, 2021 at 11:32:04AM +0800, Zhi Zhang wrote:
> > > On Wed, May 19, 2021 at 11:19 AM Zhi Zhang <zhang.david2011@xxxxxxxxx>
> > > wrote:
> > >
> > > >
> > > > On Tue, May 18, 2021 at 10:58 PM Mykola Golub <to.my.trociny@xxxxxxxxx>
> > > > wrote:
> > > > >
> > > > > > Could you please provide the full rbd-nbd log? If it is too large for
> > > > > > an attachment, then maybe via some public URL?
> > > >
> > > >  ceph.rbd-client.log.bz2
> > > > <https://drive.google.com/file/d/1TuiGOrVAgKIJ3BUmiokG0cU12fnlQ3GR/view?usp=drive_web>
> > > >
> > > > I uploaded it to Google Drive. Please check it out.
> > >
> > > We found that the reader_entry thread got zero bytes when trying to read the
> > > nbd request header, then rbd-nbd exited and closed the socket. But we haven't
> > > figured out why the read returned zero bytes.
> >
> > Ok. I was hoping to find some hint in the log, why the read from the
> > kernel could return without data, but I don't see it.
> >
> > From experience it could happen when rbd-nbd got stuck or was too
> > slow so the kernel failed after the timeout, but it looked different in
> > the logs AFAIR. Anyway, you can try increasing the timeout using the
> > rbd-nbd --timeout (--io-timeout in newer versions) option. The default
> > is 30 sec.
> >
> > If that does not help, you will probably find a clue by increasing the
> > kernel debug level for nbd (it seems to be possible).
> >
> > --
> > Mykola Golub
> > _______________________________________________
> > Dev mailing list -- dev@xxxxxxx
> > To unsubscribe send an email to dev-leave@xxxxxxx
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx