Re: when is krbd on osd nodes starting to get problematic?

On Wed, Jun 23, 2021 at 3:36 PM Marc <Marc@xxxxxxxxxxxxxxxxx> wrote:
>
> From which kernel / Ceph version is krbd usage on an OSD node problematic?
>
> Currently I am running Nautilus 14.2.11 and an el7 3.10 kernel without any issues.
>
> I can remember using a cephfs mount without any issues as well, until some specific Luminous update surprised me. So it would be nice to know when to expect this.

It has always been problematic.  This is a rather fundamental issue and
it is not specific to Ceph.  I don't think there is a particular Ceph
release or kernel version to name, other than that it has become much
harder to hit with modern kernels.

I would be cautious about attributing random stalls or hangs, which may
be experienced for a wide variety of reasons, to this co-location issue,
even if moving the mount to another machine happened to help.  Usually
such reports lack the necessary evidence; the last one that I could
confirm to be a co-location-related lockup was at least a couple of
years ago.

Thanks,

                Ilya

>
>
>
> > -----Original Message-----
> > Sent: Wednesday, 23 June 2021 11:25
> > Subject: Re: Can not mount rbd device anymore
> >
> > On Wed, Jun 23, 2021 at 9:59 AM Matthias Ferdinand wrote:
> > >
> > > On Tue, Jun 22, 2021 at 02:36:00PM +0200, Ml Ml wrote:
> > > > Hello List,
> > > >
> > > > all of a sudden I cannot mount a specific rbd device anymore:
> > > >
> > > > root@proxmox-backup:~# rbd map backup-proxmox/cluster5 -k /etc/ceph/ceph.client.admin.keyring
> > > > /dev/rbd0
> > > >
> > > > root@proxmox-backup:~# mount /dev/rbd0 /mnt/backup-cluster5/
> > > >  (the mount command just hangs and never times out)
> > >
> > >
> > > Hi,
> > >
> > > there used to be some kernel locking issues when the kernel rbd
> > > client tried to access an OSD on the same machine. Not sure if
> > > these issues still exist (but I would guess so), and if you use
> > > your Proxmox cluster in a hyperconverged manner (nodes providing
> > > VMs and storage service at the same time), you may just have been
> > > lucky that it worked before.
> > >
> > > Instead of the kernel client mount, you can try exporting the volume
> > > as an NBD device (https://docs.ceph.com/en/latest/man/8/rbd-nbd/) and
> > > mounting that. rbd-nbd runs in userspace and should not have that
> > > locking problem.
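> > >
> > > For illustration, a minimal sketch of that approach, reusing the
> > > pool/image and mount point from the commands above (the nbd device
> > > number depends on what is free on that machine):
> > >
> > > rbd-nbd map backup-proxmox/cluster5
> > > /dev/nbd0
> > > mount /dev/nbd0 /mnt/backup-cluster5/
> > >
> > > and to detach it again:
> > >
> > > umount /mnt/backup-cluster5/
> > > rbd-nbd unmap /dev/nbd0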
> >
> > rbd-nbd is also susceptible to locking up in such setups, likely
> > more so than krbd.  Don't forget that it also has a kernel component,
> > and there are actually more opportunities for things to go sideways
> > or lock up, because there is an extra daemon involved, allocating
> > some additional memory for each I/O request.
> >
> > Thanks,
> >
> >                 Ilya
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


