Re: Misleading error (osd has already bound to class) when starting osd on nautilus?

Forwarding here in case anyone is seeing the same or a similar issue. Amit gave
really good pointers and a workaround :)
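
For anyone who wants to check their own disks, here is a minimal sketch of the
check discussed in the quoted thread. It reads the kernel's rotational flag and
prints the device class that flag implies (1 for hdd, 0 for ssd, as Amit
describes). The `SYSFS_BLOCK` variable and the function names are my own
additions for illustration, not anything Ceph provides.

```shell
#!/bin/sh
# Sketch: show which device class the kernel's rotational flag implies.
# SYSFS_BLOCK is a hypothetical override so the functions can be pointed
# at a test directory instead of the real /sys/block.
SYSFS_BLOCK="${SYSFS_BLOCK:-/sys/block}"

# Map the rotational flag to the class names seen in the error message.
class_for_flag() {
    case "$1" in
        0) echo ssd ;;
        1) echo hdd ;;
        *) echo unknown ;;
    esac
}

# Print "<device> <flag> <class>" for one device.
report_device() {
    dev="$1"
    flag_file="$SYSFS_BLOCK/$dev/queue/rotational"
    if [ -r "$flag_file" ]; then
        flag=$(cat "$flag_file")
        echo "$dev $flag $(class_for_flag "$flag")"
    else
        echo "$dev - unknown"
    fi
}

# Report every block device that exposes the flag.
for path in "$SYSFS_BLOCK"/*/queue/rotational; do
    [ -e "$path" ] || continue
    dev=${path#"$SYSFS_BLOCK"/}
    dev=${dev%%/*}
    report_device "$dev"
done
```

If a device shows `hdd` here while `ceph osd crush get-device-class` reports
`ssd` for the osd on it, that mismatch is the likely source of the message.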


Thanks Amit!
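
One caveat on the `echo 0` workaround quoted below: values written to sysfs do
not survive a reboot. A udev rule is one common way to reapply the flag, so
here is a rough sketch that only prints such a rule. The function name and the
rules file name are hypothetical, and matching on the kernel name (`sdd`) is
an assumption that may not hold if device names change between boots.

```shell
#!/bin/sh
# Sketch: emit a udev rule that forces the rotational flag to 0 for one device.
# udev_rule_for is a hypothetical helper, not part of Ceph or udev.
udev_rule_for() {
    printf 'ACTION=="add|change", KERNEL=="%s", ATTR{queue/rotational}="0"\n' "$1"
}

# Example: print the rule for sdd (the device from this thread).
udev_rule_for sdd

# To install it (as root), something like the following could be used;
# the rules file name is an arbitrary choice:
#   udev_rule_for sdd > /etc/udev/rules.d/99-ceph-rotational.rules
#   udevadm control --reload-rules && udevadm trigger --subsystem-match=block
```

I have not tested this on the affected host, so treat it as a starting point.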


On 11/25 16:08, Amit Ghadge wrote:
> Yes, and if you want to avoid this in the future, set this flag to 0 by running:
>   echo 0 > /sys/block/sdx/queue/rotational
> 
> Thanks
> 
> On Wed, Nov 25, 2020 at 4:03 PM David Caro <dcaro@xxxxxxxxxxxxx> wrote:
> 
> >
> > Yep, you are right:
> >
> > ```
> > # cat /sys/block/sdd/queue/rotational
> > 1
> > ```
> >
> > I was looking at the code too, but you got there before me :)
> >
> > https://github.com/ceph/ceph/blob/25ac1528419371686740412616145703810a561f/src/common/blkdev.cc#L222
> >
> >
> > It might be an issue with the driver reporting the wrong data, then. I'll
> > look into it.
> >
> > Do you mind if I reply on the list with this info? (Or you can reply, if
> > you prefer.)
> > I think this might help others too (and myself in the future xd)
> >
> > Thanks Amit!
> >
> > On 11/25 15:50, Amit Ghadge wrote:
> > > This might happen when the disk defaults to 1
> > > in /sys/block/sdx/queue/rotational (1 for HDD, 0 for SSD). We have not
> > > seen any problems from it so far.
> > >
> > > -AmitG
> > >
> > > On Wed, Nov 25, 2020 at 3:08 PM David Caro <dcaro@xxxxxxxxxxxxx> wrote:
> > >
> > > >
> > > > Hi!
> > > >
> > > > I have a nautilus ceph cluster, and today I restarted one of the osd
> > > > daemons and spent some time trying to debug an error I was seeing in
> > > > the log, though it seems the osd is actually working.
> > > >
> > > >
> > > > The error I was seeing is:
> > > > ```
> > > > Nov 25 09:07:43 osd15 systemd[1]: Starting Ceph object storage daemon osd.44...
> > > > Nov 25 09:07:43 osd15 systemd[1]: Started Ceph object storage daemon osd.44.
> > > > Nov 25 09:07:47 osd15 ceph-osd[12230]: 2020-11-25 09:07:47.846 7f55395fbc80 -1 osd.44 106947 log_to_monitors {default=true}
> > > > Nov 25 09:07:47 osd15 ceph-osd[12230]: 2020-11-25 09:07:47.850 7f55395fbc80 -1 osd.44 106947 mon_cmd_maybe_osd_create fail: 'osd.44 has already bound to class 'ssd', can not reset class to 'hdd'; use 'ceph osd crush rm-device-class <id>' to remove old class first': (16) Device or resource busy
> > > > ```
> > > >
> > > > There are no other messages in the journal, so at first I thought that
> > > > the osd had failed to start.
> > > > But it seems to be up and working correctly anyhow.
> > > >
> > > > There's no "hdd" class in my crush map:
> > > > ```
> > > > # ceph osd crush class ls
> > > > [
> > > >     "ssd"
> > > > ]
> > > > ```
> > > >
> > > > And that osd is actually of the correct class:
> > > > ```
> > > > # ceph osd crush get-device-class osd.44
> > > > ssd
> > > > ```
> > > >
> > > > ```
> > > > # uname -a
> > > > Linux osd15 4.19.0-9-amd64 #1 SMP Debian 4.19.118-2+deb10u1 (2020-06-07) x86_64 GNU/Linux
> > > >
> > > > # ceph --version
> > > > ceph version 14.2.5-1-g23e76c7aa6 (23e76c7aa6e15817ffb6741aafbc95ca99f24cbb) nautilus (stable)
> > > > ```
> > > >
> > > > The osd shows up in the cluster and it's receiving load, so there
> > > > seems to be no problem, but does anyone know what that error is about?
> > > >
> > > >
> > > > Thanks!
> > > >
> > > >
> > > > --
> > > > David Caro
> > > > SRE - Cloud Services
> > > > Wikimedia Foundation <https://wikimediafoundation.org/>
> > > > PGP Signature: 7180 83A2 AC8B 314F B4CE  1171 4071 C7E1 D262 69C3
> > > >
> > > > "Imagine a world in which every single human being can freely share in
> > > > the sum of all knowledge. That's our commitment."
> > > > _______________________________________________
> > > > ceph-users mailing list -- ceph-users@xxxxxxx
> > > > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> > > >
> >

-- 
David Caro
SRE - Cloud Services
Wikimedia Foundation <https://wikimediafoundation.org/>
PGP Signature: 7180 83A2 AC8B 314F B4CE  1171 4071 C7E1 D262 69C3

"Imagine a world in which every single human being can freely share in the
sum of all knowledge. That's our commitment."
