Re: "cephadm version" in reef returns "AttributeError: 'CephadmContext' object has no attribute 'fsid'"

Hi,

first of all, I'd still recommend using the orchestrator to deploy OSDs. Building OSDs manually and then adopting them is redundant. Or are you having issues with the drivegroups? I don't have *the* solution, but you could try disabling the mclock scheduler [1], which has been the default since Quincy. Maybe that will speed things up? There have been reports on this list about unwanted, or at least unexpected, behavior. As for the "not (deep-)scrubbed in time" messages, there does seem to be progress (in your ceph status), but depending on drive utilization you could increase the number of concurrent scrubs per OSD (osd_max_scrubs).
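
For example, something along these lines (a rough sketch, not a recommendation for your specific setup; the scheduler switch only takes effect after an OSD restart):

ceph config set osd osd_op_queue wpq     # switch the OSDs back from mclock to the wpq scheduler
ceph config set osd osd_max_scrubs 2     # allow more concurrent scrubs per OSD; raise gradually and watch client I/O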

Regards,
Eugen

[1] https://www.clyso.com/blog/ceph-how-do-disable-mclock-scheduler/

Quoting Martin Conway <martin.conway@xxxxxxxxxx>:

Sorry for the delayed response. Office365 keeps quarantining 75% of the emails from this list.

I had a couple of other issues, one of which seems to have come good by itself.

cephadm adopt --style legacy --name osd.12

This would fail, but when I tried it again yesterday, without having made any other changes I can think of, it worked fine. Sorry, I don't have a record of what the error message was.

I have been trying to add an SSD DB/WAL to all of my BlueStore OSDs on spinning disks. The commands I found to simply move the DB didn't work for me, so I have been removing each OSD, recreating it with ceph-volume, and then adopting it again. This process has been going on for weeks, and I now have a persistent issue with scrubs not completing in time. I am unsure whether this will sort itself out if I just leave it alone for a week or two, or whether there is an underlying issue with Reef and scrubbing being slow. It definitely doesn't fix itself if left alone for a day or two.
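
(For context, the usually documented procedure for attaching a separate DB device is, as far as I understand it, roughly the following; the OSD id, fsid and VG/LV names are placeholders, and the OSD has to be stopped first:

systemctl stop ceph-osd@12                                                                     # unit name differs for cephadm-managed OSDs
ceph-volume lvm new-db --osd-id 12 --osd-fsid <osd-fsid> --target db_vg/db_lv12                # attach a new DB LV to the OSD
ceph-volume lvm migrate --osd-id 12 --osd-fsid <osd-fsid> --from data --target db_vg/db_lv12   # move existing DB data off the main device

In the end I went the remove-and-recreate route instead.)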

I find that backfilling, and possibly scrubbing, often comes to a halt for no apparent reason. If I put a server into maintenance mode, or kill and restart OSDs, it bursts back into life again.

I'm not sure how to diagnose why the recovery processes have stalled.
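
Would something like the following be the right place to start? (Generic commands, nothing cluster-specific:

ceph health detail                       # which PGs are behind on scrubbing or stuck in backfill
ceph pg dump_stuck                       # PGs stuck in unclean/inactive/undersized states
ceph osd perf                            # per-OSD commit/apply latency, to spot a slow disk
ceph config show osd.0 osd_op_queue      # confirm which scheduler the OSDs are actually running
)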

Regards,
Martin

-----Original Message-----
From: John Mulligan <phlogistonjohn@xxxxxxxxxxxxx>
Sent: Saturday, October 28, 2023 12:58 AM
To: ceph-users@xxxxxxx
Subject: Re: "cephadm version" in reef returns "AttributeError: 'CephadmContext' object has no attribute 'fsid'"


On Friday, October 27, 2023 2:40:17 AM EDT Eugen Block wrote:
> Are the issues you refer to the same as before? I don't think this
> version issue is the root cause, I do see it as well in my test
> cluster(s) but the rest works properly except for the tag issue I
> already reported which you can easily fix by setting the config value
> for the default image
> (https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/LASBJCSPFGDYAWPVE2YLV2ZLF3HC5SLS/#LASBJCSPFGDYAWPVE2YLV2ZLF3HC5SLS).
> Or are there new issues you encountered?


I concur. That `cephadm version` failure is/was a known issue, but it should not
be the cause of any other issues. On the main branch `cephadm version` no
longer fails this way; instead, it reports the version of the cephadm build and no
longer inspects a container image. We can look into backporting this before
the next reef release.

The issue related to the container image tag that Eugen filed has also been
fixed on reef. Thanks for filing that.

Martin, you may want to retry things after the next reef release.
Unfortunately, I don't know when that is planned, but I think it's soonish.

>
> Zitat von Martin Conway <martin.conway@xxxxxxxxxx>:
> > I just had another look through the issue tracker and found this
> > bug already listed:
> > https://tracker.ceph.com/issues/59428
> >
> > I need to go back to the other issues I am having and figure out if
> > they are related or something different.
> >
> >
> > Hi
> >
> > I wrote before about issues I was having with cephadm in 18.2.0.
> > Sorry, I didn't see the helpful replies because my mail service
> > binned the responses.
> >
> > I still can't get the reef version of cephadm to work properly.
> >
> > I had updated the system rpm to reef (ceph repo) and also upgraded
> > the containerised ceph daemons to reef before my first email.
> >
> > Both the system package cephadm and the one found at
> > /var/lib/ceph/${fsid}/cephadm.* return the same error when running
> > "cephadm version"
> >
> > Traceback (most recent call last):
> >   File "./cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e", line 9468, in <module>
> >     main()
> >   File "./cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e", line 9456, in main
> >     r = ctx.func(ctx)
> >   File "./cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e", line 2108, in _infer_image
> >     ctx.image = infer_local_ceph_image(ctx, ctx.container_engine.path)
> >   File "./cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e", line 2191, in infer_local_ceph_image
> >     container_info = get_container_info(ctx, daemon, daemon_name is not None)
> >   File "./cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e", line 2154, in get_container_info
> >     matching_daemons = [d for d in daemons if daemon_name_or_type(d) == daemon_filter and d['fsid'] == ctx.fsid]
> >   File "./cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e", line 2154, in <listcomp>
> >     matching_daemons = [d for d in daemons if daemon_name_or_type(d) == daemon_filter and d['fsid'] == ctx.fsid]
> >   File "./cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e", line 217, in __getattr__
> >     return super().__getattribute__(name)
> > AttributeError: 'CephadmContext' object has no attribute 'fsid'

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


