Hi,
just to be safe, did you fail the mgr? If not, try 'ceph mgr fail' and
see if it still reports that information. It sounds like you didn't
clean up your virtual MON after you drained the OSDs. Or how exactly
did you drain that host? If you run 'ceph orch host drain {host}' the
orchestrator will remove all daemons, not only OSDs. I assume there
are still some entries in the ceph config-key database, which is where
the 'device ls-by-host' output comes from, but I haven't verified that.
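Something like this should be enough to rule out stale mgr state (just
a quick sketch, no arguments needed since failing the active mgr is
sufficient):

  ceph mgr fail        # fail the active mgr so its cached state is rebuilt
  ceph health detail   # check whether the stray host is still reported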
Is this key present?
ceph config-key get mgr/cephadm/host.cephfs-cluster-node-2 | jq
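You could also list everything the orchestrator has stored for that
host; the grep pattern is just an example, and I haven't tested
whether removing the key on its own is enough:

  ceph config-key ls | grep cephfs-cluster-node-2             # any leftover keys for the old host?
  ceph config-key rm mgr/cephadm/host.cephfs-cluster-node-2   # only if the host is really gone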
Is the VM still reachable? If it is, check the MON directory under
/var/lib/ceph/{FSID}/mon.X and remove it. You can also first check
with 'cephadm ls --no-detail' whether cephadm believes a MON daemon is
still present there. Remove the directory so cephadm forgets about it.
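Roughly like this on the VM ({FSID} and mon.X as above, adjust to what
you actually find):

  cephadm ls --no-detail              # does cephadm still list a mon daemon?
  ls /var/lib/ceph/{FSID}/            # look for a leftover mon.X directory
  rm -rf /var/lib/ceph/{FSID}/mon.X   # remove it

After that, another 'ceph mgr fail' should make the warning go away.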
Quoting Jakub Daniel <jakub.daniel@xxxxxxxxx>:
I have found these two threads
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/VHZ7IJ7PAL7L2INLSHNVYY7V7ZCXD46G/#TSWERUMAEEGZPSYXG6PSS4YMRXPP3L63
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/NG5QVRTVCLLYNLK56CSYLIPE4WBFXS5U/#HJDBAJFX27KATC4WV2MKGLVGLN2HTWWD
but I didn't exactly figure out what to do. I've since found remnants of
the removed cephfs-cluster-node-{0,1,2} in crush map buckets, which I
removed with no effect on health detail. I also found that the ceph
dashboard lists the non-existent cephfs-cluster-node-2 among the ceph
hosts (while orch host ls doesn't). On the other hand, device ls-by-host
cephfs-cluster-node-2 lists an entry whose device name coincides with a
drive on the live host X, with the daemons column listing mon.X.
Meanwhile, device ls-by-host X lists (among other things) the exact same
entry, with the difference that the dev column shows the actual device
nvme0n1, whereas for cephfs-cluster-node-2 that column was empty:
root@X:~# cephadm shell -- ceph device ls-by-host cephfs-cluster-node-2
DEVICE                                      DEV      DAEMONS  EXPECTED FAILURE
Samsung_SSD_970_PRO_1TB_S462NF0M310269L              mon.X
root@X:~# cephadm shell -- ceph device ls-by-host X
DEVICE                                      DEV      DAEMONS  EXPECTED FAILURE
Samsung_SSD_850_EVO_250GB_S2R6NX0J423123P   sdb      osd.3
Samsung_SSD_860_EVO_1TB_S3Z9NY0M431048H              mon.Y
Samsung_SSD_970_PRO_1TB_S462NF0M310269L     nvme0n1  mon.X
This output is confusing since it also lists mon.Y when asked for host X.
I will continue investigating. If anyone has any hints on what to try or
where to look, I would be very grateful.
Jakub
On Sun, 17 Nov 2024, 15:34 Tim Holloway, <timh@xxxxxxxxxxxxx> wrote:
I think I can count 5 sources that Ceph can query to
report/display/control its resources.
1. The /etc/ceph/ceph.conf file. Mostly supplanted by the Ceph
configuration database.
2. The Ceph configuration database. A nameless key/value store internal
to a Ceph cluster. It's distributed (no fixed location) and accessed via
Ceph commands and APIs.
3. Legacy Ceph resources. Stuff found under a host's /var/lib/ceph
directory.
4. Managed Ceph resources. Stuff found under a host's
/var/lib/ceph/{fsid} directory.
5. The live machine state of Ceph. Since this can vary not only from
host to host but also from service to service, I don't think it's
considered an authoritative source of information.
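For what it's worth, each of these can be inspected directly; the
commands below are only a sketch using the usual default paths, so
adjust them to your deployment:

  cat /etc/ceph/ceph.conf           # 1. the local config file
  ceph config dump                  # 2. the cluster configuration database
  ls /var/lib/ceph/                 # 3. legacy daemon directories (e.g. osd/ceph-6)
  ls /var/lib/ceph/<fsid>/          # 4. cephadm-managed daemon directories (e.g. osd.6)
  cephadm ls                        # 5. what cephadm thinks is deployed on this host
  systemctl list-units 'ceph*'      #    and what is actually running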
Compounding this is that current releases of Ceph can all too easily
end up in a "forbidden" state where you may have, for example, a legacy
OSD.6 and a managed OSD.6 on the same host. In such a case, the system
is generally operable but functionally corrupt, and it should ideally be
corrected to remove the redundant resource.
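A quick way to spot that kind of duplication is to compare the two
directory layouts on the host (the paths below are the usual defaults):

  ls -d /var/lib/ceph/osd/ceph-*    # legacy-style OSD directories
  ls -d /var/lib/ceph/*/osd.*       # cephadm-managed OSD directories (under the fsid)

Any OSD id that shows up in both listings is a candidate for cleanup.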
The real issue is that depending on what Ceph interface you're querying
(or "ceph health" is querying!), you don't always get your answer from
a single authoritative source, so you'll get conflicting results and
annoying error reports. The "stray daemon" condition is an especially
egregious example of this, and it's not only possible because of a
false detection from one of the above sources, but also, I think can
come from "dead" daemons being referenced in CRUSH.
You might want to run through this list's history for "phantom host"
postings made by me back around this past June because I was absolutely
plagued with them. Eugen Block helped me eventually purge them all.
Regards,
Tim
On Sat, 2024-11-16 at 21:42 +0100, Jakub Daniel wrote:
Hello,
I'm pretty new to ceph deployment. I have set up my first cephfs
cluster using cephadm. Initially, I deployed ceph in 3 VirtualBox
instances that I called cephfs-cluster-node-{0, 1, 2} just to test
things. Later, I added 5 more real hardware nodes, and eventually I
decided to remove the VirtualBox instances, so I drained the OSDs and
removed the hosts. Suddenly, ceph status detail started reporting
HEALTH_WARN 1 stray host(s) with 1 daemon(s) not managed by cephadm
[WRN] CEPHADM_STRAY_HOST: 1 stray host(s) with 1 daemon(s) not managed by cephadm
    stray host cephfs-cluster-node-2 has 1 stray daemons: ['mon.X']
The cephfs-cluster-node-2 host is no longer listed among the hosts, and
it has been offline (powered down) for tens of hours. The mon.X daemon
doesn't even belong to that node; it runs on one of the real hardware
nodes. I am unaware of mon.X ever running on cephfs-cluster-node-2 (I
never noticed it among the systemd units).
How does cephadm shell -- ceph status detail come to the conclusion
that there is something stray? How can I address this?
Thank you for any insights
Jakub
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx