Thanks, I'll have to see if I can come up with a suitable issue on
documentation. My biggest issue isn't a specific item (well, except for
Octopus telling me to use the not-included ceph-deploy command in lots
of places). It's more a case of needing attention paid to anachronisms
in general, and more attention could be paid to the distinction between
container-based and OS-native Ceph components. So in short, not single
issues, but more of a need for attention to the overall details to
ensure that features described for a specific release actually apply TO
that release. Grunt work, but it can save a lot on service calls.

I migrated to Ceph from Gluster because Gluster is apparently going
unsupported at the end of this year. I moved to Gluster from DRBD
because I wanted triple redundancy on the data. While Ceph is really
kind of overkill for my small R&D farm, it has proven to be about the
most solid distributed network filesystem I've worked with: no split
brains, no outright corruption, no data outages. Despite all the
atrocities I committed in setting it up, it has never failed at its
primary duty of delivering data service.

I started off with Octopus, and that has been the root of a lot of my
problems. Octopus introduced cephadm as a primary management tool, I
believe, but the documentation still referenced ceph-deploy. And
cephadm suffered from a bug that meant that if even one service was
down, scheduled work would not be done, so to repair anything I needed
an already-repaired system. Migrating to Pacific cleared that up, so a
lot of what I'm doing now is getting the lint out. The cluster now
stays consistently healthy, between a proper monitor configuration and
the removal of direct Ceph mounts on the desktops.

I very much appreciate all the help and insights you've provided. It's
nice to have laid my problems to rest.

Tim

On Thu, 2024-02-08 at 14:41 +0000, Eugen Block wrote:
> Hi,
>
> you're always welcome to report a documentation issue on
> tracker.ceph.com; you don't need to clean the docs up by yourself. :-)
> There is a major restructuring in progress, but they will probably
> never be perfect anyway.
>
> > There are definitely some warts in there, as the monitor count was 1
> > but there were 2 monitors listed running.
>
> I don't know your mon history, but I assume that you've had more than
> one mon (before converting to cephadm?). Then you might have updated
> the mon specs via the command line, with a spec containing "count:1".
> But the mgr refuses to remove the second mon because it would break
> quorum; that's why you had 2/1 running, and this is reproducible in my
> test cluster. Adding more mons also failed because of the count:1
> spec. You could have just overwritten it on the CLI as well, without a
> YAML spec file (omit the count spec):
>
> ceph orch apply mon --placement="host1,host2,host3"
>
> Regards,
> Eugen
>
> Zitat von Tim Holloway <timh@xxxxxxxxxxxxx>:
>
> > Ah, yes. Much better.
> >
> > There are definitely some warts in there, as the monitor count was 1
> > but there were 2 monitors listed running.
> >
> > I've mostly avoided docs that reference ceph config files and YAML
> > configs because the online docs are (as I've whined before) not
> > always trustworthy and often contain anachronisms. Were I
> > sufficiently knowledgeable, I'd offer to clean them up, but if that
> > were the case, I wouldn't have to come crying here.
> >
> > All happy now, though.
> >
> > Tim
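For readers hitting the same leftover "count:1" spec, the spec-file
variant of the fix Eugen describes above might look roughly like the
sketch below; the host names are simply the ones from the "ceph orch
ls" output further down in this thread, so adjust them to your cluster.

# Write a minimal mon spec that names the hosts explicitly and omits
# the count setting, then hand it to the orchestrator.
cat > mon.yaml <<'EOF'
service_type: mon
placement:
  hosts:
    - www6.mousetech.com
    - www2.mousetech.com
    - www7.mousetech.com
EOF
# Applying the spec replaces the existing mon placement, including the
# stray count:1.
ceph orch apply -i mon.yaml

Afterwards, "ceph orch ls mon" should report 3/3 running with the three
hosts listed in the PLACEMENT column.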
> >
> > On Tue, 2024-02-06 at 19:22 +0000, Eugen Block wrote:
> > > Yeah, you have the „count:1“ in there, that's why your manually
> > > added daemons are rejected. Try my suggestion with a mon.yaml.
> > >
> > > Zitat von Tim Holloway <timh@xxxxxxxxxxxxx>:
> > >
> > > > ceph orch ls
> > > > NAME                               PORTS        RUNNING  REFRESHED  AGE  PLACEMENT
> > > > alertmanager                       ?:9093,9094      1/1  3m ago     8M   count:1
> > > > crash                                               5/5  3m ago     8M   *
> > > > grafana                            ?:3000           1/1  3m ago     8M   count:1
> > > > mds.ceefs                                           2/2  3m ago     4M   count:2
> > > > mds.fs_name                                         3/3  3m ago     8M   count:3
> > > > mgr                                                 3/3  3m ago     4M   www6.mousetech.com;www2.mousetech.com;www7.mousetech.com
> > > > mon                                                 2/1  3m ago     4M   www6.mousetech.com;www2.mousetech.com;www7.mousetech.com;count:1
> > > > nfs.foo                            ?:2049           1/1  3m ago     4M   www7.mousetech.com
> > > > node-exporter                      ?:9100           5/5  3m ago     8M   *
> > > > osd                                                   6  3m ago     -    <unmanaged>
> > > > osd.dashboard-admin-1686941775231                     0  -          7M   *
> > > > prometheus                         ?:9095           1/1  3m ago     8M   count:1
> > > > rgw.mousetech                      ?:80             2/2  3m ago     3M   www7.mousetech.com;www2.mousetech.com
> > > >
> > > > Note that the dell02 monitor doesn't show here, although "ceph
> > > > orch daemon add" initially returns success. And actually the www6
> > > > monitor is not running, nor does it appear on the dashboard or in
> > > > "ceph orch ps". The www6 machine is still somewhat messed up
> > > > because it was the initial launch machine for Octopus.
> > > >
> > > > On Tue, 2024-02-06 at 17:22 +0000, Eugen Block wrote:
> > > > > So the orchestrator is working and you have a working Ceph
> > > > > cluster? Can you share the output of:
> > > > >
> > > > > ceph orch ls mon
> > > > >
> > > > > If the orchestrator expects only one mon and you deploy another
> > > > > manually via daemon add, it can be removed. Try using a
> > > > > mon.yaml file instead, which contains the designated mon hosts,
> > > > > and then run
> > > > >
> > > > > ceph orch apply -i mon.yaml
> > > > >
> > > > > Zitat von Tim Holloway <timh@xxxxxxxxxxxxx>:
> > > > >
> > > > > > I just jacked in a completely new, clean server and I've been
> > > > > > trying to get a Ceph (Pacific) monitor running on it.
> > > > > >
> > > > > > The "ceph orch daemon add" appears to install all/most of
> > > > > > what's necessary, but when the monitor starts, it shuts down
> > > > > > immediately, and in the manner of Ceph containers it
> > > > > > immediately erases itself and the container log, so it's not
> > > > > > possible to see what its problem is.
> > > > > >
> > > > > > I looked at manual installation, but the docs appear to be
> > > > > > oriented towards the old-style non-container implementation
> > > > > > and don't account for the newer /var/lib/ceph/*fsid*/
> > > > > > approach.
> > > > > >
> > > > > > Any tips?
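A side note on the disappearing container logs mentioned above: with
cephadm the daemon's output still ends up in the host's journal even
after the container has been cleaned up, so something along these lines
can usually recover it. This is only a sketch; the daemon name and
cluster fsid are taken from the journal excerpt that follows.

# List the cephadm-managed daemons on this host, including ones that
# are no longer running.
cephadm ls

# Pull the mon's log from the journal; cephadm wraps journalctl for the
# matching systemd unit.
cephadm logs --name mon.dell02

# Roughly equivalent journalctl invocation, using the cluster fsid from
# the excerpt below.
journalctl -u ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f@mon.dell02.service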
> > > > > >
> > > > > > Last few lines in the system journal are like this:
> > > > > >
> > > > > > Feb 06 11:09:58 dell02.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-dell02[1357545]: debug 2024-02-06T16:09:58.938+0000 7f26810ae700 4 rocksdb: (Original Log Time 2024/02/06-16:09:58.938432) [compaction/compaction_job.cc:760] [default] compacted to: base level 6 level multiplier 10.00 max bytes base 268435456 files[0 0 0 0 0 0 2] max score 0.00, MB/sec: 351.7 rd, 351.7 wr, level 6, files in(4, 0) out(2) MB in(92.8, 0.0) out(92.8), read-write-amplify(2.0) write-amplify(1.0) OK, records in: 2858, records dropped: 0 output_compression: NoCompression
> > > > > > Feb 06 11:09:58 dell02.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-dell02[1357545]:
> > > > > > Feb 06 11:09:58 dell02.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-dell02[1357545]: debug 2024-02-06T16:09:58.938+0000 7f26810ae700 4 rocksdb: (Original Log Time 2024/02/06-16:09:58.938452) EVENT_LOG_v1 {"time_micros": 1707235798938446, "job": 6, "event": "compaction_finished", "compaction_time_micros": 276718, "compaction_time_cpu_micros": 73663, "output_level": 6, "num_output_files": 2, "total_output_size": 97309398, "num_input_records": 2858, "num_output_records": 2858, "num_subcompactions": 1, "output_compression": "NoCompression", "num_single_delete_mismatches": 0, "num_single_delete_fallthrough": 0, "lsm_state": [0, 0, 0, 0, 0, 0, 2]}
> > > > > > Feb 06 11:09:58 dell02.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-dell02[1357545]: debug 2024-02-06T16:09:58.940+0000 7f26810ae700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1707235798941291, "job": 6, "event": "table_file_deletion", "file_number": 14}
> > > > > > Feb 06 11:09:58 dell02.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-dell02[1357545]: debug 2024-02-06T16:09:58.943+0000 7f26810ae700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1707235798943980, "job": 6, "event": "table_file_deletion", "file_number": 12}
> > > > > > Feb 06 11:09:58 dell02.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-dell02[1357545]: debug 2024-02-06T16:09:58.946+0000 7f26810ae700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1707235798946734, "job": 6, "event": "table_file_deletion", "file_number": 10}
> > > > > > Feb 06 11:09:58 dell02.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-dell02[1357545]: debug 2024-02-06T16:09:58.946+0000 7f26810ae700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1707235798946789, "job": 6, "event": "table_file_deletion", "file_number": 4}
> > > > > > Feb 06 11:09:59 dell02.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-dell02[1357545]: debug 2024-02-06T16:09:59.450+0000 7f26818af700 -1 received signal: Terminated from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
> > > > > > Feb 06 11:09:59 dell02.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-dell02[1357545]: debug 2024-02-06T16:09:59.450+0000 7f26818af700 -1 mon.dell02@-1(synchronizing) e161 *** Got Signal Terminated ***
> > > > > > Feb 06 11:09:59 dell02.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-dell02[1357545]: debug 2024-02-06T16:09:59.450+0000 7f26818af700 1 mon.dell02@-1(synchronizing) e161 shutdown
> > > > > > Feb 06 11:09:59 dell02.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-dell02[1357545]: debug 2024-02-06T16:09:59.452+0000 7f2691a95880 4 rocksdb: [db_impl/db_impl.cc:397] Shutdown: canceling all background work
> > > > > > Feb 06 11:09:59 dell02.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-dell02[1357545]: debug 2024-02-06T16:09:59.452+0000 7f2691a95880 4 rocksdb: [db_impl/db_impl.cc:573] Shutdown complete
> > > > > > Feb 06 11:09:59 dell02.mousetech.com bash[1357898]: ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-mon-dell02
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
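A closing observation on the journal excerpt: the mon is killed with
SIGTERM while it is still synchronizing, which matches Eugen's
explanation that the orchestrator tears down daemons exceeding the
count:1 spec, rather than the mon crashing on its own. Two commands
that can help confirm this from the cephadm side (a sketch; exact
output varies by release):

# Show recent cephadm/orchestrator events (scheduling decisions,
# daemon removals, and so on).
ceph log last cephadm

# Export the mon service spec the scheduler is actually enforcing;
# a leftover count:1 will show up here.
ceph orch ls mon --export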