Re: Upgraded 16.2.14 to 16.2.15

Hi Eugen,

It is correct that I added the configuration manually, but not to
unit.run; rather, I added it to each mon's config (i.e.
/var/lib/ceph/FSID/mon.*/config). I also added it to the cluster
configuration with "ceph config set mon mon_rocksdb_options", but that
option doesn't seem to have any effect at all.
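
For reference, the cluster-wide part was roughly the following (the
option string is the same one Eugen quotes below; treat this as a
sketch rather than the exact commands):

ceph config set mon mon_rocksdb_options 'write_buffer_size=33554432,compression=kLZ4Compression,level_compaction_dynamic_level_bytes=true,bottommost_compression=kLZ4HCCompression,max_background_jobs=4,max_subcompactions=2'
ceph config get mon mon_rocksdb_options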

/Z

On Tue, 5 Mar 2024 at 09:58, Eugen Block <eblock@xxxxxx> wrote:

> Hi,
>
> > 1. RocksDB options, which I provided to each mon via their configuration
> > files, got overwritten during mon redeployment and I had to re-add
> > mon_rocksdb_options back.
>
> IIRC, you didn't use the extra_entrypoint_args for that option but
> added it directly to the container unit.run file. So it's expected
> that it's removed after an update. If you want it to persist across a
> container update, you should consider using extra_entrypoint_args:
>
> cat mon.yaml
> service_type: mon
> service_name: mon
> placement:
>    hosts:
>    - host1
>    - host2
>    - host3
> extra_entrypoint_args:
>    - '--mon-rocksdb-options=write_buffer_size=33554432,compression=kLZ4Compression,level_compaction_dynamic_level_bytes=true,bottommost_compression=kLZ4HCCompression,max_background_jobs=4,max_subcompactions=2'
>
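> A spec like this would then be applied via the orchestrator, e.g.
> something along the lines of:
>
> ceph orch apply -i mon.yaml
>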
> Regards,
> Eugen
>
> Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:
>
> > Hi,
> >
> > I have upgraded my test and production cephadm-managed clusters from
> > 16.2.14 to 16.2.15. The upgrade was smooth and completed without issues.
> > There were a few things which I noticed after each upgrade:
> >
> > 1. RocksDB options, which I provided to each mon via their configuration
> > files, got overwritten during mon redeployment and I had to re-add
> > mon_rocksdb_options back.
> >
> > 2. The monitor debug_rocksdb option got silently reset back to the
> > default of 4/5; I had to set it back to 1/5.
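> >
> > For reference, setting it back is just the usual config command,
> > along the lines of:
> >
> > ceph config set mon debug_rocksdb 1/5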
> >
> > 3. For roughly 2 hours after the upgrade, despite the clusters being
> > healthy and operating normally, all monitors would run manual compactions
> > very often and write to disks at very high rates. For example, production
> > monitors had their rocksdb:low0 thread write to store.db:
> >
> > monitors without RocksDB compression: ~8 GB/5 min, or ~96 GB/hour;
> > monitors with RocksDB compression: ~1.5 GB/5 min, or ~18 GB/hour.
> >
> > After roughly 2 hours, with no changes to the cluster, the write rates
> > dropped to ~0.4-0.6 GB/5 min and ~120 MB/5 min respectively. The reason
> > for the frequent manual compactions and high write rates wasn't
> > immediately apparent.
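> >
> > (For reference, per-thread write rates like these can be watched on a
> > mon host with something like the following; the 300-second interval
> > matches the 5-minute figures above, and the store.db path assumes a
> > cephadm layout:)
> >
> > pidstat -dt -p $(pgrep -x ceph-mon) 300
> > du -sh /var/lib/ceph/FSID/mon.*/store.db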
> >
> > 4. Crash deployment broke the ownership of /var/lib/ceph/FSID/crash and
> > /var/lib/ceph/FSID/crash/posted, even though I had already fixed it
> > manually after the upgrade to 16.2.14, which had broken it as well.
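> >
> > (Restoring the ownership is a matter of something like the following;
> > 167:167 is the UID/GID the Ceph containers run as:)
> >
> > chown -R 167:167 /var/lib/ceph/FSID/crash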
> >
> > 5. Mgr RAM usage appears to be increasing at a slower rate than it did
> > with 16.2.14, although it's too early to tell whether the issue with
> > mgrs randomly consuming all RAM and getting OOM-killed has been fixed;
> > with 16.2.14 this would normally take several days.
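> >
> > (Mgr memory use can be tracked over time with something like:
> >
> > ceph orch ps | grep mgr
> >
> > and watching the memory-use column.)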
> >
> > Overall, things look good. Thanks to the Ceph team for this release!
> >
> > Zakhar
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


