Are the scrubs eventually reported as "scrub ok" in the OSD logs? How
long do the scrubs take? Do you see updated timestamps in the 'ceph pg
dump' output (column DEEP_SCRUB_STAMP)?
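Something along these lines should show both, just as a sketch (I'm assuming a cephadm deployment and using osd.0 as an arbitrary example):

# recent scrub completions for one OSD (with cephadm the daemon logs go to journald)
cephadm logs --name osd.0 | grep 'scrub ok' | tail
# per-PG scrub timestamps; watch whether SCRUB_STAMP / DEEP_SCRUB_STAMP move forward
ceph pg dump pgs | less -S

If the stamps do advance, the scrubs are finishing and it is the scheduling that is too aggressive, rather than scrubs getting stuck.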
Quoting thymus_03fumbler@xxxxxxxxxx:
I recently switched from 16.2.x to 18.2.x and migrated to cephadm.
Since the switch the cluster has been scrubbing constantly, 24/7, with
up to 50 PGs scrubbing and up to 20 deep scrubs running simultaneously
in a cluster that has only 12 OSDs in use.
On top of that, it still regularly raises a ‘pgs not scrubbed in time’
warning.
I have tried various settings, such as osd_deep_scrub_interval,
osd_max_scrubs, mds_max_scrub_ops_in_progress etc., but they all seem
to be ignored.
Please advise.
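For reference, this is roughly how I set and then verify those options
(osd.0 and the values are only examples):

# set cluster-wide defaults for the osd class
ceph config set osd osd_max_scrubs 1
ceph config set osd osd_deep_scrub_interval 1209600   # example: 14 days in seconds
# check what a running daemon actually reports
ceph config show osd.0 osd_max_scrubs
ceph tell osd.0 config get osd_max_scrubs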
Here is the output of 'ceph config dump':
WHO         MASK  LEVEL     OPTION                                        VALUE                      RO
global            advanced  auth_client_required                          cephx                      *
global            advanced  auth_cluster_required                         cephx                      *
global            advanced  auth_service_required                         cephx                      *
global            advanced  auth_supported                                cephx                      *
global            basic     container_image                               quay.io/ceph/ceph@sha256:aca35483144ab3548a7f670db9b79772e6fc51167246421c66c0bd56a6585468  *
global            basic     device_failure_prediction_mode                local
global            advanced  mon_allow_pool_delete                         true
global            advanced  mon_data_avail_warn                           20
global            advanced  mon_max_pg_per_osd                            400
global            advanced  osd_max_pg_per_osd_hard_ratio                 10.000000
global            advanced  osd_pool_default_pg_autoscale_mode            on
mon               advanced  auth_allow_insecure_global_id_reclaim         false
mon               advanced  mon_crush_min_required_version                firefly                    *
mon               advanced  mon_warn_on_pool_no_redundancy                false
mon               advanced  public_network                                10.79.0.0/16               *
mgr               advanced  mgr/balancer/active                           true
mgr               advanced  mgr/balancer/mode                             upmap
mgr               advanced  mgr/cephadm/manage_etc_ceph_ceph_conf_hosts   label:admin                *
mgr               advanced  mgr/cephadm/migration_current                 6                          *
mgr               advanced  mgr/dashboard/GRAFANA_API_PASSWORD            admin                      *
mgr               advanced  mgr/dashboard/GRAFANA_API_SSL_VERIFY          false                      *
mgr               advanced  mgr/dashboard/GRAFANA_API_URL                 https://10.79.79.12:3000   *
mgr               advanced  mgr/dashboard/PROMETHEUS_API_HOST             http://10.79.79.12:9095    *
mgr               advanced  mgr/devicehealth/enable_monitoring            true
mgr               advanced  mgr/orchestrator/orchestrator                 cephadm
osd               advanced  osd_map_cache_size                            250
osd               advanced  osd_map_share_max_epochs                      50
osd               advanced  osd_mclock_profile                            high_client_ops
osd               advanced  osd_pg_epoch_persisted_max_stale              50
osd.0             basic     osd_mclock_max_capacity_iops_hdd              380.869888
osd.1             basic     osd_mclock_max_capacity_iops_hdd              441.000000
osd.10            basic     osd_mclock_max_capacity_iops_ssd              13677.906485
osd.11            basic     osd_mclock_max_capacity_iops_hdd              274.411212
osd.13            basic     osd_mclock_max_capacity_iops_hdd              198.492501
osd.2             basic     osd_mclock_max_capacity_iops_hdd              251.592009
osd.3             basic     osd_mclock_max_capacity_iops_hdd              208.197434
osd.4             basic     osd_mclock_max_capacity_iops_hdd              196.544082
osd.5             basic     osd_mclock_max_capacity_iops_ssd              12739.225456
osd.6             basic     osd_mclock_max_capacity_iops_hdd              211.288660
osd.7             basic     osd_mclock_max_capacity_iops_hdd              210.543236
osd.8             basic     osd_mclock_max_capacity_iops_hdd              242.241594
osd.9             basic     osd_mclock_max_capacity_iops_hdd              559.933780
mds.plexfs        basic     mds_join_fs                                   plexfs
Here is the output of 'ceph -s':
  services:
    mon: 3 daemons, quorum lxt-prod-ceph-util02,lxt-prod-ceph-util01,lxt-prod-ceph-util03 (age 3w)
    mgr: lxt-prod-ceph-util02.iyrhxj(active, since 3w), standbys: lxt-prod-ceph-util03.wvstpe
    mds: 1/1 daemons up
    osd: 14 osds: 14 up (since 4w), 14 in (since 4w)

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 193 pgs
    objects: 14.48M objects, 52 TiB
    usage:   71 TiB used, 39 TiB / 110 TiB avail
    pgs:     131 active+clean
             47  active+clean+scrubbing
             15  active+clean+scrubbing+deep
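In case it helps, the PGs that are currently scrubbing can be listed
like this (just a sketch):

# PG id and state for everything that is scrubbing right now
ceph pg dump pgs_brief | grep scrubbing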
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx