Thanks Eugen for the suggestion. Yes, we have tried that, and we also repeered the concerned PGs, but the issue remains.

Looking at the code, it seems the "invalid (post-split) stats" message is triggered when the PG has "stats_invalid": true. Here is the result of a query on one of the affected PGs:

    "stats_invalid": true,
    "dirty_stats_invalid": false,
    "omap_stats_invalid": false,
    "hitset_stats_invalid": false,
    "hitset_bytes_stats_invalid": false,
    "pin_stats_invalid": false,
    "manifest_stats_invalid": false,

I am also providing again the cluster information that was lost in my previous reply that missed the list. Don't hesitate to ask for more if needed, I would be glad to provide it.

Cédric

On Thu, Feb 22, 2024 at 11:04 AM Eugen Block <eblock@xxxxxx> wrote:
>
> Hm, I wonder if setting (and unsetting after a while) noscrub and
> nodeep-scrub has any effect. Have you tried that?
>
> Zitat von Cedric <yipikai7@xxxxxxxxx>:
>
> > Update: we have run fsck and re-shard on all bluestore volumes; it
> > seems sharding was not applied.
> >
> > Unfortunately scrubs and deep-scrubs are still stuck on PGs of the
> > pool that is suffering the issue, but other PGs scrub well.
> >
> > The next step will be to remove the cache tier as suggested, but that
> > is not possible yet, as the PGs need to be scrubbed before the tier
> > agent can activate.
> >
> > As we are struggling to make this cluster work again, any help
> > would be greatly appreciated.
> >
> > Cédric
> >
> >> On 20 Feb 2024, at 20:22, Cedric <yipikai7@xxxxxxxxx> wrote:
> >>
> >> Thanks Eugen, sorry about the missed reply to all.
> >>
> >> The reason we still have the cache tier is that we were not able
> >> to flush all dirty entries to remove it (as per the procedure), so
> >> the cluster was migrated from HDD/SSD to NVMe a while ago but
> >> the tiering remains, unfortunately.
> >>
> >> So for now we are trying to understand the root cause.
> >>
> >> On Tue, Feb 20, 2024 at 1:43 PM Eugen Block <eblock@xxxxxx> wrote:
> >>>
> >>> Please don't drop the list from your response.
> >>>
> >>> The first question coming to mind is, why do you have a cache tier if
> >>> all your pools are on NVMe devices anyway? I don't see any benefit here.
> >>> Did you try the suggested workaround and disable the cache tier?
> >>>
> >>> Zitat von Cedric <yipikai7@xxxxxxxxx>:
> >>>
> >>>> Thanks Eugen, see attached infos.
> >>>>
> >>>> Some more details:
> >>>>
> >>>> - commands that actually hang: ceph balancer status ; rbd -p vms ls ;
> >>>> rados -p vms_cache cache-flush-evict-all
> >>>> - all scrubs running on vms_cache PGs stall / restart in a loop
> >>>> without actually doing anything
> >>>> - all IO is at 0, both in ceph status and in iostat on the nodes
> >>>>
> >>>> On Tue, Feb 20, 2024 at 10:00 AM Eugen Block <eblock@xxxxxx> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> some more details would be helpful, for example what's the pool size
> >>>>> of the cache pool? Did you issue a PG split before or during the
> >>>>> upgrade? This thread [1] deals with the same problem; the described
> >>>>> workaround was to set hit_set_count to 0 and disable the cache layer
> >>>>> until that is resolved. Afterwards you could enable the cache layer
> >>>>> again. But keep in mind that the code for cache tier is entirely
> >>>>> removed in Reef (IIRC).
> >>>>>
> >>>>> Regards,
> >>>>> Eugen
> >>>>>
> >>>>> [1]
> >>>>> https://ceph-users.ceph.narkive.com/zChyOq5D/ceph-strange-issue-after-adding-a-cache-osd
> >>>>>
> >>>>> Zitat von Cedric <yipikai7@xxxxxxxxx>:
> >>>>>
> >>>>>> Hello,
> >>>>>>
> >>>>>> Following an upgrade from Nautilus (14.2.22) to Pacific (16.2.13), we
> >>>>>> encountered an issue with a cache pool becoming completely stuck;
> >>>>>> relevant message below:
> >>>>>>
> >>>>>> pg xx.x has invalid (post-split) stats; must scrub before tier agent
> >>>>>> can activate
> >>>>>>
> >>>>>> In the OSD logs, scrubs keep starting in a loop without succeeding
> >>>>>> for all PGs of this pool.
> >>>>>>
> >>>>>> What we have already tried without luck so far:
> >>>>>>
> >>>>>> - shutdown / restart the OSDs
> >>>>>> - rebalance PGs between OSDs
> >>>>>> - raise the memory on the OSDs
> >>>>>> - repeer the PGs
> >>>>>>
> >>>>>> Any idea what is causing this? Any help will be greatly appreciated.
> >>>>>>
> >>>>>> Thanks
> >>>>>>
> >>>>>> Cédric
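
For completeness, this is roughly how the flag can be checked across the PGs of the cache pool and deep-scrubs re-issued. It is only a rough sketch, not exactly what we ran: it assumes the Pacific JSON layout where "stats_invalid" sits under info.stats in "ceph pg <pgid> query", and that jq is available on the node:

  # briefly toggle the scrub flags as Eugen suggested, then release them
  ceph osd set noscrub && ceph osd set nodeep-scrub
  sleep 300
  ceph osd unset noscrub && ceph osd unset nodeep-scrub

  # check every PG of the vms_cache pool and ask for a deep-scrub where the flag is still set
  for pg in $(ceph pg ls-by-pool vms_cache -f json | jq -r '.pg_stats[].pgid'); do
      flag=$(ceph pg "$pg" query | jq -r '.info.stats.stats_invalid')
      echo "$pg stats_invalid=$flag"
      [ "$flag" = "true" ] && ceph pg deep-scrub "$pg"
  done

And here is the cluster information again: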
--- RAW STORAGE ---
CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
nvme   419 TiB  143 TiB  276 TiB   276 TiB      65.78
TOTAL  419 TiB  143 TiB  276 TiB   276 TiB      65.78

--- POOLS ---
POOL                   ID   PGS   STORED   (DATA)   (OMAP)  OBJECTS     USED   (DATA)   (OMAP)  %USED  MAX AVAIL  QUOTA OBJECTS  QUOTA BYTES    DIRTY  USED COMPR  UNDER COMPR
images                  1  1024   35 TiB   35 TiB  954 KiB    4.67M  104 TiB  104 TiB  954 KiB  50.56     34 TiB            N/A          N/A      N/A         0 B          0 B
volumes                 2    32   41 GiB   41 GiB  1.7 KiB   10.76k  124 GiB  124 GiB  1.7 KiB   0.12     34 TiB            N/A          N/A      N/A         0 B          0 B
vms                     3  1024   57 TiB   57 TiB  7.2 MiB    9.41M  170 TiB  170 TiB  7.2 MiB  62.63     34 TiB            N/A          N/A      N/A         0 B          0 B
images_cache           11    32  5.3 MiB  5.1 MiB  196 KiB    5.76k   68 MiB   68 MiB  196 KiB      0     34 TiB            N/A          N/A      N/A         0 B          0 B
vms_cache              12   256  486 GiB  486 GiB  863 KiB  542.07k  1.4 TiB  1.4 TiB  863 KiB   1.39     34 TiB            N/A          N/A  294.57k         0 B          0 B
volumes_cache          13    32  284 KiB  284 KiB    228 B    1.54k   18 MiB   18 MiB    228 B      0     34 TiB            N/A          N/A        9         0 B          0 B
backups                14     8      0 B      0 B      0 B        0      0 B      0 B      0 B      0     34 TiB            N/A          N/A      N/A         0 B          0 B
device_health_metrics  15     1      0 B      0 B      0 B       72      0 B      0 B      0 B      0     34 TiB            N/A          N/A      N/A         0 B          0 B

pool 1 'images' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1024 pgp_num 1024 autoscale_mode off last_change 1564546 lfor 953426/953426/1546859 flags hashpspool,selfmanaged_snaps stripe_width 0 expected_num_objects 1 application rbd
pool 2 'volumes' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change 1560659 lfor 28784/950229/950227 flags hashpspool,selfmanaged_snaps tiers 13 read_tier 13 write_tier 13 stripe_width 0 expected_num_objects 1 application rbd
pool 3 'vms' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1024 pgp_num 1024 autoscale_mode off last_change 1560660 lfor 28785/931603/1546859 flags hashpspool,selfmanaged_snaps tiers 12 read_tier 12 write_tier 12 stripe_width 0 expected_num_objects 1 application rbd
pool 11 'images_cache' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change 1560661 lfor 953426/953426/953426 flags hashpspool,incomplete_clones,selfmanaged_snaps stripe_width 0 application rbd
pool 12 'vms_cache' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode off last_change 1565798 lfor 28785/1562901/1564526 flags hashpspool,incomplete_clones,selfmanaged_snaps tier_of 3 cache_mode readproxy target_bytes 1000000000000 target_objects 600000 hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 7200s x12 decay_rate 0 search_last_n 0 stripe_width 0 application rbd
pool 13 'volumes_cache' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change 1560663 lfor 28784/952190/952188 flags hashpspool,incomplete_clones,selfmanaged_snaps tier_of 2 cache_mode proxy hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 7200s x12 decay_rate 0 search_last_n 0 stripe_width 0 application rbd
pool 14 'backups' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode off last_change 1560664 flags hashpspool stripe_width 0
pool 15 'device_health_metrics' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode off last_change 1565137 flags hashpspool stripe_width 0 pg_num_min 1 application mgr_devicehealth

ID   CLASS  WEIGHT     TYPE NAME                 STATUS  REWEIGHT  PRI-AFF
 -1         419.18036  root default
 -4         419.18036      datacenter EU
-97         209.59074          rack U72
-73          69.86388              host srv8967
112   nvme    5.82199                  osd.112       up   1.00000  1.00000
122   nvme    5.82199                  osd.122       up   1.00000  1.00000
128   nvme    5.82199                  osd.128       up   1.00000  1.00000
136   nvme    5.82198                  osd.136       up   1.00000  1.00000
141   nvme    5.82199                  osd.141       up   1.00000  1.00000
147   nvme    5.82199                  osd.147       up   1.00000  1.00000
153   nvme    5.82199                  osd.153       up   1.00000  1.00000
159   nvme    5.82199                  osd.159       up   1.00000  1.00000
165   nvme    5.82199                  osd.165       up   1.00000  1.00000
171   nvme    5.82199                  osd.171       up   1.00000  1.00000
177   nvme    5.82199                  osd.177       up   1.00000  1.00000
184   nvme    5.82199                  osd.184       up   1.00000  1.00000
-93          69.86388              host srv8968
113   nvme    5.82199                  osd.113       up   1.00000  1.00000
123   nvme    5.82199                  osd.123       up   1.00000  1.00000
131   nvme    5.82199                  osd.131       up   1.00000  1.00000
138   nvme    5.82199                  osd.138       up   1.00000  1.00000
143   nvme    5.82198                  osd.143       up   1.00000  1.00000
149   nvme    5.82199                  osd.149       up   1.00000  1.00000
154   nvme    5.82199                  osd.154       up   1.00000  1.00000
160   nvme    5.82199                  osd.160       up   1.00000  1.00000
166   nvme    5.82199                  osd.166       up   1.00000  1.00000
172   nvme    5.82199                  osd.172       up   1.00000  1.00000
178   nvme    5.82199                  osd.178       up   1.00000  1.00000
186   nvme    5.82199                  osd.186       up   1.00000  1.00000
-77          69.86299              host srv8969
118   nvme    5.82199                  osd.118       up   1.00000  1.00000
126   nvme    5.82199                  osd.126       up   1.00000  1.00000
132   nvme    5.82199                  osd.132       up   1.00000  1.00000
137   nvme    5.82199                  osd.137       up   1.00000  1.00000
144   nvme    5.82199                  osd.144       up   1.00000  1.00000
150   nvme    5.82199                  osd.150       up   1.00000  1.00000
156   nvme    5.82199                  osd.156       up   1.00000  1.00000
161   nvme    5.82199                  osd.161       up   1.00000  1.00000
167   nvme    5.82199                  osd.167       up   1.00000  1.00000
173   nvme    5.82199                  osd.173       up   1.00000  1.00000
179   nvme    5.82199                  osd.179       up   1.00000  1.00000
187   nvme    5.82199                  osd.187       up   1.00000  1.00000
-32         209.58963          rack U74
-89          69.86385              host srv8965
120   nvme    5.82199                  osd.120       up   1.00000  1.00000
125   nvme    5.82199                  osd.125       up   1.00000  1.00000
130   nvme    5.82199                  osd.130       up   1.00000  1.00000
135   nvme    5.82199                  osd.135       up   1.00000  1.00000
142   nvme    5.82199                  osd.142       up   1.00000  1.00000
148   nvme    5.82199                  osd.148       up   1.00000  1.00000
155   nvme    5.82199                  osd.155       up   1.00000  1.00000
162   nvme    5.82198                  osd.162       up   1.00000  1.00000
168   nvme    5.82199                  osd.168       up   1.00000  1.00000
175   nvme    5.82198                  osd.175       up   1.00000  1.00000
181   nvme    5.82199                  osd.181       up   1.00000  1.00000
183   nvme    5.82198                  osd.183       up   1.00000  1.00000
-81          69.86299              host srv8966
119   nvme    5.82199                  osd.119       up   1.00000  1.00000
124   nvme    5.82199                  osd.124       up   1.00000  1.00000
129   nvme    5.82199                  osd.129       up   1.00000  1.00000
134   nvme    5.82199                  osd.134       up   1.00000  1.00000
140   nvme    5.82199                  osd.140       up   1.00000  1.00000
146   nvme    5.82199                  osd.146       up   1.00000  1.00000
152   nvme    5.82199                  osd.152       up   1.00000  1.00000
158   nvme    5.82199                  osd.158       up   1.00000  1.00000
164   nvme    5.82199                  osd.164       up   1.00000  1.00000
170   nvme    5.82199                  osd.170       up   1.00000  1.00000
176   nvme    5.82199                  osd.176       up   1.00000  1.00000
182   nvme    5.82199                  osd.182       up   1.00000  1.00000
-85          69.86279              host srv8970
  0   nvme    5.82190                  osd.0         up   1.00000  1.00000
  1   nvme    5.82190                  osd.1         up   1.00000  1.00000
  2   nvme    5.82190                  osd.2         up   1.00000  1.00000
  3   nvme    5.82190                  osd.3         up   1.00000  1.00000
  4   nvme    5.82190                  osd.4         up   1.00000  1.00000
  5   nvme    5.82190                  osd.5         up   1.00000  1.00000
  6   nvme    5.82190                  osd.6         up   1.00000  1.00000
  7   nvme    5.82190                  osd.7         up   1.00000  1.00000
  8   nvme    5.82190                  osd.8         up   1.00000  1.00000
  9   nvme    5.82190                  osd.9         up   1.00000  1.00000
 10   nvme    5.82190                  osd.10        up   1.00000  1.00000
 11   nvme    5.82190                  osd.11        up   1.00000  1.00000

[
    {
        "rule_id": 0,
        "rule_name": "nvme_rule",
        "ruleset": 0,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -72,
                "item_name": "default~nvme"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    },
1, "type": 1, "min_size": 1, "max_size": 10, "steps": [ { "op": "take", "item": -30, "item_name": "default~hdd" }, { "op": "chooseleaf_firstn", "num": 0, "type": "host" }, { "op": "emit" } ] }, { "rule_id": 5, "rule_name": "mixed_rule", "ruleset": 5, "type": 1, "min_size": 1, "max_size": 10, "steps": [ { "op": "take", "item": -72, "item_name": "default~nvme" }, { "op": "chooseleaf_firstn", "num": 0, "type": "host" }, { "op": "emit" }, { "op": "take", "item": -30, "item_name": "default~hdd" }, { "op": "chooseleaf_firstn", "num": 0, "type": "host" }, { "op": "emit" } ] } ] ID CLASS WEIGHT TYPE NAME -1 419.18036 root default -4 419.18036 datacenter EU -97 209.59074 rack U72 -73 69.86388 host srv8967 112 nvme 5.82199 osd.112 122 nvme 5.82199 osd.122 128 nvme 5.82199 osd.128 136 nvme 5.82198 osd.136 141 nvme 5.82199 osd.141 147 nvme 5.82199 osd.147 153 nvme 5.82199 osd.153 159 nvme 5.82199 osd.159 165 nvme 5.82199 osd.165 171 nvme 5.82199 osd.171 177 nvme 5.82199 osd.177 184 nvme 5.82199 osd.184 -93 69.86388 host srv8968 113 nvme 5.82199 osd.113 123 nvme 5.82199 osd.123 131 nvme 5.82199 osd.131 138 nvme 5.82199 osd.138 143 nvme 5.82198 osd.143 149 nvme 5.82199 osd.149 154 nvme 5.82199 osd.154 160 nvme 5.82199 osd.160 166 nvme 5.82199 osd.166 172 nvme 5.82199 osd.172 178 nvme 5.82199 osd.178 186 nvme 5.82199 osd.186 -77 69.86299 host srv8969 118 nvme 5.82199 osd.118 126 nvme 5.82199 osd.126 132 nvme 5.82199 osd.132 137 nvme 5.82199 osd.137 144 nvme 5.82199 osd.144 150 nvme 5.82199 osd.150 156 nvme 5.82199 osd.156 161 nvme 5.82199 osd.161 167 nvme 5.82199 osd.167 173 nvme 5.82199 osd.173 179 nvme 5.82199 osd.179 187 nvme 5.82199 osd.187 -32 209.58963 rack U74 -89 69.86385 host srv8965 120 nvme 5.82199 osd.120 125 nvme 5.82199 osd.125 130 nvme 5.82199 osd.130 135 nvme 5.82199 osd.135 142 nvme 5.82199 osd.142 148 nvme 5.82199 osd.148 155 nvme 5.82199 osd.155 162 nvme 5.82198 osd.162 168 nvme 5.82199 osd.168 175 nvme 5.82198 osd.175 181 nvme 5.82199 osd.181 183 nvme 5.82198 osd.183 -81 69.86299 host srv8966 119 nvme 5.82199 osd.119 124 nvme 5.82199 osd.124 129 nvme 5.82199 osd.129 134 nvme 5.82199 osd.134 140 nvme 5.82199 osd.140 146 nvme 5.82199 osd.146 152 nvme 5.82199 osd.152 158 nvme 5.82199 osd.158 164 nvme 5.82199 osd.164 170 nvme 5.82199 osd.170 176 nvme 5.82199 osd.176 182 nvme 5.82199 osd.182 -85 69.86279 host srv8970 0 nvme 5.82190 osd.0 1 nvme 5.82190 osd.1 2 nvme 5.82190 osd.2 3 nvme 5.82190 osd.3 4 nvme 5.82190 osd.4 5 nvme 5.82190 osd.5 6 nvme 5.82190 osd.6 7 nvme 5.82190 osd.7 8 nvme 5.82190 osd.8 9 nvme 5.82190 osd.9 10 nvme 5.82190 osd.10 11 nvme 5.82190 osd.11 ceph osd tier cache-mode Invalid command: missing required parameter pool(<poolname>) osd tier cache-mode <pool> writeback|readproxy|readonly|none [--yes-i-really-mean-it] : specify the caching mode for cache tier <pool>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx