Re: Scrub stuck and 'pg has invalid (post-split) stat'

Thanks Eugen for the suggestion. Yes, we have tried that, and we also
repeered the affected PGs, but the issue is still the same.
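For reference, what we ran was roughly the following (the PG id is just
an example):

ceph osd set noscrub
ceph osd set nodeep-scrub
# ... waited a while, then:
ceph osd unset noscrub
ceph osd unset nodeep-scrub
ceph pg repeer 12.1f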

Looking at the code, it seems the "invalid (post-split) stats" message is
triggered when the PG has "stats_invalid": true. Here is the result of a query:

"stats_invalid": true,
                "dirty_stats_invalid": false,
                "omap_stats_invalid": false,
                "hitset_stats_invalid": false,
                "hitset_bytes_stats_invalid": false,
                "pin_stats_invalid": false,
                "manifest_stats_invalid": false,

I am also providing again the cluster information that was lost in the
previous missed reply-all. Don't hesitate to ask for more if needed; I
would be glad to provide it.

Cédric


On Thu, Feb 22, 2024 at 11:04 AM Eugen Block <eblock@xxxxxx> wrote:
>
> Hm, I wonder if setting (and unsetting after a while) noscrub and
> nodeep-scrub has any effect. Have you tried that?
>
> Zitat von Cedric <yipikai7@xxxxxxxxx>:
>
> > Update: we have run fsck and re-shard on all BlueStore volumes; it
> > seems the sharding had not been applied.
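> >
> > For reference, this was done per OSD, with the OSD stopped, roughly as
> > follows (paths assume non-containerized OSDs; the sharding string is
> > the Pacific default, double-check it for your release):
> >
> > ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-112
> > ceph-bluestore-tool reshard --path /var/lib/ceph/osd/ceph-112 \
> >     --sharding "m(3) p(3,0-12) O(3,0-13)=block_cache={type=binned_lru} L P"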
> >
> > Unfortunately scrubs and deep-scrubs are still stuck on PGs of the
> > pool that is suffering the issue, but other PGs scrub fine.
> >
> > The next step will be to remove the cache tier as suggested, but that
> > is not possible yet, as the PGs need to be scrubbed before the tier
> > agent can activate.
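> >
> > That would be roughly the usual sequence with our pool names, still
> > untested here until the scrubs complete:
> >
> > rados -p vms_cache cache-flush-evict-all
> > ceph osd tier cache-mode vms_cache none --yes-i-really-mean-it
> > ceph osd tier remove-overlay vms
> > ceph osd tier remove vms vms_cache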
> >
> > As we are struggling to make this cluster work again, any help
> > would be greatly appreciated.
> >
> > Cédric
> >
> >> On 20 Feb 2024, at 20:22, Cedric <yipikai7@xxxxxxxxx> wrote:
> >>
> >> Thanks Eugen, sorry about the missed reply to all.
> >>
> >> The reason we still have the cache tier is that we were not able to
> >> flush all dirty entries in order to remove it (as per the procedure).
> >> The cluster was migrated from HDD/SSD to NVMe a while ago, but the
> >> tiering remains, unfortunately.
> >>
> >> So actually we are trying to understand the root cause.
> >>
> >> On Tue, Feb 20, 2024 at 1:43 PM Eugen Block <eblock@xxxxxx> wrote:
> >>>
> >>> Please don't drop the list from your response.
> >>>
> >>> The first question coming to mind is: why do you have a cache tier if
> >>> all your pools are on NVMe devices anyway? I don't see any benefit here.
> >>> Did you try the suggested workaround and disable the cache-tier?
> >>>
> >>> Zitat von Cedric <yipikai7@xxxxxxxxx>:
> >>>
> >>>> Thanks Eugen, see attached infos.
> >>>>
> >>>> Some more details:
> >>>>
> >>>> - commands that actually hang: ceph balancer status ; rbd -p vms ls ;
> >>>> rados -p vms_cache cache-flush-evict-all
> >>>> - all scrubs running on vms_cache PGs stall / start in a loop
> >>>> without actually doing anything
> >>>> - all I/O is at 0, both in ceph status and in iostat on the nodes
> >>>>
> >>>> On Tue, Feb 20, 2024 at 10:00 AM Eugen Block <eblock@xxxxxx> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> some more details would be helpful, for example: what is the pool size
> >>>>> of the cache pool? Did you issue a PG split before or during the
> >>>>> upgrade? This thread [1] deals with the same problem; the described
> >>>>> workaround was to set hit_set_count to 0 and disable the cache layer
> >>>>> until that is resolved. Afterwards you could enable the cache layer
> >>>>> again. But keep in mind that the code for cache tiering is entirely
> >>>>> removed in Reef (IIRC).
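> >>>>>
> >>>>> Something along these lines, with the cache pool name as a placeholder:
> >>>>>
> >>>>> ceph osd pool set <cache-pool> hit_set_count 0
> >>>>> ceph osd tier cache-mode <cache-pool> none --yes-i-really-mean-it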
> >>>>>
> >>>>> Regards,
> >>>>> Eugen
> >>>>>
> >>>>> [1]
> >>>>> https://ceph-users.ceph.narkive.com/zChyOq5D/ceph-strange-issue-after-adding-a-cache-osd
> >>>>>
> >>>>> Zitat von Cedric <yipikai7@xxxxxxxxx>:
> >>>>>
> >>>>>> Hello,
> >>>>>>
> >>>>>> Following an upgrade from Nautilus (14.2.22) to Pacific (16.2.13), we
> >>>>>> encountered an issue with a cache pool becoming completely stuck; the
> >>>>>> relevant message is below:
> >>>>>>
> >>>>>> pg xx.x has invalid (post-split) stats; must scrub before tier agent
> >>>>>> can activate
> >>>>>>
> >>>>>> In the OSD logs, scrubs start in a loop without ever succeeding for
> >>>>>> all PGs of this pool.
> >>>>>>
> >>>>>> What we have already tried so far, without luck:
> >>>>>>
> >>>>>> - shutdown / restart OSDs
> >>>>>> - rebalance PGs between OSDs
> >>>>>> - raise OSD memory
> >>>>>> - repeer PGs
> >>>>>>
> >>>>>> Any idea what is causing this? Any help would be greatly appreciated.
> >>>>>>
> >>>>>> Thanks
> >>>>>>
> >>>>>> Cédric
> >>>>>
> >>>>>
> >>>
> >>>
> >>>
>
>
>
--- RAW STORAGE ---
CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
nvme   419 TiB  143 TiB  276 TiB   276 TiB      65.78
TOTAL  419 TiB  143 TiB  276 TiB   276 TiB      65.78

--- POOLS ---
POOL                   ID   PGS   STORED   (DATA)   (OMAP)  OBJECTS     USED   (DATA)   (OMAP)  %USED  MAX AVAIL  QUOTA OBJECTS  QUOTA BYTES    DIRTY  USED COMPR  UNDER COMPR
images                  1  1024   35 TiB   35 TiB  954 KiB    4.67M  104 TiB  104 TiB  954 KiB  50.56     34 TiB            N/A          N/A      N/A         0 B          0 B
volumes                 2    32   41 GiB   41 GiB  1.7 KiB   10.76k  124 GiB  124 GiB  1.7 KiB   0.12     34 TiB            N/A          N/A      N/A         0 B          0 B
vms                     3  1024   57 TiB   57 TiB  7.2 MiB    9.41M  170 TiB  170 TiB  7.2 MiB  62.63     34 TiB            N/A          N/A      N/A         0 B          0 B
images_cache           11    32  5.3 MiB  5.1 MiB  196 KiB    5.76k   68 MiB   68 MiB  196 KiB      0     34 TiB            N/A          N/A      N/A         0 B          0 B
vms_cache              12   256  486 GiB  486 GiB  863 KiB  542.07k  1.4 TiB  1.4 TiB  863 KiB   1.39     34 TiB            N/A          N/A  294.57k         0 B          0 B
volumes_cache          13    32  284 KiB  284 KiB    228 B    1.54k   18 MiB   18 MiB    228 B      0     34 TiB            N/A          N/A        9         0 B          0 B
backups                14     8      0 B      0 B      0 B        0      0 B      0 B      0 B      0     34 TiB            N/A          N/A      N/A         0 B          0 B
device_health_metrics  15     1      0 B      0 B      0 B       72      0 B      0 B      0 B      0     34 TiB            N/A          N/A      N/A         0 B          0 B



pool 1 'images' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1024 pgp_num 1024 autoscale_mode off last_change 1564546 lfor 953426/953426/1546859 flags hashpspool,selfmanaged_snaps stripe_width 0 expected_num_objects 1 application rbd
pool 2 'volumes' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change 1560659 lfor 28784/950229/950227 flags hashpspool,selfmanaged_snaps tiers 13 read_tier 13 write_tier 13 stripe_width 0 expected_num_objects 1 application rbd
pool 3 'vms' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1024 pgp_num 1024 autoscale_mode off last_change 1560660 lfor 28785/931603/1546859 flags hashpspool,selfmanaged_snaps tiers 12 read_tier 12 write_tier 12 stripe_width 0 expected_num_objects 1 application rbd
pool 11 'images_cache' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change 1560661 lfor 953426/953426/953426 flags hashpspool,incomplete_clones,selfmanaged_snaps stripe_width 0 application rbd
pool 12 'vms_cache' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode off last_change 1565798 lfor 28785/1562901/1564526 flags hashpspool,incomplete_clones,selfmanaged_snaps tier_of 3 cache_mode readproxy target_bytes 1000000000000 target_objects 600000 hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 7200s x12 decay_rate 0 search_last_n 0 stripe_width 0 application rbd
pool 13 'volumes_cache' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode off last_change 1560663 lfor 28784/952190/952188 flags hashpspool,incomplete_clones,selfmanaged_snaps tier_of 2 cache_mode proxy hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 7200s x12 decay_rate 0 search_last_n 0 stripe_width 0 application rbd
pool 14 'backups' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode off last_change 1560664 flags hashpspool stripe_width 0
pool 15 'device_health_metrics' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode off last_change 1565137 flags hashpspool stripe_width 0 pg_num_min 1 application mgr_devicehealth



ID   CLASS  WEIGHT     TYPE NAME                     STATUS  REWEIGHT  PRI-AFF
 -1         419.18036  root default
 -4         419.18036      datacenter EU
-97         209.59074          rack U72
-73          69.86388              host srv8967
112   nvme    5.82199                  osd.112           up   1.00000  1.00000
122   nvme    5.82199                  osd.122           up   1.00000  1.00000
128   nvme    5.82199                  osd.128           up   1.00000  1.00000
136   nvme    5.82198                  osd.136           up   1.00000  1.00000
141   nvme    5.82199                  osd.141           up   1.00000  1.00000
147   nvme    5.82199                  osd.147           up   1.00000  1.00000
153   nvme    5.82199                  osd.153           up   1.00000  1.00000
159   nvme    5.82199                  osd.159           up   1.00000  1.00000
165   nvme    5.82199                  osd.165           up   1.00000  1.00000
171   nvme    5.82199                  osd.171           up   1.00000  1.00000
177   nvme    5.82199                  osd.177           up   1.00000  1.00000
184   nvme    5.82199                  osd.184           up   1.00000  1.00000
-93          69.86388              host srv8968
113   nvme    5.82199                  osd.113           up   1.00000  1.00000
123   nvme    5.82199                  osd.123           up   1.00000  1.00000
131   nvme    5.82199                  osd.131           up   1.00000  1.00000
138   nvme    5.82199                  osd.138           up   1.00000  1.00000
143   nvme    5.82198                  osd.143           up   1.00000  1.00000
149   nvme    5.82199                  osd.149           up   1.00000  1.00000
154   nvme    5.82199                  osd.154           up   1.00000  1.00000
160   nvme    5.82199                  osd.160           up   1.00000  1.00000
166   nvme    5.82199                  osd.166           up   1.00000  1.00000
172   nvme    5.82199                  osd.172           up   1.00000  1.00000
178   nvme    5.82199                  osd.178           up   1.00000  1.00000
186   nvme    5.82199                  osd.186           up   1.00000  1.00000
-77          69.86299              host srv8969
118   nvme    5.82199                  osd.118           up   1.00000  1.00000
126   nvme    5.82199                  osd.126           up   1.00000  1.00000
132   nvme    5.82199                  osd.132           up   1.00000  1.00000
137   nvme    5.82199                  osd.137           up   1.00000  1.00000
144   nvme    5.82199                  osd.144           up   1.00000  1.00000
150   nvme    5.82199                  osd.150           up   1.00000  1.00000
156   nvme    5.82199                  osd.156           up   1.00000  1.00000
161   nvme    5.82199                  osd.161           up   1.00000  1.00000
167   nvme    5.82199                  osd.167           up   1.00000  1.00000
173   nvme    5.82199                  osd.173           up   1.00000  1.00000
179   nvme    5.82199                  osd.179           up   1.00000  1.00000
187   nvme    5.82199                  osd.187           up   1.00000  1.00000
-32         209.58963          rack U74
-89          69.86385              host srv8965
120   nvme    5.82199                  osd.120           up   1.00000  1.00000
125   nvme    5.82199                  osd.125           up   1.00000  1.00000
130   nvme    5.82199                  osd.130           up   1.00000  1.00000
135   nvme    5.82199                  osd.135           up   1.00000  1.00000
142   nvme    5.82199                  osd.142           up   1.00000  1.00000
148   nvme    5.82199                  osd.148           up   1.00000  1.00000
155   nvme    5.82199                  osd.155           up   1.00000  1.00000
162   nvme    5.82198                  osd.162           up   1.00000  1.00000
168   nvme    5.82199                  osd.168           up   1.00000  1.00000
175   nvme    5.82198                  osd.175           up   1.00000  1.00000
181   nvme    5.82199                  osd.181           up   1.00000  1.00000
183   nvme    5.82198                  osd.183           up   1.00000  1.00000
-81          69.86299              host srv8966
119   nvme    5.82199                  osd.119           up   1.00000  1.00000
124   nvme    5.82199                  osd.124           up   1.00000  1.00000
129   nvme    5.82199                  osd.129           up   1.00000  1.00000
134   nvme    5.82199                  osd.134           up   1.00000  1.00000
140   nvme    5.82199                  osd.140           up   1.00000  1.00000
146   nvme    5.82199                  osd.146           up   1.00000  1.00000
152   nvme    5.82199                  osd.152           up   1.00000  1.00000
158   nvme    5.82199                  osd.158           up   1.00000  1.00000
164   nvme    5.82199                  osd.164           up   1.00000  1.00000
170   nvme    5.82199                  osd.170           up   1.00000  1.00000
176   nvme    5.82199                  osd.176           up   1.00000  1.00000
182   nvme    5.82199                  osd.182           up   1.00000  1.00000
-85          69.86279              host srv8970
  0   nvme    5.82190                  osd.0             up   1.00000  1.00000
  1   nvme    5.82190                  osd.1             up   1.00000  1.00000
  2   nvme    5.82190                  osd.2             up   1.00000  1.00000
  3   nvme    5.82190                  osd.3             up   1.00000  1.00000
  4   nvme    5.82190                  osd.4             up   1.00000  1.00000
  5   nvme    5.82190                  osd.5             up   1.00000  1.00000
  6   nvme    5.82190                  osd.6             up   1.00000  1.00000
  7   nvme    5.82190                  osd.7             up   1.00000  1.00000
  8   nvme    5.82190                  osd.8             up   1.00000  1.00000
  9   nvme    5.82190                  osd.9             up   1.00000  1.00000
 10   nvme    5.82190                  osd.10            up   1.00000  1.00000
 11   nvme    5.82190                  osd.11            up   1.00000  1.00000

[
    {
        "rule_id": 0,
        "rule_name": "nvme_rule",
        "ruleset": 0,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -72,
                "item_name": "default~nvme"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    },
    {
        "rule_id": 1,
        "rule_name": "hdd_rule",
        "ruleset": 1,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -30,
                "item_name": "default~hdd"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    },
    {
        "rule_id": 5,
        "rule_name": "mixed_rule",
        "ruleset": 5,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -72,
                "item_name": "default~nvme"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            },
            {
                "op": "take",
                "item": -30,
                "item_name": "default~hdd"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    }
]

ID   CLASS  WEIGHT     TYPE NAME
 -1         419.18036  root default
 -4         419.18036      datacenter EU
-97         209.59074          rack U72
-73          69.86388              host srv8967
112   nvme    5.82199                  osd.112
122   nvme    5.82199                  osd.122
128   nvme    5.82199                  osd.128
136   nvme    5.82198                  osd.136
141   nvme    5.82199                  osd.141
147   nvme    5.82199                  osd.147
153   nvme    5.82199                  osd.153
159   nvme    5.82199                  osd.159
165   nvme    5.82199                  osd.165
171   nvme    5.82199                  osd.171
177   nvme    5.82199                  osd.177
184   nvme    5.82199                  osd.184
-93          69.86388              host srv8968
113   nvme    5.82199                  osd.113
123   nvme    5.82199                  osd.123
131   nvme    5.82199                  osd.131
138   nvme    5.82199                  osd.138
143   nvme    5.82198                  osd.143
149   nvme    5.82199                  osd.149
154   nvme    5.82199                  osd.154
160   nvme    5.82199                  osd.160
166   nvme    5.82199                  osd.166
172   nvme    5.82199                  osd.172
178   nvme    5.82199                  osd.178
186   nvme    5.82199                  osd.186
-77          69.86299              host srv8969
118   nvme    5.82199                  osd.118
126   nvme    5.82199                  osd.126
132   nvme    5.82199                  osd.132
137   nvme    5.82199                  osd.137
144   nvme    5.82199                  osd.144
150   nvme    5.82199                  osd.150
156   nvme    5.82199                  osd.156
161   nvme    5.82199                  osd.161
167   nvme    5.82199                  osd.167
173   nvme    5.82199                  osd.173
179   nvme    5.82199                  osd.179
187   nvme    5.82199                  osd.187
-32         209.58963          rack U74
-89          69.86385              host srv8965
120   nvme    5.82199                  osd.120
125   nvme    5.82199                  osd.125
130   nvme    5.82199                  osd.130
135   nvme    5.82199                  osd.135
142   nvme    5.82199                  osd.142
148   nvme    5.82199                  osd.148
155   nvme    5.82199                  osd.155
162   nvme    5.82198                  osd.162
168   nvme    5.82199                  osd.168
175   nvme    5.82198                  osd.175
181   nvme    5.82199                  osd.181
183   nvme    5.82198                  osd.183
-81          69.86299              host srv8966
119   nvme    5.82199                  osd.119
124   nvme    5.82199                  osd.124
129   nvme    5.82199                  osd.129
134   nvme    5.82199                  osd.134
140   nvme    5.82199                  osd.140
146   nvme    5.82199                  osd.146
152   nvme    5.82199                  osd.152
158   nvme    5.82199                  osd.158
164   nvme    5.82199                  osd.164
170   nvme    5.82199                  osd.170
176   nvme    5.82199                  osd.176
182   nvme    5.82199                  osd.182
-85          69.86279              host srv8970
  0   nvme    5.82190                  osd.0
  1   nvme    5.82190                  osd.1
  2   nvme    5.82190                  osd.2
  3   nvme    5.82190                  osd.3
  4   nvme    5.82190                  osd.4
  5   nvme    5.82190                  osd.5
  6   nvme    5.82190                  osd.6
  7   nvme    5.82190                  osd.7
  8   nvme    5.82190                  osd.8
  9   nvme    5.82190                  osd.9
 10   nvme    5.82190                  osd.10
 11   nvme    5.82190                  osd.11

ceph osd tier cache-mode
Invalid command: missing required parameter pool(<poolname>)
osd tier cache-mode <pool> writeback|readproxy|readonly|none [--yes-i-really-mean-it] :  specify the caching mode for cache tier <pool>
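
For completeness, with the pool name and target mode filled in it would
look like this, for example (our cache pool, disabling the tier):

ceph osd tier cache-mode vms_cache none --yes-i-really-mean-it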






_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
