Solved, I think: https://docs.ceph.com/en/latest/cephfs/eviction/

I've reached the point where I've seen the docs before, but when I go to
find them, I get the out-of-date stuff instead. And by out-of-date I mean
things like processes that went out with Octopus but are still the first
point of entry for the Reef docs. Frustrating.

On Sat, 2024-07-27 at 11:44 -0400, Tim Holloway wrote:
> Update on "ceph -s". A machine was in the process of crashing when I
> took the original snapshot. Here it is after the reboot:
> 
> [root@dell02 ~]# ceph -s
>   cluster:
>     id:     278fcd86-0861-11ee-a7df-9c5c8e86cf8f
>     health: HEALTH_WARN
>             1 filesystem is degraded
>             25 client(s) laggy due to laggy OSDs
> 
>   services:
>     mon: 3 daemons, quorum dell02,www7,ceph03 (age 8m)
>     mgr: ceph08.tlocfi(active, since 81m), standbys: www7.rxagfn, dell02.odtbqw
>     mds: 1/1 daemons up, 2 standby
>     osd: 7 osds: 7 up (since 12h), 7 in (since 19h); 308 remapped pgs
>     rgw: 2 daemons active (2 hosts, 1 zones)
> 
>   data:
>     volumes: 0/1 healthy, 1 recovering
>     pools:   22 pools, 681 pgs
>     objects: 125.10k objects, 36 GiB
>     usage:   91 GiB used, 759 GiB / 850 GiB avail
>     pgs:     47772/369076 objects misplaced (12.944%)
>              373 active+clean
>              308 active+clean+remapped
> 
>   io:
>     client: 170 B/s rd, 0 op/s rd, 0 op/s wr
> 
> On Sat, 2024-07-27 at 11:31 -0400, Tim Holloway wrote:
> > I was in the middle of tuning my OSDs when lightning blew me off the
> > Internet. I had to wait 5 days for my ISP to send a tech and replace
> > a fried cable. In the meantime, among other things, I had some
> > serious time drift between servers, thanks to the OS upgrades
> > replacing NTP with chrony and my not having thought to re-establish
> > a master in-house timeserver.
> > 
> > Ceph tried really hard to keep up with all that, but eventually it
> > was just too much. Now I've got an offline filesystem, and
> > apparently it's stuck trying to get back online again.
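
Following up on my own note about the chrony drift: re-establishing the
in-house timeserver amounts to roughly the config below. This is a
minimal sketch; "timehost.internal" and the 192.168.0.0/16 LAN range are
placeholders for my setup, not recommendations.

    # /etc/chrony.conf on the in-house master:
    # sync from the public pool; keep serving the LAN even if upstream is lost
    pool pool.ntp.org iburst
    allow 192.168.0.0/16
    local stratum 10

    # /etc/chrony.conf on each Ceph node:
    # sync only from the in-house master; step the clock on large offsets
    server timehost.internal iburst
    makestep 1.0 3

After restarting chronyd, "chronyc sources" on each node should show the
master as the selected source.
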
> > The forensics:
> > 
> > [ceph: root@www7 /]# ceph -s
> >   cluster:
> >     id:     278fcd86-0861-11ee-a7df-9c5c8e86cf8f
> >     health: HEALTH_WARN
> >             failed to probe daemons or devices
> >             1 filesystem is degraded
> >             1/3 mons down, quorum www7,ceph03
> > 
> >   services:
> >     mon: 3 daemons, quorum www7,ceph03 (age 2m), out of quorum: dell02
> >     mgr: ceph08.tlocfi(active, since 58m), standbys: dell02.odtbqw, www7.rxagfn
> >     mds: 1/1 daemons up, 1 standby
> >     osd: 7 osds: 7 up (since 12h), 7 in (since 18h); 308 remapped pgs
> >     rgw: 2 daemons active (2 hosts, 1 zones)
> > 
> >   data:
> >     volumes: 0/1 healthy, 1 recovering
> >     pools:   22 pools, 681 pgs
> >     objects: 125.10k objects, 36 GiB
> >     usage:   91 GiB used, 759 GiB / 850 GiB avail
> >     pgs:     47772/369076 objects misplaced (12.944%)
> >              373 active+clean
> >              308 active+clean+remapped
> > 
> >   io:
> >     client: 170 B/s rd, 0 op/s rd, 0 op/s wr
> > 
> > [ceph: root@www7 /]# ceph health detail
> > HEALTH_WARN 1 filesystem is degraded; 25 client(s) laggy due to laggy OSDs
> > [WRN] FS_DEGRADED: 1 filesystem is degraded
> >     fs ceefs is degraded
> > [WRN] MDS_CLIENTS_LAGGY: 25 client(s) laggy due to laggy OSDs
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14019719 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14124385 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14144243 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14144375 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14224103 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14224523 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14234194 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14234545 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14236841 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14237837 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14238536 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14244124 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14264236 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14266870 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14294170 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14294434 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14296012 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14304212 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14316057 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14318379 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14325518 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14328956 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14334283 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14336104 is laggy; not evicted because some OSD(s) is/are laggy
> >     mds.ceefs.www7.drnuyi(mds.0): Client 14374237 is laggy; not evicted because some OSD(s) is/are laggy
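
For reference, the eviction doc linked at the top describes the manual
workflow for clearing these sessions. A sketch only, reusing the MDS
daemon name and one client ID from the output above; note that the MDS
message itself says it is deferring eviction while OSDs are laggy, so
manual eviction may be refused or unnecessary once the OSDs settle:

    # list sessions on the active MDS and confirm the laggy client IDs
    ceph tell mds.ceefs.www7.drnuyi client ls

    # manually evict a single client by session id
    ceph tell mds.ceefs.www7.drnuyi client evict id=14019719

    # eviction blocklists the client; review the blocklist before remounting
    ceph osd blocklist ls
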
> > [ceph: root@www7 /]# ceph osd tree
> > ID   CLASS  WEIGHT   TYPE NAME        STATUS  REWEIGHT  PRI-AFF
> >  -1         2.79994  root default
> > -25         0.15999      host ceph01
> >   1    hdd  0.15999          osd.1        up   0.15999  1.00000
> > -28         1.15999      host ceph03
> >   3    hdd  0.15999          osd.3        up   0.15999  1.00000
> >   5    hdd  1.00000          osd.5        up   1.00000  1.00000
> >  -9         0.15999      host ceph06
> >   2    hdd  0.15999          osd.2        up   0.15999  1.00000
> >  -3         0.15999      host ceph07
> >   6    hdd  0.15999          osd.6        up   0.15999  1.00000
> >  -6         1.00000      host ceph08
> >   4    hdd  1.00000          osd.4        up   1.00000  1.00000
> >  -7         0.15999      host www7
> >   0    hdd  0.15999          osd.0        up   0.15999  1.00000
> > 
> > [ceph: root@www7 /]# ceph pg stat
> > 681 pgs: 373 active+clean, 308 active+clean+remapped; 36 GiB data,
> > 91 GiB used, 759 GiB / 850 GiB avail; 255 B/s rd, 0 op/s;
> > 47772/369073 objects misplaced (12.944%)
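
One more note on the remapped PGs: I was midway through retuning OSD
weights when the lightning hit, so the 12.9% misplaced objects should
drain by themselves as backfill proceeds. Roughly how I'm watching and
resuming that tuning (the weight value below is illustrative, taken from
the tree above, not a recommendation):

    # watch the misplaced count drain as backfill proceeds
    ceph -w

    # resume the interrupted tuning, e.g. set a small OSD's CRUSH weight
    ceph osd crush reweight osd.1 0.16

    # confirm nothing is stuck
    ceph pg dump_stuck unclean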