On Wed, Oct 4, 2017 at 9:08 AM, Piotr Dałek <piotr.dalek@xxxxxxxxxxxx> wrote:
> On 17-10-04 08:51 AM, lists wrote:
>>
>> Hi,
>>
>> Yesterday I chowned our /var/lib/ceph to ceph, to completely finalize our
>> jewel migration, and noticed something interesting.
>>
>> After I brought the OSDs I had just chowned back up, the system had some
>> recovery to do. During that recovery, the system went to HEALTH_ERR for a
>> short moment.
>>
>> See below for consecutive ceph -s outputs:
>>
>>> [..]
>>> root@pm2:~# ceph -s
>>>     cluster 1397f1dc-7d94-43ea-ab12-8f8792eee9c1
>>>      health HEALTH_ERR
>>>             2 pgs are stuck inactive for more than 300 seconds
>
> ^^ that.
>
>>>             761 pgs degraded
>>>             2 pgs recovering
>>>             181 pgs recovery_wait
>>>             2 pgs stuck inactive
>>>             273 pgs stuck unclean
>>>             543 pgs undersized
>>>             recovery 1394085/8384166 objects degraded (16.628%)
>>>             4/24 in osds are down
>>>             noout flag(s) set
>>>      monmap e3: 3 mons at
>>> {0=10.10.89.1:6789/0,1=10.10.89.2:6789/0,2=10.10.89.3:6789/0}
>>>             election epoch 256, quorum 0,1,2 0,1,2
>>>      osdmap e10230: 24 osds: 20 up, 24 in; 543 remapped pgs
>>>             flags noout,sortbitwise,require_jewel_osds
>>>       pgmap v36531146: 1088 pgs, 2 pools, 10703 GB data, 2729 kobjects
>>>             32724 GB used, 56656 GB / 89380 GB avail
>>>             1394085/8384166 objects degraded (16.628%)
>>>                  543 active+undersized+degraded
>>>                  310 active+clean
>>>                  181 active+recovery_wait+degraded
>>>                   26 active+degraded
>>>                   13 active
>>>                    9 activating+degraded
>>>                    4 activating
>>>                    2 active+recovering+degraded
>>> recovery io 133 MB/s, 37 objects/s
>>>   client io 64936 B/s rd, 9935 kB/s wr, 0 op/s rd, 942 op/s wr
>>> [..]
>>
>> It only lasted a very short time, but it did worry me a bit. Fortunately,
>> we went back to the expected HEALTH_WARN very quickly and everything
>> finished fine, so I guess there is nothing to worry about.
>>
>> But I'm curious: can anyone explain WHY we got a brief HEALTH_ERR?
>>
>> There are no SMART errors, apply and commit latency are all within the
>> expected ranges; the system basically is healthy.
>>
>> Curious :-)
>
> Since Jewel (AFAIR), when (re)starting OSDs, pg status is reset to "never
> contacted", resulting in "pgs are stuck inactive for more than 300 seconds"
> being reported until the OSDs regain connections between themselves.

Also, the last_active state isn't updated very regularly, as far as I can
tell. On our cluster I have increased this timeout:

    --mon_pg_stuck_threshold: 1800

(which helps suppress these bogus HEALTH_ERRs)

--
dan
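
For anyone who wants to do the same, here is a minimal sketch of how I'd
raise that threshold. This assumes a Jewel-era cluster; double-check the
option name and default against your release with "ceph daemon mon.<id>
config show | grep pg_stuck" before relying on it. The [mon] section
placement and the injectargs invocation below are my assumptions about a
typical setup, not something taken from the original posts:

    # ceph.conf on the monitor hosts: persist the higher threshold
    # (default is 300 seconds, i.e. the "more than 300 seconds" message)
    [mon]
        mon pg stuck threshold = 1800

    # apply at runtime without restarting the mons
    ceph tell mon.* injectargs '--mon_pg_stuck_threshold 1800'

Note the trade-off: a 1800s threshold also delays reporting of PGs that
are genuinely stuck, so it mainly makes sense if you routinely see these
transient inactive reports right after OSD restarts.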