Ceph not warning about clock skew on an OSD-only host?


 



Hi,

Our production cluster runs Luminous.

Yesterday, one of our OSD-only hosts came up with its clock about 8 hours wrong(!), having been out of the cluster for a week or so. Initially, Ceph seemed entirely happy, and then after an hour or so it all went south (OSDs started logging about bad authenticators, I/O paused, general sadness).

I know clock sync is important to Ceph, so "one system is 8 hours out, Ceph becomes sad" is not a surprise. It is perhaps a surprise that the OSDs were allowed in at all...

What _is_ a surprise, though, is that at no point in all this did Ceph raise a peep about clock skew. Normally it's pretty sensitive to this - our test cluster has had clock skew complaints when a mon is only slightly out, and here we had a node 8 hours wrong.

Is there some oddity like Ceph not warning on clock skew for OSD-only hosts? Or an upper bound on how large a discrepancy it will WARN about?
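To make concrete the kind of check I expected to fire, here is a rough sketch of an out-of-band skew check that compares each host's reported time against a reference. It is purely illustrative: the function name, the second host name, the sample timestamps and the 0.05 s threshold are my own assumptions, and none of this is something Ceph itself provides or how Ceph does its own checking.

```python
from typing import Dict


def find_skewed_hosts(
    reference_ts: float,
    host_ts: Dict[str, float],
    threshold_s: float = 0.05,  # illustrative threshold, an assumption
) -> Dict[str, float]:
    """Return {host: offset_seconds} for every host whose clock differs
    from the reference by more than threshold_s (in either direction)."""
    skewed = {}
    for host, ts in host_ts.items():
        offset = ts - reference_ts
        if abs(offset) > threshold_s:
            skewed[host] = offset
    return skewed


if __name__ == "__main__":
    # Hypothetical samples: timestamps (seconds since the epoch) gathered
    # from each host by whatever means is available, e.g. over ssh.
    now = 1_600_000_000.0
    samples = {
        "sto-3-1": now + 0.01,         # within tolerance
        "osd-host-x": now - 8 * 3600,  # hypothetical host, 8 hours behind
    }
    for host, offset in find_skewed_hosts(now, samples).items():
        print(f"{host}: clock off by {offset:+.0f} s")
```

Something this simple, run from cron or a monitoring system, would flag an 8-hour offset regardless of whether the host runs a mon.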

Regards,

Matthew

Example output from mid-outage:

root@sto-3-1:~#  ceph -s
  cluster:
    id:     049fc780-8998-45a8-be12-d3b8b6f30e69
    health: HEALTH_ERR
            40755436/2702185683 objects misplaced (1.508%)
            Reduced data availability: 20 pgs inactive, 20 pgs peering
            Degraded data redundancy: 367431/2702185683 objects degraded (0.014%), 4549 pgs degraded
            481 slow requests are blocked > 32 sec. Implicated osds 188,284,795,1278,1981,2061,2648,2697
            644 stuck requests are blocked > 4096 sec. Implicated osds 22,31,33,35,101,116,120,130,132,140,150,159,201,211,228,263,327,541,561,566,585,589,636,643,649,654,743,785,790,806,865,1037,1040,1090,1100,1104,1115,1134,1135,1166,1193,1275,1277,1292,1494,1523,1598,1638,1746,2055,2069,2191,2210,2358,2399,2486,2487,2562,2589,2613,2627,2656,2713,2720,2837,2839,2863,2888,2908,2920,2928,2929,2947,2948,2963,2969,2972

[...]


--
The Wellcome Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.


