wip-osd-pg-stats adds a time stamp and an epoch number to the pg_stat_t struct. The epoch is updated when the mapping changes, and the time stamp is updated when the pg state changes.

The state time stamp is the interesting one, since it'll make it easier to identify PGs that are stuck in, say, down or peering states for long periods of time. What it doesn't help with is a pg whose state is toggling between two undesirable states, because the stamp still gets updated on every toggle. In practice, what we probably care about is active vs not active, and degraded vs not degraded. We could add additional time stamps for those, but that may be overkill.

Previously we had planned on doing this sort of monitoring using an external agent, but that's kind of a pain, duplicates storage, etc. This is simple to add into the pg_stat_t update logic.

I expect this will be rebased on top of the new encoding strategy stuff so that the object versioning is backward and forward compatible, so don't worry about that part here.

https://github.com/NewDreamNetwork/ceph/commit/f43282796af50d760a620970ad691e0a20bcf178

Thoughts?
sage
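
To make the toggling problem concrete, here is a minimal sketch of the stamp logic described above. It is not the code from the commit: pg_stat_sketch, the STATE_* flags, the field and method names, and the use of plain integer seconds are all assumptions made up for illustration. It only shows a single "last state change" stamp next to the optional per-condition stamps (active and degraded).

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for pg_stat_t. None of these names are
// taken from the Ceph tree; they exist only to illustrate the idea.
using epoch_t = uint32_t;
using stamp_t = uint64_t;  // seconds; real code would use a proper time type

constexpr uint32_t STATE_ACTIVE   = 1u << 0;
constexpr uint32_t STATE_DEGRADED = 1u << 1;
constexpr uint32_t STATE_PEERING  = 1u << 2;
constexpr uint32_t STATE_DOWN     = 1u << 3;

struct pg_stat_sketch {
  uint32_t state = 0;
  epoch_t mapping_epoch = 0;         // set whenever the mapping changes
  stamp_t last_state_change = 0;     // bumped on ANY state change
  stamp_t last_active_change = 0;    // bumped only when the active bit flips
  stamp_t last_degraded_change = 0;  // bumped only when the degraded bit flips

  // Record a new state at time `now`, updating whichever stamps apply.
  void update_state(uint32_t new_state, stamp_t now) {
    if (new_state == state)
      return;  // nothing changed, leave all stamps alone
    const uint32_t flipped = state ^ new_state;
    if (flipped & STATE_ACTIVE)
      last_active_change = now;
    if (flipped & STATE_DEGRADED)
      last_degraded_change = now;
    last_state_change = now;
    state = new_state;
  }

  // Record the epoch of the most recent mapping change.
  void update_mapping(epoch_t e) { mapping_epoch = e; }

  // True if the overall state has not changed for at least `threshold`.
  bool stuck_in_state(stamp_t now, stamp_t threshold) const {
    return now - last_state_change >= threshold;
  }

  // True if the pg has been continuously not active for at least `threshold`.
  bool stuck_inactive(stamp_t now, stamp_t threshold) const {
    return !(state & STATE_ACTIVE) && now - last_active_change >= threshold;
  }
};
```

With this sketch, a pg that flips between down and peering every few seconds never looks stuck by last_state_change alone, but stuck_inactive() still flags it because the active bit never flipped. Whether that is worth two extra stamps per pg is the open question.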