Re: wip-pg-stale

On Fri, Jan 27, 2012 at 1:32 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> Please review.
>
> If the monitor sees an osdmap go by where nodes go down (or up) it will
> scan its pg_map and mark any pg whose primary is down as 'stale'.  If/when
> the pg recovers, that will get refreshed.  If not, the admin will know
> something is up.
Hmm. Without any kind of timeout, this flag will get set every time an
OSD goes down: the replicas won't alert the new primary until after
they get the map marking their old primary down, and this check runs
synchronously with the generation of that map.
The "spurious" stale marker on each PG isn't a big deal (it'll
disappear after a few seconds), but if we're going to set HEALTH_WARN
based on it, that seems like a bit much to me.
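
A minimal sketch of the scan being discussed, in Python rather than Ceph's C++. `PG`, `mark_stale`, `refresh`, and the field names are hypothetical and are not the monitor's real data structures. It only illustrates why the flag fires on every OSD failure: the scan runs against the new map before any replica has had a chance to report to the new primary.

```python
from dataclasses import dataclass


@dataclass
class PG:
    pgid: str
    primary: int        # OSD id of the acting primary (hypothetical field)
    stale: bool = False


def mark_stale(pg_map, down_osds):
    """Flag every PG whose primary is in down_osds; return the newly flagged pgids."""
    flagged = []
    for pg in pg_map.values():
        if pg.primary in down_osds and not pg.stale:
            pg.stale = True
            flagged.append(pg.pgid)
    return flagged


def refresh(pg_map, pgid, new_primary):
    """Model a fresh report from the new primary, which clears the flag."""
    pg = pg_map[pgid]
    pg.primary = new_primary
    pg.stale = False
```

With this model, calling `mark_stale` at the moment the map is generated always flags the PGs of a failed OSD, even when a `refresh` would arrive a few seconds later.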

> We'll soon be adding the last_active, last_clean, and now last_unstale (?)
> fields so that bigger alarms can go off when the pg stays stale for more
> than a few seconds...
Yeah; I think we want to use the stale flag to trigger big warnings once
those fields exist, but not to trigger any warnings without them.
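
One way the timeout idea could look, again as a Python sketch under stated assumptions: `health_status`, the `stale_pgs` mapping, and the 30-second `grace` default are all hypothetical, and `last_unstale` here just means "the last time the PG was known not to be stale". The point is only that a PG which has been stale for less than the grace period does not raise HEALTH_WARN.

```python
def health_status(stale_pgs, now, grace=30.0):
    """Return (status, overdue_pgids).

    stale_pgs maps pgid -> last_unstale timestamp for PGs currently
    flagged stale. A PG counts as overdue only once it has been stale
    for longer than `grace` seconds (an assumed, arbitrary default).
    """
    overdue = sorted(
        pgid for pgid, last_unstale in stale_pgs.items()
        if now - last_unstale > grace
    )
    status = "HEALTH_WARN" if overdue else "HEALTH_OK"
    return status, overdue
```

A PG flagged a few seconds ago stays under the threshold, so the transient flag set on every OSD failure would not change the reported health on its own.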
-Greg


>
> sage
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html