Sage,
Below are the state changes I'm seeing leading up to a crash caused by a
RecoveryDone event arriving while in RepWaitBackfillReserved. The replica
goes through three reserve/preempt cycles in about 0.67 seconds. Isn't
that too much preemption? Regardless of the preemption issue, the crash
itself should be fixed.
The log is from http://tracker.ceph.com/issues/22902:
state<Started/ReplicaActive>: Activate Finished
do_peering_event: epoch_sent: 131 epoch_requested: 131 MInfoRec from 4
info: 2.1c( v 126'439 (26'100,126'439] local-lis/les=130/131 n=37
ec=118/18 lis/c 130/118 les/c/f 131/119/0 130/130/125)
do_peering_event: epoch_sent: 131 epoch_requested: 131
RequestBackfillPrio: priority 100
exit Started/ReplicaActive/RepNotRecovering 0.018110 2 0.000097
enter Started/ReplicaActive/RepWaitBackfillReserved
do_peering_event: epoch_sent: 131 epoch_requested: 131
RemoteBackfillReserved
exit Started/ReplicaActive/RepWaitBackfillReserved 0.000136 1 0.000070
enter Started/ReplicaActive/RepRecovering
do_peering_event: epoch_sent: 131 epoch_requested: 131
RemoteBackfillPreempted
do_peering_event: epoch_sent: 131 epoch_requested: 131
RemoteReservationCanceled
exit Started/ReplicaActive/RepRecovering 0.028659 2 0.000067
enter Started/ReplicaActive/RepNotRecovering
do_peering_event: epoch_sent: 131 epoch_requested: 131
RequestBackfillPrio: priority 100
exit Started/ReplicaActive/RepNotRecovering 0.000867 1 0.000032
enter Started/ReplicaActive/RepWaitBackfillReserved
do_peering_event: epoch_sent: 131 epoch_requested: 131
RemoteBackfillReserved
exit Started/ReplicaActive/RepWaitBackfillReserved 0.281558 1 0.000049
enter Started/ReplicaActive/RepRecovering
do_peering_event: epoch_sent: 131 epoch_requested: 131
RemoteBackfillPreempted
do_peering_event: epoch_sent: 131 epoch_requested: 131
RemoteReservationCanceled
exit Started/ReplicaActive/RepRecovering 0.001055 2 0.000046
enter Started/ReplicaActive/RepNotRecovering
do_peering_event: epoch_sent: 131 epoch_requested: 131
RequestBackfillPrio: priority 100
exit Started/ReplicaActive/RepNotRecovering 0.000264 1 0.000027
enter Started/ReplicaActive/RepWaitBackfillReserved
do_peering_event: epoch_sent: 131 epoch_requested: 131
RemoteBackfillReserved
exit Started/ReplicaActive/RepWaitBackfillReserved 0.334811 1 0.000021
enter Started/ReplicaActive/RepRecovering
do_peering_event: epoch_sent: 131 epoch_requested: 131
RemoteBackfillPreempted
do_peering_event: epoch_sent: 131 epoch_requested: 131
RemoteReservationCanceled
exit Started/ReplicaActive/RepRecovering 0.003034 2 0.000138
enter Started/ReplicaActive/RepNotRecovering
do_peering_event: epoch_sent: 131 epoch_requested: 131
RequestBackfillPrio: priority 100
exit Started/ReplicaActive/RepNotRecovering 0.000043 1 0.000029
enter Started/ReplicaActive/RepWaitBackfillReserved
do_peering_event: epoch_sent: 131 epoch_requested: 131 RecoveryDone
exit Started/ReplicaActive/RepWaitBackfillReserved 0.000179 1 0.000030
exit Started/ReplicaActive 0.668887 0 0.000000
exit Started 1.679752 0 0.000000
enter Crashed
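
The sequence above can be replayed with a small model. This is only a
minimal sketch: the transition table is reconstructed from the log lines
above, not from the OSD state machine source, and the names TRANSITIONS,
EVENTS and replay are made up for illustration. It assumes that any
(state, event) pair the log never shows as handled leads to Crashed,
which is what the last lines of the log suggest for RecoveryDone in
RepWaitBackfillReserved.

```python
# Transitions reconstructed ONLY from the log above (an assumption, not the
# real Ceph state machine). Anything not listed is treated as unhandled.
TRANSITIONS = {
    ("RepNotRecovering", "RequestBackfillPrio"): "RepWaitBackfillReserved",
    ("RepWaitBackfillReserved", "RemoteBackfillReserved"): "RepRecovering",
    # The log shows no state exit on preemption itself, only on the cancel.
    ("RepRecovering", "RemoteBackfillPreempted"): "RepRecovering",
    ("RepRecovering", "RemoteReservationCanceled"): "RepNotRecovering",
}

# One reserve/preempt cycle as it appears in the log.
CYCLE = [
    "RequestBackfillPrio",
    "RemoteBackfillReserved",
    "RemoteBackfillPreempted",
    "RemoteReservationCanceled",
]

# Full event sequence from the log: three cycles, a fourth request,
# then RecoveryDone arriving before the reservation is granted.
EVENTS = CYCLE * 3 + ["RequestBackfillPrio", "RecoveryDone"]


def replay(events, state="RepNotRecovering"):
    """Replay events and return (final_state, preemptions, unhandled).

    unhandled is the (state, event) pair with no known transition,
    or None if every event was handled.
    """
    preemptions = 0
    for event in events:
        nxt = TRANSITIONS.get((state, event))
        if nxt is None:
            # No transition known for this event in this state.
            return "Crashed", preemptions, (state, event)
        if event == "RemoteBackfillPreempted":
            preemptions += 1
        state = nxt
    return state, preemptions, None


if __name__ == "__main__":
    print(replay(EVENTS))
```

In this model the replay ends in Crashed after three preemptions, with
(RepWaitBackfillReserved, RecoveryDone) as the unhandled pair. Adding an
entry for that pair to the table makes the replay finish without a crash,
whatever is decided about the preemption rate.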
David