Re: assert in can_discard_replica_op

On Tue, Aug 15, 2017 at 5:42 PM, sheng qiu <herbert1984106@xxxxxxxxx> wrote:
> Hi,
>
> Recently, we hit an assert in the function can_discard_replica_op()
> while an OSD was handling a replica op reply. The assert is triggered
> by get_down_at(), which checks that the source OSD still exists()
> and asserts otherwise.
>
> It seems that in our testing environment, the source OSD sent an op
> reply to the primary OSD and then died.
>
> My question: should we first check exists() to avoid the assert in
> get_down_at(), or is the OSD expected to always exist in this
> situation?

An OSD "existing" only means that it is in the OSDMap at all (it
doesn't need to be up or in). If you've managed to get an OSD sending
ops during an epoch where it doesn't exist, something has gone
terribly wrong; the local assert is not the problem! We can follow up
on the assert in the ticket you made
(http://tracker.ceph.com/issues/21006).
-Greg
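
To make the distinction above concrete, here is a minimal sketch of how
"exists" differs from "up" and "in", and why a get_down_at()-style
accessor asserts on a missing OSD. ToyOsdMap, OsdState and their members
are hypothetical names for illustration only. This is not Ceph's actual
OSDMap implementation; it only assumes the behaviour described in this
thread.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical, simplified stand-in for an OSD map (not Ceph's OSDMap).
using epoch_t = std::uint32_t;

struct OsdState {
  bool up = false;      // daemon is currently running and reachable
  bool in = false;      // OSD currently holds data in the cluster
  epoch_t down_at = 0;  // epoch at which it was last marked down
};

class ToyOsdMap {
 public:
  void add(int osd, const OsdState& s) { osds_[osd] = s; }
  void remove(int osd) { osds_.erase(osd); }

  // "Exists" only means the id is present in the map at all.
  bool exists(int osd) const { return osds_.count(osd) > 0; }
  bool is_up(int osd) const { return exists(osd) && osds_.at(osd).up; }
  bool is_in(int osd) const { return exists(osd) && osds_.at(osd).in; }

  // Mirrors the behaviour described in the thread: asking for the
  // down epoch of an OSD that is not in the map is a logic error.
  epoch_t get_down_at(int osd) const {
    assert(exists(osd));
    return osds_.at(osd).down_at;
  }

 private:
  std::map<int, OsdState> osds_;
};
```

In this sketch an OSD that has died is still present in the map with
up == false, so get_down_at() is safe to call on it. The assert can only
fire once the id has been removed from the map entirely, which is the
"something has gone terribly wrong" case.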