Throttle.cc: 194: FAILED assert(c >= 0) on snap rm or create in Ceph 0.87

Hi, I'm writing about a problem I'm seeing in a Ceph 0.87 cluster
where rbd snap create, rm, etc. complete their work, but then abort
with a non-zero return code because the notify call at the very end of
the function (https://github.com/ceph/ceph/blob/v0.87/src/librbd/internal.cc#L468)
hits an assertion failure (Throttle.cc: 194: FAILED assert(c >= 0)).

I did a bit of digging, and found that c is calculated in
calc_op_budget (https://github.com/ceph/ceph/blob/v0.87/src/osdc/Objecter.cc#L2453-L2471),
which is called in Objecter::_take_op_budget
(https://github.com/ceph/ceph/blob/v0.87/src/osdc/Objecter.h#L1597-L1608),
but could hypothetically be called again in Objecter::_throttle_op
(https://github.com/ceph/ceph/blob/v0.87/src/osdc/Objecter.cc#L2473-L2491),
if the first calculation returned 0. Following the rd.notify
call in IoCtxImpl::notify
(https://github.com/ceph/ceph/blob/v0.87/src/librados/IoCtxImpl.cc#L1117),
I can see that the call adds an op of type CEPH_OSD_OP_NOTIFY
(https://github.com/ceph/ceph/blob/v0.87/src/osdc/Objecter.h#L865),
which is defined at
https://github.com/ceph/ceph/blob/v0.87/src/include/rados.h#L185. From
that, we know that it's the code path at
https://github.com/ceph/ceph/blob/v0.87/src/osdc/Objecter.cc#L2463-L2464
that will be taken while calculating the budget, but from there I
can't tell where or why there would be extents set on a notify
operation. I'm not familiar with the Ceph codebase, so that's the
point where I figured I should ask for some advice about this from
someone who actually understands this stuff.
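
To make the question above concrete, here is a minimal sketch of how a budget summed from extent lengths could end up negative and fail a c >= 0 check. It is not the Ceph code: SketchOp, OpKind, calc_budget and throttle_accepts are hypothetical names, and it assumes the budget is accumulated in an int while the extent length is an unsigned 64-bit field. It only illustrates the arithmetic, not why a notify op would carry such a length.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the op classification done in rados.h.
enum class OpKind { Write, DataRead, AttrRead };

// Hypothetical, flattened model of one OSD op; not the real OSDOp layout.
struct SketchOp {
    OpKind kind;
    std::uint64_t extent_length;  // models op.extent.length
    std::uint32_t xattr_len;      // models xattr name_len + value_len
    std::uint32_t indata_len;     // models indata.length()
};

// Sum a per-op cost into an int budget (assumption: data reads cost
// their extent length, as on the code path discussed above).
int calc_budget(const std::vector<SketchOp>& ops) {
    int budget = 0;
    for (const SketchOp& op : ops) {
        std::int64_t cost = 0;
        switch (op.kind) {
        case OpKind::Write:    cost = op.indata_len; break;
        case OpKind::DataRead: cost = static_cast<std::int64_t>(op.extent_length); break;
        case OpKind::AttrRead: cost = op.xattr_len; break;
        }
        // Narrowing the 64-bit sum back to int is where a large length
        // can turn negative (it wraps on two's-complement targets).
        budget = static_cast<int>(budget + cost);
    }
    return budget;
}

// Models the condition checked by assert(c >= 0) in Throttle.cc.
bool throttle_accepts(std::int64_t c) {
    return c >= 0;
}
```

With an extent length of 4096 the budget stays positive and the check passes. With a length of 0x80000000 or any similarly large value in that field, the narrowed budget goes negative on common compilers and the check fails, which matches the assertion I'm seeing.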

I also noticed the possibly related issue #9592
(http://tracker.ceph.com/issues/9592), but I'm not sure it's the same
issue, since its reproduction process looks quite different.

I'm not expecting any bugfixes for such an old version of Ceph, but
I'd appreciate help understanding what's different about this
particular volume and how to clean it up by hand. In the unlikely
event that this is also a problem in the current development version
of Ceph, perhaps this can be considered a bug report.


