Re: Throttle.cc: 194: FAILED assert(c >= 0) on snap rm or create in Ceph 0.87

[ Re-send to make it through the vger filters; sorry! ]

Hmm, yeah. The ticket and failure mode make me wonder whether something
has gotten so strange with this image that the notify's bufferlist
actually exceeded a reasonable size, but I don't really see a
mechanism for that.

What snapshots exist on the pool? Can you successfully examine the
image in other ways with the rbd image manipulation tools?
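
To make the "exceeded a reasonable size" idea concrete, here is a minimal
sketch of one way an oversized length could end up as a negative budget. It
assumes the budget is held in a 32-bit signed int, which has not been checked
against the 0.87 source here, so it illustrates the arithmetic only and is
not a diagnosis. The names narrow_to_int32, budget_passes_check and
take_budget are hypothetical and do not exist in Ceph.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of Ceph: interpret the low 32 bits of a
// 64-bit length as a signed two's-complement value. This models what a
// narrowing conversion to a 32-bit int typically does.
int32_t narrow_to_int32(uint64_t length) {
  const uint32_t low = static_cast<uint32_t>(length);  // well-defined: modulo 2^32
  if (low <= 0x7fffffffu) {
    return static_cast<int32_t>(low);
  }
  // Values with the top bit set map to negative numbers.
  return static_cast<int32_t>(static_cast<int64_t>(low) - 4294967296LL);
}

// Hypothetical stand-in for the condition behind "FAILED assert(c >= 0)".
bool budget_passes_check(int32_t c) {
  return c >= 0;
}

// Hypothetical stand-in for a throttle that aborts on a negative budget.
int64_t take_budget(int64_t current, int32_t c) {
  assert(c >= 0);
  return current + c;
}
```

With these definitions, a 4096-byte length passes the check, while a length
of 0x80000000 or 0xFFFFFFFF narrows to a negative value and would trip the
assert in take_budget.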

On Fri, Oct 26, 2018 at 3:48 PM Simon Ruggier <simon@xxxxxxxxxxx> wrote:
>
> First of all, thanks for your reply.
>
> Yeah, this is happening within the process executing the rbd command.
> Sorry I didn't include the backtrace in my original email; I
> completely forgot after putting together the rest of it.
>
> I set "debug objecter = 20" in the local ceph config file on the
> system I ran these commands on, then ran rbd snap create, snap ls, and
> snap rm, so you could look at debug output from any of those
> three. I saved the entire session, anonymized all names in the output,
> and compressed it. See attached. If you need any other information,
> let me know and I'll collect it when I'm able to.
>
> On Fri, Oct 26, 2018 at 5:19 PM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> >
> > This is happening on the client side? Can you provide the full
> > backtrace and a log with "debug objecter = 20" turned on?
> >
> > On Sun, Oct 21, 2018 at 11:25 AM Simon Ruggier <simon@xxxxxxxxxxx> wrote:
> > >
> > > Hi, I'm writing about a problem I'm seeing in a Ceph 0.87 cluster
> > > where rbd snap create, rm, etc. are succeeding, but aborting with a
> > > non-zero return code because the notify call at the very end of the
> > > function (https://github.com/ceph/ceph/blob/v0.87/src/librbd/internal.cc#L468)
> > > > is hitting an assertion failure (Throttle.cc: 194: FAILED
> > > > assert(c >= 0)).
> > >
> > > I did a bit of digging, and found that c is calculated in
> > > calc_op_budget (https://github.com/ceph/ceph/blob/v0.87/src/osdc/Objecter.cc#L2453-L2471),
> > > which is called in Objecter::_take_op_budget
> > > (https://github.com/ceph/ceph/blob/v0.87/src/osdc/Objecter.h#L1597-L1608),
> > > but could hypothetically be called again in Objecter::_throttle_op
> > > (https://github.com/ceph/ceph/blob/v0.87/src/osdc/Objecter.cc#L2473-L2491),
> > > if the first calculation returned 0. From diving into the rd.notify
> > > call in IoCtxImpl.notify
> > > (https://github.com/ceph/ceph/blob/v0.87/src/librados/IoCtxImpl.cc#L1117),
> > > I can see that the call adds an op of type CEPH_OSD_OP_NOTIFY
> > > (https://github.com/ceph/ceph/blob/v0.87/src/osdc/Objecter.h#L865),
> > > which is defined at
> > > https://github.com/ceph/ceph/blob/v0.87/src/include/rados.h#L185. From
> > > that, we know that it's the code path at
> > > https://github.com/ceph/ceph/blob/v0.87/src/osdc/Objecter.cc#L2463-L2464
> > > that will be taken while calculating the budget, but from there I
> > > can't tell where or why there would be extents set on a notify
> > > operation. I'm not familiar with the Ceph codebase, so that's the
> > > point where I figured I should ask for some advice about this from
> > > someone who actually understands this stuff.
> > >
> > > > I also noticed the possibly related issue #9592
> > > > (http://tracker.ceph.com/issues/9592), but I'm not sure it's the
> > > > same issue, since the reproduction process looks quite different.
> > >
> > > I'm not expecting any bugfixes for such an old version of Ceph, but
> > > I'd appreciate help just understanding what's different with this
> > > particular volume and how to clean it up by hand, and in the unlikely
> > > event that this is a problem in the current development version of
> > > Ceph, perhaps this can be considered a bug report.


