Re: snap_trimming + backfilling is inefficient with many purged_snaps

On Mon, Sep 22, 2014 at 7:06 PM, Florian Haas <florian@xxxxxxxxxxx> wrote:
> On Sun, Sep 21, 2014 at 9:52 PM, Sage Weil <sweil@xxxxxxxxxx> wrote:
>> On Sun, 21 Sep 2014, Florian Haas wrote:
>>> So yes, I think your patch absolutely still has merit, as would any
>>> means of reducing the number of snapshots an OSD will trim in one go.
>>> As it is, the situation looks really really bad, specifically
>>> considering that RBD and RADOS are meant to be super rock solid, as
>>> opposed to say CephFS which is in an experimental state. And contrary
>>> to CephFS snapshots, I can't recall any documentation saying that RBD
>>> snapshots will break your system.
>>
>> Yeah, it sounds like a separate issue, and no, the limit is not
>> documented because it's definitely not the intended behavior. :)
>>
>> ...and I see you already have a log attached to #9503.  Will take a look.
>
> I've already updated that issue in Redmine, but for the list archives
> I should also add this here: Dan's patch for #9503, together with
> Sage's for #9487, makes the problem go away in an instant. I've
> already pointed out that I owe Dan dinner, and Sage, well I already
> owe Sage pretty much lifelong full board. :)

Looks like I was a bit too eager: the cluster behaves nicely with
these patches as long as nothing happens to any OSDs, but it does flag
PGs as incomplete when an OSD goes down. Once the mon osd down out
interval expires, things seem to recover/backfill normally, but it's
still disturbing to see this in the interim.

I've updated http://tracker.ceph.com/issues/9503 with a pg query from
one of the affected PGs, taken within the mon osd down out interval
while the PG was marked incomplete.

Dan or Sage, any ideas as to what might be causing this?

Cheers,
Florian



