> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
> Jason Dillaman
> Sent: 25 February 2016 01:30
> To: Christian Balzer <chibi@xxxxxxx>
> Cc: ceph-users@xxxxxxxx
> Subject: Re: ceph hammer : rbd info/Status : operation not supported (95) (EC+RBD tier pools)
>
> I'll speak to what I can answer off the top of my head. The most important
> point is that this issue is only related to EC pool base tiers, not
> replicated pools.
>
> > Hello Jason (Ceph devs et al),
> >
> > On Wed, 24 Feb 2016 13:15:34 -0500 (EST) Jason Dillaman wrote:
> >
> > > If you run "rados -p <cache pool> ls | grep rbd_id.<yyy-disk1>" and
> > > don't see that object, you are experiencing that issue [1].
> > >
> > > You can attempt to work around this issue by running "rados -p
> > > irfu-virt setomapval rbd_id.<yyy-disk1> dummy value" to
> > > force-promote the object to the cache pool. I haven't tested /
> > > verified that will alleviate the issue, though.
> > >
> > > [1] http://tracker.ceph.com/issues/14762
> > >
> >
> > This concerns me greatly, as I'm about to phase a cache tier into a
> > very busy, VERY mission-critical Ceph cluster this weekend.
> > That is on top of a replicated pool, Hammer.
> >
> > That issue and the related git blurb are less than crystal clear, so
> > for my and everybody else's benefit could you elaborate a bit more on
> > this?
> >
> > 1. Does this only affect EC base pools?
>
> Correct -- this is only an issue because EC pools do not directly support
> several operations required by RBD. Placing a replicated cache tier in
> front of an EC pool was, in effect, a work-around to this limitation.
>
> > 2. Is this a regression of sorts, and when did it come about?
> > I have a hard time imagining people not running into this earlier,
> > unless the problem is very hard to trigger.
> > 3. One assumes that this isn't fixed in any released version of Ceph,
> > correct?
> >
> > Robert, sorry for CC'ing you, but AFAICT your cluster is about the
> > closest approximation in terms of busyness to mine here.
> > I assume that you're not using EC pools (since you need performance,
> > not space) and haven't experienced this bug at all?
> >
> > Also, would you consider the benefits of the recency fix (thanks for
> > that) worth the risk of being an early adopter of 0.94.6?
> > In other words, are you eating your own dog food already and 0.94.6
> > hasn't eaten your data babies yet? ^o^
>
> Per the referenced email chain, it was potentially the recency fix that
> exposed this issue for EC pools fronted by a cache tier.

Just to add: it's possible this bug has been present for a while, but the
broken recency logic effectively promoted every object regardless, which
masked it. Once that was fixed and Ceph could actually decide whether an
object needed to be promoted or not, the bug surfaced.

You can always set the recency back to 0 (possibly 1) to get the same
behaviour as before the recency fix and make sure you won't hit this bug.
I've sketched the relevant commands at the bottom of this mail.

> >
> > Regards,
> >
> > Christian
> > --
> > Christian Balzer        Network/Systems Engineer
> > chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
> > http://www.gol.com/
> >
>
> --
>
> Jason Dillaman
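
For reference, this is roughly what the check and the workarounds discussed
above would look like on the command line. Treat it as an untested sketch:
substitute your own cache pool, base pool and image names (irfu-virt and
yyy-disk1 are just the examples from this thread).

    # 1. Check whether the image's rbd_id header object has made it into
    #    the cache pool; if it is missing, you are likely hitting #14762.
    rados -p <cache-pool> ls | grep rbd_id.yyy-disk1

    # 2. Attempt to force-promote it by writing a dummy omap key, as Jason
    #    suggested (unverified that this actually clears the error).
    rados -p irfu-virt setomapval rbd_id.yyy-disk1 dummy value

    # 3. Or fall back to the old always-promote behaviour by relaxing the
    #    recency requirement on the cache pool.
    ceph osd pool set <cache-pool> min_read_recency_for_promote 0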