Re: Luminous upgrade with existing EC pools

On Mon, Jan 22, 2018 at 9:23 PM, David Turner <drakonstein@xxxxxxxxx> wrote:
> I ran into a problem removing the cache tier.  I tried everything I could to
> get past it, but I ended up having to re-enable it.  I'm running on 12.2.2
> with all bluestore OSDs.
>
> I successfully set allow_ec_overwrites to true, I set the cache-mode to
> forward, I flushed/evicted the entire cache, and then went to remove-overlay
> on the data pool and received the error "Error EBUSY: pool 'cephfs_data' is
> in use by CephFS via its tier".  I made sure that no client had cephfs
> mounted, I even stopped the MDS daemons, and the same error came up every
> time.  I also tested setting the cache-mode to none (from forward) after I
> ensured that the cache was empty and it told me "set cache-mode for pool
> 'cephfs_cache' to none (WARNING: pool is still configured as read or write
> tier)" and still had the same error for removing the overlay.  I ultimately
> had to concede defeat and set the cache-mode back to writeback to get things
> working again.
>
> Does anyone have any ideas for how to remove this cache tier?  Having full
> writes on reads is not something I really want to keep around if I can get
> rid of it.
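
(For reference, the sequence described above is essentially the documented
cache tier removal procedure; with the pool names from this thread it
would look roughly like:

    ceph osd pool set cephfs_data allow_ec_overwrites true
    ceph osd tier cache-mode cephfs_cache forward --yes-i-really-mean-it
    rados -p cephfs_cache cache-flush-evict-all
    ceph osd tier remove-overlay cephfs_data      # <-- the step returning EBUSY
    ceph osd tier remove cephfs_data cephfs_cache

so the procedure itself isn't the problem.)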

Oops, this is a case that we didn't think about in the mon's logic for
checking whether it's okay to remove a tier.  I've opened a ticket:
http://tracker.ceph.com/issues/22754

You'll need to either wait for a patch (or hack out that buggy check
yourself from the OSDMonitor::_check_remove_tier function), or leave
those pools in place and add a second, new EC pool to your filesystem
and use layouts to move some files onto it.
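
If you go the layouts route, it would look something like the following
(pool name, PG counts, EC profile, filesystem name and mount point are
just examples, not anything from your cluster):

    # example pool / profile / filesystem names
    ceph osd pool create cephfs_data_ec2 64 64 erasure myprofile
    ceph osd pool set cephfs_data_ec2 allow_ec_overwrites true
    ceph fs add_data_pool cephfs cephfs_data_ec2
    setfattr -n ceph.dir.layout.pool -v cephfs_data_ec2 /mnt/cephfs/newdir

New files created under that directory will then be written to the new
EC pool; existing files stay where they are unless you copy them.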

John

> On Mon, Jan 22, 2018 at 7:25 AM David Turner <drakonstein@xxxxxxxxx> wrote:
>>
>> I've already migrated all osds to bluestore and changed my pools to use a
>> crush rule specifying them to use an HDD class (forced about half of my data
>> to move). This week I'm planning to add in some new SSDs to move the
>> metadata pool to.
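
(For anyone following along: the device class bit is the Luminous-style
approach, i.e. something along the lines of

    # rule and pool names here are only examples
    ceph osd crush rule create-replicated replicated_hdd default host hdd
    ceph osd pool set cephfs_data crush_rule replicated_hdd

rather than a separate HDD-only CRUSH hierarchy.)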
>>
>> I have experience with adding and removing cache tiers without losing data
>> in the underlying pool. The documentation on this in the upgrade procedure
>> and in the EC documentation had me very leery. Seeing the information
>> about EC pools from the CephFS documentation helps me to feel much more
>> confident. Thank you.
>>
>> On Mon, Jan 22, 2018, 5:53 AM John Spray <jspray@xxxxxxxxxx> wrote:
>>>
>>> On Sat, Jan 20, 2018 at 6:26 PM, David Turner <drakonstein@xxxxxxxxx>
>>> wrote:
>>> > I am not able to find documentation for how to convert an existing
>>> > cephfs filesystem to use allow_ec_overwrites. The documentation says
>>> > that the metadata pool needs to be replicated, but that the data pool
>>> > can be EC. But it says, "For Cephfs, using an erasure coded pool means
>>> > setting that pool in a file layout." Is that really necessary if your
>>> > metadata pool is replicated and you have an existing EC pool for the
>>> > data? Could I just enable ec overwrites and start flushing/removing
>>> > the cache tier and be on my way to just using an EC pool?
>>>
>>> That snippet in the RADOS docs is a bit misleading: you only need to
>>> use a file layout if you're adding an EC pool as an additional pool
>>> rather than using it during creation of a filesystem.
>>>
>>> The CephFS version of events is here:
>>>
>>> http://docs.ceph.com/docs/master/cephfs/createfs/#using-erasure-coded-pools-with-cephfs
>>>
>>> As for migrating from a cache tiered configuration, I haven't tried
>>> it, but there's nothing CephFS-specific about it.  If the underlying
>>> pool that's set as the cephfs data pool is EC and has
>>> allow_ec_overwrites then CephFS won't care -- but I'm personally not
>>> an expert on what knobs and buttons to use to migrate away from a
>>> cache tiered config.
>>>
>>> Do bear in mind that your OSDs need to be using bluestore (which may
>>> not be the case since you're talking about migrating an existing
>>> system?)
>>>
>>> John
>>>


