[Ceph-ansible] EXT: Re: EXT: Re: osd-directory scenario is used by us

I should have chimed in here earlier, but I think adding support for this could benefit a lot of use cases (not just our current big-data workload).
We've tested this type of setup with both bcache and lvmcache (dm-cache), under both block and object workloads, and settled on lvmcache for its better support and tooling and slightly better performance.  Ideally, support would be added for generic raw block devices so that ceph-disk and ceph-ansible do not try to create partitions and simply use the entire device.  That could then be used with LVM, bcache, Intel CAS, <insert-your-favorite-caching-tech-here>...

The way we would use it with lvmcache is to run our own Ansible role beforehand to prepare the physical disks with our lvmcache PVs, VG, LVs and hidden cache LVs, and then have ceph-ansible run something like this:

ceph-disk prepare <vg/lv-data> <vg/lv-journal>

where "vg/lv-data" is a cached device that has some small NVMe cache storage backed by a large spinning disk, and "vg/lv-journal" where the entire LV is on the NVMe device.

We currently do this with the "osd directory" ceph-ansible scenario: our LVM Ansible role slices up the disks, creates a bunch of logical volumes, and formats and mounts them as XFS (prior to running ceph-ansible).  The downside, of course, is that we're forced to use file-based journals.  Even with that overhead, the performance improvement over plain NVMe journals is significant for our workloads.  The caching layer is smart enough to promote the journals into the faster storage tier, while deep scrubs do not get promoted because their large, sequential IO requests are automatically detected.
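
To make that concrete, the per-OSD pre-work today is essentially the following (the mount point and fs options are just illustrative, not our exact role), and the osd directory scenario is then pointed at the resulting mount points:

# format the cached LV and mount it where ceph-ansible expects an OSD directory
mkfs.xfs -f /dev/vg/lv-data
mkdir -p /var/lib/ceph/osd-directories/osd0   # example path only
mount -o noatime /dev/vg/lv-data /var/lib/ceph/osd-directories/osd0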

I don't know if there might be potential support issues with making this generic, so at the very least it would be great if support for just LVM was added.  I can share my Ansible role for building out the lvmcache devices if anyone is interested or if it helps to understand our setup.

Thanks for the interest in our use case!
Anton Thaker
Walmart ✻



From: Ceph-ansible <ceph-ansible-bounces@xxxxxxxxxxxxxx> on behalf of Sebastien Han <shan@xxxxxxxxxx>
Sent: Friday, May 12, 2017 10:32 AM
To: Warren Wang - ISD
Cc: ceph-ansible@xxxxxxxxxxxxxx; ceph-devel; ceph-users
Subject: EXT: Re: [Ceph-ansible] EXT: Re: EXT: Re: osd-directory scenario is used by us
    
So if we were to support LVM device as an OSD that would be enough for
you? (support in ceph-disk).

On Mon, May 8, 2017 at 5:57 AM, Warren Wang - ISD
<Warren.Wang@xxxxxxxxxxx> wrote:
> You might find additional responses in ceph-users. Added.
>
> A little extra background here. If Ceph directly supported LVM devices as OSDs, we probably wouldn’t have to do what we’re doing now. We don’t know of a way to use an LVM cache device as an OSD without this type of config. This is primarily to support big data workloads that use object storage as their only backing storage, so the type of IO that we see is highly irregular compared to most object storage workloads. Shameless plug: my big data colleagues will be presenting on this topic next week at the OpenStack summit.
>
>  https://www.openstack.org/summit/boston-2017/summit-schedule/events/18432/introducing-swifta-a-performant-hadoop-file-system-driver-for-openstack-swift
>
> Sebastien, even with Bluestore, we’re expecting to use LVM cached devices for the bulk of object storage, with a dedicated NVMe/SSD partition for RocksDB. I don’t know if that matters at all with regards to the OSD directory discussion. We really haven’t done anything other than a basic Bluestore test on systems where we had not set up LVM cache devices.
>
> Warren Wang
> Walmart ✻
>
> On 5/3/17, 6:17 PM, "Ceph-ansible on behalf of Gregory Meno" <ceph-ansible-bounces@xxxxxxxxxxxxxx on behalf of gmeno@xxxxxxxxxx> wrote:
>
>     Haven't seen any comments in a week. I'm going to cross-post this to ceph-devel
>
>     Dear ceph-devel, in an effort to simplify ceph-ansible I removed the
>     code that sets up directory-backed OSDs. We found out that it was
>     being used in the following way.
>
>     I would like to hear thoughts about this approach, pro and con.
>
>     cheers,
>     G
>
>     On Tue, Apr 25, 2017 at 2:12 PM, Michael Gugino
>     <Michael.Gugino@xxxxxxxxxxx> wrote:
>     > All,
>     >
>     >   Thank you for the responses and consideration.  What we are doing is
>     > creating lvm volumes, mounting them, and using the mounts as directories
>     > for ceph-ansible.  Our primary concern is the use of lvmcache.  We’re
>     > using faster drives for the cache and slower drives for the backing
>     > volumes.
>     >
>     >   We try to keep as few local patches as practical, and our initial
>     > rollout of lvmcache + ceph-ansible steered us towards osd_directory
>     > scenario.  Currently, ceph-ansible does not allow us to use LVM in the
>     > way that we desire, but we are looking into submitting a PR to go that
>     > direction (at some point).
>     >
>     >   As far as using the stable branches, I’m not entirely sure what our
>     > strategy going forward will be.  Currently we are maintaining ceph-ansible
>     > branches based on ceph releases, not ceph-ansible releases.
>     >
>     >
>     > Michael Gugino
>     > Cloud Powered
>     > (540) 846-0304 Mobile
>     >
>     > Walmart ✻
>     > Saving people money so they can live better.
>     >
>     >
>     >
>     >
>     >
>     > On 4/25/17, 4:51 PM, "Sebastien Han" <shan@xxxxxxxxxx> wrote:
>     >
>     >>One other argument to remove the osd directory scenario is BlueStore.
>     >>Luminous is around the corner and we strongly hope it'll be the
>     >>default object store.
>     >>
>     >>On Tue, Apr 25, 2017 at 7:40 PM, Gregory Meno <gmeno@xxxxxxxxxx> wrote:
>     >>> Michael,
>     >>>
>     >>> I am naturally interested in the specifics of your use-case and would
>     >>> love to hear more about it.
>     >>> I think the desire to remove this scenario from the stable-2.2 release
>     >>> is low considering what you just shared.
>     >>> Would it be fair to ask that sharing your setup be the justification
>     >>> for restoring this functionality?
>     >>> Are you using the stable released bits already? I recommend doing so.
>     >>>
>     >>> +Seb +Alfredo
>     >>>
>     >>> cheers,
>     >>> Gregory
>     >>>
>     >>> On Tue, Apr 25, 2017 at 10:08 AM, Michael Gugino
>     >>> <Michael.Gugino@xxxxxxxxxxx> wrote:
>     >>>> Ceph-ansible community,
>     >>>>
>     >>>>   I see that recently osd-directory scenario was removed from
>     >>>>deployment
>     >>>> options.  We use this option in production, I will be submitting a
>     >>>>patch
>     >>>> and a small fix to re-add that scenario.  We believe our use-case is
>     >>>> non-trivial, and we are hoping to share our setup with the community in
>     >>>> the near future once we get approval.
>     >>>>
>     >>>> Thank you
>     >>>>
>     >>>>
>     >>>> Michael Gugino
>     >>>> Cloud Powered
>     >>>> (540) 846-0304 Mobile
>     >>>>
>     >>>> Walmart ✻
>     >>>> Saving people money so they can live better.
>     >>>>
>     >>>>
>     >>>>
>     >>>>
>     >>>>
>     >>>> On 4/18/17, 3:41 PM, "Ceph-ansible on behalf of Sebastien Han"
>     >>>> <ceph-ansible-bounces@xxxxxxxxxxxxxx on behalf of shan@xxxxxxxxxx>
>     >>>>wrote:
>     >>>>
>     >>>>>Hi everyone,
>     >>>>>
>     >>>>>We are close to releasing the new ceph-ansible stable release.
>     >>>>>We are currently in a heavy QA phase where we are pushing new tags in
>     >>>>>the format of v2.2.x.
>     >>>>>The latest tag already points to stable-2.2 branch.
>     >>>>>
>     >>>>>Stay tuned, stable-2.2 is just around the corner.
>     >>>>>Thanks!
>     >>>>>
>     >>>>>--
>     >>>>>Cheers
>     >>>>>
>     >>>>>––––––
>     >>>>>Sébastien Han
>     >>>>>Principal Software Engineer, Storage Architect
>     >>>>>
>     >>>>>"Always give 100%. Unless you're giving blood."
>     >>>>>
>     >>>>>Mail: seb@xxxxxxxxxx
>     >>>>>Address: 11 bis, rue Roquépine - 75008 Paris
>     >>>>>_______________________________________________
>     >>>>>Ceph-ansible mailing list
>     >>>>>Ceph-ansible@xxxxxxxxxxxxxx
>     >>>>>http://lists.ceph.com/listinfo.cgi/ceph-ansible-ceph.com
>     >>>>
>     >>>> _______________________________________________
>     >>>> Ceph-ansible mailing list
>     >>>> Ceph-ansible@xxxxxxxxxxxxxx
>     >>>> http://lists.ceph.com/listinfo.cgi/ceph-ansible-ceph.com
>     >>
>     >>
>     >>
>     >>--
>     >>Cheers
>     >>
>     >>––––––
>     >>Sébastien Han
>     >>Principal Software Engineer, Storage Architect
>     >>
>     >>"Always give 100%. Unless you're giving blood."
>     >>
>     >>Mail: seb@xxxxxxxxxx
>     >>Address: 11 bis, rue Roquépine - 75008 Paris
>     >
>     _______________________________________________
>     Ceph-ansible mailing list
>     Ceph-ansible@xxxxxxxxxxxxxx
>     http://lists.ceph.com/listinfo.cgi/ceph-ansible-ceph.com



>
>



-- 
Cheers

––––––
Sébastien Han
Principal Software Engineer, Storage Architect

"Always give 100%. Unless you're giving blood."

Mail: seb@xxxxxxxxxx
Address: 11 bis, rue Roquépine - 75008 Paris
_______________________________________________
Ceph-ansible mailing list
Ceph-ansible@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-ansible-ceph.com





