Re: [Ceph-ansible] EXT: Re: EXT: Re: osd-directory scenario is used by us

On Fri, May 12, 2017 at 5:56 PM, Anton Thaker <Anton.Thaker@xxxxxxxxxxx> wrote:
> I should have chimed in here earlier, but I think adding support for this would be potentially beneficial for lots of use cases (not just our current big-data workload).
> We've tested this type of setup with bcache and lvmcache (dm-cache), with both block and object workloads, and decided to settle on lvmcache due to better support and tooling and slightly better performance.  Ideally, support would be added for a generic raw block device so that ceph-disk and ceph-ansible do not try to create partitions and just use the entire device.  This can then be used with LVM, bcache, Intel CAS, <insert-your-favorite-caching-tech-here>...

You are right in that ceph-disk will insist on partitioning (and
labeling) things, so it will not work with LVM or anything similar to
a logical volume. Our current idea is to go ahead and support devices
as-is, but this is a bit more complicated (as you may be aware)
because systemd is tied to ceph-disk and they all rely on udev.

Would it be possible to know how you handled the mounting of these
volumes? Was it by editing fstab directly, or some other way?
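
For example, was it a plain /etc/fstab entry along these lines (VG/LV
name and mount point purely illustrative), or did you generate systemd
mount units instead?

    # /etc/fstab -- illustrative names only
    /dev/vg_ceph/lv_osd0  /var/lib/ceph/osd/ceph-0  xfs  defaults,noatime  0 0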

>
> The way we would use it with lvmcache would be to run our own Ansible role beforehand to prepare the physical disks with our lvmcache PVs, VG, LVs, and hidden cached LVs, and then have ceph-ansible run something like this:
>
> ceph-disk prepare <vg/lv-data> <vg/lv-journal>
>
> where "vg/lv-data" is a cached device that has some small NVMe cache storage backed by a large spinning disk, and "vg/lv-journal" where the entire LV is on the NVMe device.

This is the path I want to go down, and I am trying to gather
information before we go ahead with implementing it. My main concern
is how to deal with mounting while keeping systemd support. Knowing a
bit more about your setup will help validate my ideas/concerns.
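
To make sure I understand the target workflow, here is a rough sketch
of what I imagine the preparation would look like (devices, VG/LV
names and sizes are made up), ending with the ceph-disk call you
describe, which is exactly the part we would need to start supporting:

    # create PVs and a single VG spanning the NVMe device and the spinning disk
    pvcreate /dev/nvme0n1 /dev/sdb
    vgcreate vg_ceph /dev/nvme0n1 /dev/sdb

    # data LV on the spinning disk, cache pool on the NVMe, then attach the cache
    lvcreate -n lv_data -l 100%PVS vg_ceph /dev/sdb
    lvcreate --type cache-pool -n lv_cache -L 100G vg_ceph /dev/nvme0n1
    lvconvert --type cache --cachepool vg_ceph/lv_cache vg_ceph/lv_data

    # journal LV living entirely on the NVMe
    lvcreate -n lv_journal -L 10G vg_ceph /dev/nvme0n1

    # the hypothetical ceph-disk invocation from your message
    # (not supported by ceph-disk today)
    ceph-disk prepare /dev/vg_ceph/lv_data /dev/vg_ceph/lv_journal

Please correct me if your role does something materially different.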

>
> We currently do this with the "osd directory" ceph-ansible scenario, where our LVM Ansible role slices up the disks, creates a bunch of logical volumes, and formats and mounts XFS (prior to running ceph-ansible).  The downside, of course, is that we're forced to use file-based journals.  Even with the overhead of file-based journals, the performance improvement vs. just plain NVMe journals is significant for our workloads.  The caching layer is smart enough to promote the journals into the faster storage tier, and deep scrubs do not get promoted because their large, sequential IO requests are automatically detected.
>
> I don't know if there might be potential support issues with making this generic, so at the very least it would be great if support for just LVM was added.  I can share my Ansible role for building out the lvmcache devices if anyone is interested or if it helps to understand our setup.

That would be incredibly helpful!
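
In the meantime, and purely as a sketch of my current understanding of
the workaround you describe (again with made-up names), the per-OSD
preparation before running the osd_directory scenario would be roughly:

    # assuming a cached LV prepared as in the sketch above
    mkfs.xfs /dev/vg_ceph/lv_data
    mkdir -p /var/lib/ceph/osd/ceph-0
    mount /dev/vg_ceph/lv_data /var/lib/ceph/osd/ceph-0
    # ceph-ansible is then pointed at the mounted directory, and the
    # journal ends up as a file inside it, hence the file-based journals

If that matches what your role does, sharing it will still be very
useful to see how you handle mount persistence across reboots.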

>
> Thanks for the interest in our use case!
> Anton Thaker
> Walmart ✻
>
>
>
> From: Ceph-ansible <ceph-ansible-bounces@xxxxxxxxxxxxxx> on behalf of Sebastien Han <shan@xxxxxxxxxx>
> Sent: Friday, May 12, 2017 10:32 AM
> To: Warren Wang - ISD
> Cc: ceph-ansible@xxxxxxxxxxxxxx; ceph-devel; ceph-users
> Subject: EXT: Re: [Ceph-ansible] EXT: Re: EXT: Re: osd-directory scenario is used by us
>
> So if we were to support LVM device as an OSD that would be enough for
> you? (support in ceph-disk).
>
> On Mon, May 8, 2017 at 5:57 AM, Warren Wang - ISD
> <Warren.Wang@xxxxxxxxxxx> wrote:
>> You might find additional responses in ceph-users. Added.
>>
>> A little extra background here. If Ceph directly supported LVM devices as OSDs, we probably wouldn’t have to do what we’re doing now. We don’t know of a way to use an LVM cache device as an OSD without this type of config. This is primarily to support big data workloads that use object storage as the only backing storage, so the type of IO we see is highly irregular compared to most object storage workloads. Shameless plug: my big data colleagues will be presenting on this topic next week at the OpenStack Summit.
>>
>>  https://www.openstack.org/summit/boston-2017/summit-schedule/events/18432/introducing-swifta-a-performant-hadoop-file-system-driver-for-openstack-swift
>>
>> Sebastien, even with BlueStore, we’re expecting to use LVM cached devices for the bulk of object storage, with a dedicated NVMe/SSD partition for RocksDB. I don’t know if that matters at all with regard to the OSD directory discussion. We really haven’t done anything other than a basic BlueStore test on systems where we had not set up LVM cache devices.
>>
>> Warren Wang
>> Walmart ✻
>>
>> On 5/3/17, 6:17 PM, "Ceph-ansible on behalf of Gregory Meno" <ceph-ansible-bounces@xxxxxxxxxxxxxx on behalf of gmeno@xxxxxxxxxx> wrote:
>>
>>     Haven't seen any comments in a week. I'm going to cross-post this to ceph-devel
>>
>>     Dear ceph-devel, in an effort to simplify ceph-ansible I removed the
>>     code that sets up directory-backed OSDs. We found out that it was
>>     being used in the following way.
>>
>>     I would like to hear thoughts about this approach pro and con.
>>
>>     cheers,
>>     G
>>
>>     On Tue, Apr 25, 2017 at 2:12 PM, Michael Gugino
>>     <Michael.Gugino@xxxxxxxxxxx> wrote:
>>     > All,
>>     >
>>     >   Thank you for the responses and consideration.  What we are doing is
>>     > creating lvm volumes, mounting them, and using the mounts as directories
>>     > for ceph-ansible.  Our primary concern is the use of lvmcache.  We’re
>>     > using faster drives for the cache and slower drives for the backing
>>     > volumes.
>>     >
>>     >   We try to keep as few local patches as practical, and our initial
>>     > rollout of lvmcache + ceph-ansible steered us towards the osd_directory
>>     > scenario.  Currently, ceph-ansible does not allow us to use LVM in the
>>     > way that we desire, but we are looking into submitting a PR to go in
>>     > that direction (at some point).
>>     >
>>     >   As far as using the stable branches, I’m not entirely sure what our
>>     > strategy going forward will be.  Currently we are maintaining ceph-ansible
>>     > branches based on ceph releases, not ceph-ansible releases.
>>     >
>>     >
>>     > Michael Gugino
>>     > Cloud Powered
>>     > (540) 846-0304 Mobile
>>     >
>>     > Walmart ✻
>>     > Saving people money so they can live better.
>>     >
>>     >
>>     >
>>     >
>>     >
>>     > On 4/25/17, 4:51 PM, "Sebastien Han" <shan@xxxxxxxxxx> wrote:
>>     >
>>     >>One other argument to remove the osd directory scenario is BlueStore.
>>     >>Luminous is around the corner and we strongly hope it'll be the
>>     >>default object store.
>>     >>
>>     >>On Tue, Apr 25, 2017 at 7:40 PM, Gregory Meno <gmeno@xxxxxxxxxx> wrote:
>>     >>> Michael,
>>     >>>
>>     >>> I am naturally interested in the specifics of your use-case and would
>>     >>> love to hear more about it.
>>     >>> I think the desire to remove this scenario from the stable-2.2 release
>>     >>> is low considering what you just shared.
>>     >>> Would it be fair to ask that sharing your setup be the justification
>>     >>> for restoring this functionality?
>>     >>> Are you using the stable released bits already? I recommend doing so.
>>     >>>
>>     >>> +Seb +Alfredo
>>     >>>
>>     >>> cheers,
>>     >>> Gregory
>>     >>>
>>     >>> On Tue, Apr 25, 2017 at 10:08 AM, Michael Gugino
>>     >>> <Michael.Gugino@xxxxxxxxxxx> wrote:
>>     >>>> Ceph-ansible community,
>>     >>>>
>>     >>>>   I see that the osd-directory scenario was recently removed from
>>     >>>> the deployment options.  We use this option in production, and I
>>     >>>> will be submitting a patch and a small fix to re-add that scenario.
>>     >>>> We believe our use-case is
>>     >>>> non-trivial, and we are hoping to share our setup with the community in
>>     >>>> the near future once we get approval.
>>     >>>>
>>     >>>> Thank you
>>     >>>>
>>     >>>>
>>     >>>> Michael Gugino
>>     >>>> Cloud Powered
>>     >>>> (540) 846-0304 Mobile
>>     >>>>
>>     >>>> Walmart ✻
>>     >>>> Saving people money so they can live better.
>>     >>>>
>>     >>>>
>>     >>>>
>>     >>>>
>>     >>>>
>>     >>>> On 4/18/17, 3:41 PM, "Ceph-ansible on behalf of Sebastien Han"
>>     >>>> <ceph-ansible-bounces@xxxxxxxxxxxxxx on behalf of shan@xxxxxxxxxx>
>>     >>>>wrote:
>>     >>>>
>>     >>>>>Hi everyone,
>>     >>>>>
>>     >>>>>We are close to releasing the new ceph-ansible stable release.
>>     >>>>>We are currently in a heavy QA phase where we are pushing new tags in
>>     >>>>>the format of v2.2.x.
>>     >>>>>The latest tag already points to stable-2.2 branch.
>>     >>>>>
>>     >>>>>Stay tuned, stable-2.2 is just around the corner.
>>     >>>>>Thanks!
>>     >>>>>
>>     >>>>>--
>>     >>>>>Cheers
>>     >>>>>
>>     >>>>>––––––
>>     >>>>>Sébastien Han
>>     >>>>>Principal Software Engineer, Storage Architect
>>     >>>>>
>>     >>>>>"Always give 100%. Unless you're giving blood."
>>     >>>>>
>>     >>>>>Mail: seb@xxxxxxxxxx
>>     >>>>>Address: 11 bis, rue Roquépine - 75008 Paris
>>     >>>>>_______________________________________________
>>     >>>>>Ceph-ansible mailing list
>>     >>>>>Ceph-ansible@xxxxxxxxxxxxxx
>>     >>>>>http://lists.ceph.com/listinfo.cgi/ceph-ansible-ceph.com
>>     >>>>
>>     >>>> _______________________________________________
>>     >>>> Ceph-ansible mailing list
>>     >>>> Ceph-ansible@xxxxxxxxxxxxxx
>>     >>>> http://lists.ceph.com/listinfo.cgi/ceph-ansible-ceph.com
>>     >>
>>     >>
>>     >>
>>     >>--
>>     >>Cheers
>>     >>
>>     >>––––––
>>     >>Sébastien Han
>>     >>Principal Software Engineer, Storage Architect
>>     >>
>>     >>"Always give 100%. Unless you're giving blood."
>>     >>
>>     >>Mail: seb@xxxxxxxxxx
>>     >>Address: 11 bis, rue Roquépine - 75008 Paris
>>     >
>>     _______________________________________________
>>     Ceph-ansible mailing list
>>     Ceph-ansible@xxxxxxxxxxxxxx
>>     http://lists.ceph.com/listinfo.cgi/ceph-ansible-ceph.com
>
>
>
>>
>>
>
>
>
> --
> Cheers
>
> ––––––
> Sébastien Han
> Principal Software Engineer, Storage Architect
>
> "Always give 100%. Unless you're giving blood."
>
> Mail: seb@xxxxxxxxxx
> Address: 11 bis, rue Roquépine - 75008 Paris
> _______________________________________________
> Ceph-ansible mailing list
> Ceph-ansible@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-ansible-ceph.com
>
>
>
>
> _______________________________________________
> Ceph-ansible mailing list
> Ceph-ansible@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-ansible-ceph.com
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


