Re: State of play for RDMA on Luminous

Sorry, I worded my questions poorly in the last email, so I'm asking
for clarification here:

On Mon, Aug 28, 2017 at 6:04 PM, Haomai Wang <haomai@xxxxxxxx> wrote:
> On Mon, Aug 28, 2017 at 7:54 AM, Florian Haas <florian@xxxxxxxxxxx> wrote:
>> On Mon, Aug 28, 2017 at 4:21 PM, Haomai Wang <haomai@xxxxxxxx> wrote:
>>> On Wed, Aug 23, 2017 at 1:26 AM, Florian Haas <florian@xxxxxxxxxxx> wrote:
>>>> Hello everyone,
>>>>
>>>> I'm trying to get a handle on the current state of the async messenger's
>>>> RDMA transport in Luminous, and I've noticed that the information
>>>> available is a little bit sparse (I've found
>>>> https://community.mellanox.com/docs/DOC-2693 and
>>>> https://community.mellanox.com/docs/DOC-2721, which are a great start
>>>> but don't look very complete). So I'm kicking off this thread in the
>>>> hope of bringing interested parties and developers together.
>>>>
>>>> Could someone in the know please confirm that the following assumptions
>>>> of mine are accurate:
>>>>
>>>> - RDMA support for the async messenger is available in Luminous.
>>>
>>> To be precise, RDMA in Luminous is available but lacks memory
>>> control when under pressure. It should be OK to run for test purposes.
>>
>> OK, thanks! Assuming async+rdma will become fully supported some time
>> in the next release or two, are there plans to backport async+rdma
>> related features to Luminous? Or will users likely need to wait for
>> the next release to get a production-grade Ceph/RDMA stack?
>
> I think so

OK, so just to clarify:

(1) production RDMA support *will* be in the next LTS. Correct?

(2) Users should *not* expect production RDMA support in any Luminous
point release. Correct?

>>>> - You enable it globally by setting ms_type to "async+rdma", and by
>>>> setting appropriate values for the various ms_async_rdma* options (most
>>>> importantly, ms_async_rdma_device_name).
>>>>
>>>> - You can also set RDMA messaging just for the public or cluster
>>>> network, via ms_public_type and ms_cluster_type.
>>>>
>>>> - Users have to make a global async+rdma vs. async+posix decision on
>>>> either network. For example, if either ms_type or ms_public_type is
>>>> configured to async+rdma on cluster nodes, then a client configured with
>>>> ms_type = async+posix can't communicate.
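
Side note for anyone following along: my mental model of the above, in
ceph.conf terms, is roughly the following. The device name is just an
example and I haven't validated any of these values myself:

    [global]
    # Use the RDMA transport everywhere (public and cluster networks)
    ms_type = async+rdma
    # Example HCA name; substitute whatever ibv_devices reports on your nodes
    ms_async_rdma_device_name = mlx5_0

    # Alternatively, per the assumption above, keep the public network on
    # POSIX for plain TCP clients and run RDMA only on the cluster network:
    #ms_public_type = async+posix
    #ms_cluster_type = async+rdma

Please shout if I've got that wrong.
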
>>>>
>>>> Based on those assumptions, I have the following questions:
>>>>
>>>> - What is the current state of RDMA support in kernel libceph? In other
>>>> words, is there currently a way to map RBDs, or mount CephFS, if a Ceph
>>>> cluster uses RDMA messaging?
>>>
>>> No plans on the kernel side so far. rbd-nbd and cephfs-fuse should be supported now.
>>
>> Understood — are there plans to support async+rdma in the kernel at
>> all, or is there something in the kernel that precludes this?
>
> no.

Do you mean:

(1) There is a plan to support async+rdma in the kernel client eventually, or

(2) There are no plans to bring async+rdma support to the kernel
client, even though it *would* be possible to implement from a kernel
perspective, or

(3) There are no plans to bring async+rdma support to the kernel
client *because* something deep in the kernel prevents it?
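
(For anyone else reading along: given the earlier answer, my understanding
is that clients on an RDMA-configured cluster would use the userspace paths
for now, i.e. something along these lines, where the image name and mount
point are only placeholders and I haven't verified this end to end:

    # Map an RBD image via the userspace NBD driver; this goes through
    # librbd, so it should honor ms_type = async+rdma from the client's
    # ceph.conf
    rbd-nbd map rbd/myimage

    # Mount CephFS via FUSE instead of the kernel client
    ceph-fuse /mnt/cephfs

Corrections welcome if that's not how it's meant to be used.)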

Thanks again!

Cheers,
Florian
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



