Re: Setup Ceph over RDMA

Hi guys,

The documentation only mentions ms_type = async+rdma, but there are other RDMA options it does not cover. I got them from the OSD config with:
ceph config show-with-defaults osd.0 | grep rdma

ms_async_rdma_buffer_size 131072
ms_async_rdma_cm false
ms_async_rdma_device_name
ms_async_rdma_dscp 96
ms_async_rdma_enable_hugepage false
ms_async_rdma_gid_idx 0
ms_async_rdma_local_gid
ms_async_rdma_polling_us 1000
ms_async_rdma_port_num 1
ms_async_rdma_receive_buffers 32768
ms_async_rdma_receive_queue_len 4096
ms_async_rdma_roce_ver 1
ms_async_rdma_send_buffers 1024
ms_async_rdma_sl 3
ms_async_rdma_support_srq true
ms_async_rdma_type ib

When I checked the Ceph GitHub repository, I found these options marked with_legacy: true:
https://github.com/ceph/ceph/blob/main/src/common/options/global.yaml.in

- name: ms_async_rdma_device_name
  type: str
  level: advanced
  with_legacy: true
- name: ms_async_rdma_enable_hugepage
  type: bool
  level: advanced
  default: false
  with_legacy: true
- name: ms_async_rdma_buffer_size
  type: size
  level: advanced
  default: 128_K
  with_legacy: true
- name: ms_async_rdma_send_buffers
  type: uint
  level: advanced
  default: 1_K
  with_legacy: true
# size of the receive buffer pool, 0 is unlimited
- name: ms_async_rdma_receive_buffers
  type: uint
  level: advanced
  default: 32_K
  with_legacy: true
# max number of wr in srq
- name: ms_async_rdma_receive_queue_len
  type: uint
  level: advanced
  default: 4_K
  with_legacy: true
# support srq
- name: ms_async_rdma_support_srq
  type: bool
  level: advanced
  default: true
  with_legacy: true
- name: ms_async_rdma_port_num
  type: uint
  level: advanced
  default: 1
  with_legacy: true
- name: ms_async_rdma_polling_us
  type: uint
  level: advanced
  default: 1000
  with_legacy: true
- name: ms_async_rdma_gid_idx
  type: int
  level: advanced
  desc: use gid_idx to select GID for choosing RoCEv1 or RoCEv2
  default: 0
  with_legacy: true
# GID format: "fe80:0000:0000:0000:7efe:90ff:fe72:6efe", no zero folding
- name: ms_async_rdma_local_gid
  type: str
  level: advanced
  with_legacy: true
# 0=RoCEv1, 1=RoCEv2, 2=RoCEv1.5
- name: ms_async_rdma_roce_ver
  type: int
  level: advanced
  default: 1
  with_legacy: true
# in RoCE, this means PCP
- name: ms_async_rdma_sl
  type: int
  level: advanced
  default: 3
  with_legacy: true
# in RoCE, this means DSCP
- name: ms_async_rdma_dscp
  type: int
  level: advanced
  default: 96
  with_legacy: true
# when there are enough accept failures, indicating there are unrecoverable
# failures, just do ceph_abort(). Here we make it configurable.
- name: ms_max_accept_failures
  type: int
  level: advanced
  desc: The maximum number of consecutive failed accept() calls before considering
    the daemon is misconfigured and abort it.
  default: 4
  with_legacy: true
# rdma connection management
- name: ms_async_rdma_cm
  type: bool
  level: advanced
  default: false
  with_legacy: true
- name: ms_async_rdma_type
  type: str
  level: advanced
  default: ib
  with_legacy: true
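As a side note, option snippets in this name/type/default shape can be checked mechanically against what the cluster reports. Below is a minimal Python sketch (standard library only, with a small embedded sample in the same shape as the dump above); the helper name rdma_defaults is my own, not a Ceph API:

```python
# Minimal sketch: extract name/default pairs for the RDMA options from a
# snippet shaped like Ceph's src/common/options/global.yaml.in.
# Plain string parsing on purpose, so no third-party YAML library is needed.

SNIPPET = """\
- name: ms_async_rdma_buffer_size
  type: size
  level: advanced
  default: 128_K
  with_legacy: true
- name: ms_async_rdma_send_buffers
  type: uint
  level: advanced
  default: 1_K
  with_legacy: true
- name: ms_async_rdma_device_name
  type: str
  level: advanced
  with_legacy: true
"""

def rdma_defaults(text):
    """Return {option_name: default_or_None} for ms_async_rdma_* entries."""
    options = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("- name:"):
            current = stripped.split(":", 1)[1].strip()
            if current.startswith("ms_async_rdma_"):
                options[current] = None  # no default seen yet
            else:
                current = None  # ignore non-RDMA options
        elif current and stripped.startswith("default:"):
            options[current] = stripped.split(":", 1)[1].strip()
    return options

print(rdma_defaults(SNIPPET))
```

Running it prints the name/default map, which makes it easy to diff the defaults in the source tree against what ceph config show-with-defaults reports.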

This causes confusion; the RDMA setup needs more detail in the documentation.
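For anyone following along, here is a hypothetical ceph.conf fragment for RoCE, pieced together purely from the option names and comments above. The device name and GID index are placeholders that you would need to verify against your own NICs; this is a sketch, not a tested configuration:

```ini
[global]
# Sketch only -- option names from the dump above; values are illustrative.
ms_type = async+rdma
ms_async_rdma_device_name = mlx5_0   # placeholder; use your NIC's RDMA device name
ms_async_rdma_roce_ver = 1           # RoCEv2, per the 0=RoCEv1, 1=RoCEv2 mapping above
ms_async_rdma_gid_idx = 3            # placeholder; pick the GID index that maps to RoCEv2
ms_async_rdma_dscp = 96              # default shown above
```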

Regards

On Mon, Apr 8, 2024 at 10:06 AM Vahideh Alinouri
<vahideh.alinouri@xxxxxxxxx> wrote:
>
> Hi guys,
>
> I need to set up Ceph over RDMA, but I have run into many issues.
> Some info about my cluster:
> Ceph version: Reef
> Network cards: Broadcom RDMA NICs
> The RDMA connection between the OSD nodes is OK.
>
> I only found the ms_type = async+rdma setting in the documentation, and applied it with:
> ceph config set global ms_type async+rdma
> After this the cluster crashed. To bring the cluster back, I:
> put ms_type = async+posix in ceph.conf
> restarted all MON services
>
> The cluster is back, but I don't have any active mgr, and all OSDs are down too.
> Is there a recommended order of steps for setting up Ceph over RDMA?
> Thanks
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



