Re: RDMA Bug?

>   2) I'll confirm with my colleague whether the cluster network is really used in 14.2.4. We also hit a similar problem recently, even when using the TCP async messenger.
[Changcheng]:
1) The problem should already be solved in 14.2.4; we hit it in 14.2.1.
2) I'll try to verify your problem when I have time (I'm working on other
things). There should be no problem when both the public and cluster
networks are unified on the RDMA device.
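
A quick way to sanity-check that traffic really flows over the RDMA device is to watch the port counters while the cluster handles I/O (a sketch, assuming the mlx5_0 device and osd.0 from the configuration quoted below; sysfs paths can differ per setup):

        # InfiniBand/RoCE port data counters; both should keep growing during client I/O
        cat /sys/class/infiniband/mlx5_0/ports/1/counters/port_xmit_data
        cat /sys/class/infiniband/mlx5_0/ports/1/counters/port_rcv_data
        # Messenger-level RDMA counters, as in the perf dump quoted below
        sudo ceph daemon osd.0 perf dump AsyncMessenger::RDMAWorker-1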


On 23:22 Wed 30 Oct, Liu, Changcheng wrote:
> I'm working on the master branch and have deployed a two-node cluster. Data is transferred over RDMA.
>       [admin@server0 ~]$ sudo ceph daemon osd.0 perf dump AsyncMessenger::RDMAWorker-1
>       {
>           "AsyncMessenger::RDMAWorker-1": {
>               "tx_no_mem": 0,
>               "tx_parital_mem": 0,
>               "tx_failed_post": 0,
>               "tx_chunks": 26966,
>               "tx_bytes": 52789637,
>               "rx_chunks": 26916,
>               "rx_bytes": 52812278,
>               "pending_sent_conns": 0
>           }
>       }
> 
> The only difference is that I don't separate the public and cluster networks in my cluster.
> You could try making both the public and cluster networks use RDMA.
> Note:
>   1) If both the public and cluster networks use RDMA, we can't place them on different subnets. This is a feature limitation; I'm planning to address it in the future.
>   2) I'll confirm with my colleague whether the cluster network is really used in 14.2.4. We also hit a similar problem recently, even when using the TCP async messenger.
> 
> Below is my cluster's ceph configuration.
> I also attach the systemd patch used on my side.
>       [admin@server0 ~]$ cat /etc/ceph/ceph.conf 
>       [global]
>           cluster = ceph
>           fsid = 24280750-d4f7-4d4f-89e4-f95b8fab87ff
>           auth_cluster_required = cephx
>           auth_service_required = cephx
>           auth_client_required = cephx
>       
>           osd pool default size = 2
>           osd pool default min size = 2
>           osd pool default pg num = 64
>           osd pool default pgp num = 128
>       
>           osd pool default crush rule = 0
>           osd crush chooseleaf type = 1
>       
>           mon_allow_pool_delete=true
>           osd_pool_default_pg_autoscale_mode=off
>       
>           ms_type = async+rdma
>           ms_async_rdma_device_name = mlx5_0
>       
>           mon_initial_members = server0
>           mon_host = 172.16.1.4
>       
>       [mon.rdmarhel0]
>           host = server0
>           mon addr = 172.16.1.4
>       [admin@server0 ~]$
> 
> B.R.
> Changcheng
> 
> On 13:07 Wed 30 Oct, Mason-Williams, Gabryel (DLSLtd,RAL,LSCI) wrote:
> >     1. The current problem is that it is still sending data over Ethernet
> >        instead of InfiniBand.
> >     2. [global]
> >        fsid=xxxx
> >        mon_initial_members = node1, node2, node3
> >        mon_host = xxx.xx.xxx.ab,xxx.xx.xxx.ac, xxx.xx.xxx.ad
> >        auth_cluster_required = cephx
> >        auth_service_required = cephx
> >        auth_client_required = cephx
> >        public_network = xxx.xx.xxx.0/24
> >        cluster_network = xx.xxx.0.0/16
> >        ms_cluster_type = async+rdma
> >        ms_type = async+rdma
> >        ms_public_type = async+posix
> >        [mgr]
> >        ms_type = async+posix
> >     3. The Ceph cluster is deployed using ceph-deploy. Once it is up, all
> >        of the daemons are turned off, the RDMA cluster config is
> >        distributed, and once that is complete the daemons are turned back
> >        on. The ulimit is set to unlimited, and LimitMEMLOCK=infinity is
> >        set on ceph-disk@.service, ceph-mds@.service, ceph-mon@.service,
> >        ceph-osd@.service and ceph-radosgw@.service, as well as
> >        PrivateDevices=no on ceph-mds@.service, ceph-mon@.service and
> >        ceph-radosgw@.service. The Ethernet MTU is set to 1000.
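
As a reference for item 3 above, a minimal sketch of applying the same overrides through a systemd drop-in instead of editing the shipped unit files (the drop-in path and service name are illustrative; the same idea applies to the mon/mds/radosgw units):

        # /etc/systemd/system/ceph-osd@.service.d/rdma.conf
        [Service]
        LimitMEMLOCK=infinity
        PrivateDevices=no

A systemctl daemon-reload plus a restart of the affected daemons is needed afterwards for the drop-in to take effect.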
> >      __________________________________________________________________
> > 
> >    From: Liu, Changcheng <changcheng.liu@xxxxxxxxx>
> >    Sent: 30 October 2019 12:24
> >    To: Mason-Williams, Gabryel (DLSLtd,RAL,LSCI)
> >    <gabryel.mason-williams@xxxxxxxxxxxxx>
> >    Cc: dev@xxxxxxx <dev@xxxxxxx>
> >    Subject: Re: RDMA Bug?
> > 
> >    1. What problem do you hit when using RDMA in 14.2.4? Do any logs show
> >    the error?
> >    2. What's your ceph.conf?
> >    3. How do you deploy the ceph cluster? RDMA needs to lock some memory,
> >    so some system configuration has to be changed to meet this
> >    requirement.
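
For instance, a minimal sketch of checking the locked-memory limit on a node (the process lookup is illustrative):

        # locked-memory limit of the current shell
        ulimit -l
        # locked-memory limit of a running OSD (newest matching PID)
        grep "Max locked memory" /proc/$(pgrep -n ceph-osd)/limits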
> >    On 11:21 Wed 30 Oct, Gabryel Mason-Williams wrote:
> >    > Liu, Changcheng wrote:
> >    > > On 07:31 Mon 28 Oct, Mason-Williams, Gabryel (DLSLtd,RAL,LSCI) wrote:
> >    > > >     I am using ceph version 12.2.8
> >    > > >     (ae699615bac534ea496ee965ac6192cb7e0e07c0) luminous (stable).
> >    > > >
> >    > > >     I have not checked the master branch. Do you think this is an
> >    > > >     issue in luminous that has been removed in later versions?
> >    > >
> >    > > I haven't hit the problem on the master branch. Ceph/RDMA changed a
> >    > > lot from luminous to the master branch.
> >    > >
> >    > > Is the below configuration really needed in luminous/ceph.conf?
> >    > > >     ms_async_rdma_local_gid = xxxx
> >    > >
> >    > > On the master branch, this parameter is not needed at all.
> >    > > B.R.
> >    > > Changcheng
> >    > > >
> >    __________________________________________________________________
> >    >
> >    > Thanks, the issue of the OSDs falling over seems to have gone away
> >    > after updating to Nautilus 14.2.4. However, I am still unable to get
> >    > it to communicate properly over RDMA even after removing
> >    > ms_async_rdma_local_gid.
> > 
> > 
> 
> 

> From 40fa0d7096364b410e8242c46967029fb949876a Mon Sep 17 00:00:00 2001
> From: Changcheng Liu <changcheng.liu@xxxxxxxxxx>
> Date: Tue, 23 Jul 2019 18:50:57 +0800
> Subject: [PATCH] rdma systemd: grant access to /dev and unlimit mem
> 
> Signed-off-by: Changcheng Liu <changcheng.liu@xxxxxxxxxx>
> 
> diff --git a/systemd/ceph-fuse@.service.in b/systemd/ceph-fuse@.service.in
> index d603042b12..ff2e9072f6 100644
> --- a/systemd/ceph-fuse@.service.in
> +++ b/systemd/ceph-fuse@.service.in
> @@ -12,6 +12,7 @@ ExecStart=/usr/bin/ceph-fuse -f --cluster ${CLUSTER} %I
>  LockPersonality=true
>  MemoryDenyWriteExecute=true
>  NoNewPrivileges=true
> +LimitMEMLOCK=infinity
>  # ceph-fuse requires access to /dev fuse device
>  PrivateDevices=no
>  ProtectControlGroups=true
> diff --git a/systemd/ceph-mds@.service.in b/systemd/ceph-mds@.service.in
> index 39a2e63105..0e58dfeeea 100644
> --- a/systemd/ceph-mds@.service.in
> +++ b/systemd/ceph-mds@.service.in
> @@ -14,7 +14,8 @@ ExecReload=/bin/kill -HUP $MAINPID
>  LockPersonality=true
>  MemoryDenyWriteExecute=true
>  NoNewPrivileges=true
> -PrivateDevices=yes
> +LimitMEMLOCK=infinity
> +PrivateDevices=no
>  ProtectControlGroups=true
>  ProtectHome=true
>  ProtectKernelModules=true
> diff --git a/systemd/ceph-mgr@.service.in b/systemd/ceph-mgr@.service.in
> index c98f6378b9..682c7ecef3 100644
> --- a/systemd/ceph-mgr@.service.in
> +++ b/systemd/ceph-mgr@.service.in
> @@ -18,7 +18,8 @@ LockPersonality=true
>  MemoryDenyWriteExecute=false
>  
>  NoNewPrivileges=true
> -PrivateDevices=yes
> +LimitMEMLOCK=infinity
> +PrivateDevices=no
>  ProtectControlGroups=true
>  ProtectHome=true
>  ProtectKernelModules=true
> diff --git a/systemd/ceph-mon@.service.in b/systemd/ceph-mon@.service.in
> index c95fcabb26..51854fad96 100644
> --- a/systemd/ceph-mon@.service.in
> +++ b/systemd/ceph-mon@.service.in
> @@ -21,7 +21,8 @@ LockPersonality=true
>  MemoryDenyWriteExecute=true
>  # Need NewPrivileges via `sudo smartctl`
>  NoNewPrivileges=false
> -PrivateDevices=yes
> +LimitMEMLOCK=infinity
> +PrivateDevices=no
>  ProtectControlGroups=true
>  ProtectHome=true
>  ProtectKernelModules=true
> diff --git a/systemd/ceph-osd@.service.in b/systemd/ceph-osd@.service.in
> index 1b5c9c82b8..06c20d7c83 100644
> --- a/systemd/ceph-osd@.service.in
> +++ b/systemd/ceph-osd@.service.in
> @@ -16,6 +16,8 @@ LockPersonality=true
>  MemoryDenyWriteExecute=true
>  # Need NewPrivileges via `sudo smartctl`
>  NoNewPrivileges=false
> +LimitMEMLOCK=infinity
> +PrivateDevices=no
>  ProtectControlGroups=true
>  ProtectHome=true
>  ProtectKernelModules=true
> diff --git a/systemd/ceph-radosgw@.service.in b/systemd/ceph-radosgw@.service.in
> index 7e3ddf6c04..fe1a6b9159 100644
> --- a/systemd/ceph-radosgw@.service.in
> +++ b/systemd/ceph-radosgw@.service.in
> @@ -13,7 +13,8 @@ ExecStart=/usr/bin/radosgw -f --cluster ${CLUSTER} --name client.%i --setuser ce
>  LockPersonality=true
>  MemoryDenyWriteExecute=true
>  NoNewPrivileges=true
> -PrivateDevices=yes
> +LimitMEMLOCK=infinity
> +PrivateDevices=no
>  ProtectControlGroups=true
>  ProtectHome=true
>  ProtectKernelModules=true
> diff --git a/systemd/ceph-volume@.service b/systemd/ceph-volume@.service
> index c21002cecb..e2d1f67b85 100644
> --- a/systemd/ceph-volume@.service
> +++ b/systemd/ceph-volume@.service
> @@ -9,6 +9,7 @@ KillMode=none
>  Environment=CEPH_VOLUME_TIMEOUT=10000
>  ExecStart=/bin/sh -c 'timeout $CEPH_VOLUME_TIMEOUT /usr/sbin/ceph-volume-systemd %i'
>  TimeoutSec=0
> +LimitMEMLOCK=infinity
>  
>  [Install]
>  WantedBy=multi-user.target
> -- 
> 2.17.1
> 
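If the patched unit files above are installed directly, something like the following is typically needed on each node for the new limits to take effect (a sketch; the instance names are illustrative):

        sudo systemctl daemon-reload
        sudo systemctl restart ceph-mon@server0.service ceph-osd@0.service
        # confirm the override is visible to systemd
        systemctl show ceph-osd@0.service -p LimitMEMLOCK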

_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx



