Re: Ceph RDMA Update

@Haomai: Please check my reply.

On 01:52 Wed 27 Nov, <haomai@xxxxxxxx> wrote:
> Liu, Changcheng <changcheng.liu@xxxxxxxxx> 
> >
> > On 02:53 Tue 19 Nov, haomai@xxxxxxxx wrote:
> > > Liu, Changcheng <changcheng.liu@xxxxxxxxx>
> > [Changcheng]:
> >    1. Do we have plan to use RDMA-CM connection management by default for RDMA in Ceph?
> >       Currently, RDMA-CM connection has been integrated into Ceph code.
> >       However, it will only work when setting 'ms_async_rdma_cm=true' while the default value of ms_async_rdma_cm is false.
> >       It's really not good that we maintain two connection management methods for RDMA in Ceph.
> >
> >       What about changing the default connection management to RDMA-CM?
> If we have good test coverage of rdma-cm, it should be ok.
[Changcheng]:
 Once rdma-cm is used for connection management, it can support
 RoCEv1, RoCEv2, and iWARP, which would unify the Ceph RDMA configuration.
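For reference, enabling RDMA-CM today requires something like the fragment below in ceph.conf (a sketch using the option names as they exist in the current code; exact values are illustrative):

```ini
[global]
# use the async messenger with the RDMA backend
ms_type = async+rdma
# RDMA-CM connection management is off by default today
ms_async_rdma_cm = true
# without RDMA-CM, the device has to be named explicitly per node,
# e.g. ms_async_rdma_device_name = mlx5_0; with RDMA-CM the device
# is resolved from the address instead
```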

> > >
> > > >  1) Support multiple devices
> > > >  [Changcheng]:
> > > >     Do you mean separating the public & cluster networks and using RDMA on both?
> > > >     Currently, Ceph could work under RDMA with below solution:
> > > >       a. Make no distinction between the public & cluster networks; both use the same RDMA device port for the RDMA messenger.
> > > >       OR
> > > >       b. The public network runs on TCP (posix) while the cluster network runs on RDMA.
> > > >  2) Enable unified ceph.conf for all ceph nodes
> > > >  [Changcheng]:
> > > >     Do you mean that on some nodes, Ceph needs to be configured with a different RDMA device port?
> > >
> > > hmm, yes
[Changcheng]:
  To avoid configuring a different RDMA device port on each node, it's
  better to look up the RDMA device from the RNIC IP address.
  What do you think?

> > [Changcheng]:
> >    2. If there's a plan to let both the public & cluster networks run RDMA on separate networks, we must use RDMA-CM for connection management, right?
> not exactly, but with rdma-cm it will be easier for the code to support it
[Changcheng]:
 Yes, rdma-cm makes it easier for the code to support it.

> > > It's a long story.....
> > [Changcheng]:
> >    3. Is this related to RDMA? Has it been implemented in Ceph?
> I think we should refer to crimson-ceph to support this
[Changcheng]:
 Thanks for your info.

> 
> > > it means registering the data buffer read from the storage device
> > [Changcheng]:
> >    4. Do you mean: 1) create the RDMA Memory Region (MR) first, 2) use the MR in the bufferlist, 3) post the bufferlist as a work request on the RDMA send queue so it is sent directly without tx_copy_chunk?
> yeap
[Changcheng]:
 This seems impossible: I don't know whether the bufferlist is used only
 for message transmission, and if we work in this direction, there could
 be lots of changes.

> > > > II. ToDo:
> > > >    1. Use RDMA Read/Write for better memory utilization
> > > >    [Changcheng]:
> > > >       Any plan to implement RDMA Read/Write? How do we solve the compatibility problem, since the previous implementation is based on RC-Send/RC-Recv?
> > >
> > > Maybe it's not a good idea now
> > [Changcheng]:
> >    5. Is there any background that we don't use Read/Write semantics in Ceph RDMA implementation?
> from the vendor's info, Read/Write is not favored.
[Changcheng]:
 OK. I don't have performance data on the difference between
 Read/Write & Send/Recv. Let's discuss this later.
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx


