Re: Separate metadata pool in 3x MDS node


 



Hello,

Does each rack work on a different directory tree, or is everything parallelized?
Would the metadata pool be distributed over racks 1, 2, 4 and 5?
If it is distributed, then even if the MDS a client talks to sits on the same
switch as the client, that MDS will still have to read/write the (NVMe)
metadata OSDs in the other racks (among 1, 2, 4, 5).
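
For example, to confirm where the metadata replicas actually land, something
like this could be checked (the pool name cephfs_metadata is just an
assumption for illustration):

  # Which CRUSH rule the metadata pool uses, and how it spreads replicas
  ceph osd pool get cephfs_metadata crush_rule
  ceph osd crush rule dump <rule-name>

  # PGs of the metadata pool with their acting OSDs, to map them to racks
  ceph pg ls-by-pool cephfs_metadata
  ceph osd tree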

In any case, the exercise is interesting.
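
To make the idea in the quoted message below more concrete, here is a rough
sketch of what a dedicated NVMe metadata pool with a rack failure domain plus
per-rack MDS subtree pinning could look like (pool, rule, filesystem and
directory names, PG counts and ranks are assumptions, not taken from the
original post):

  # CRUSH rule: replicate across racks, restricted to the nvme device class
  ceph osd crush rule create-replicated meta-nvme-rack default rack nvme

  # Dedicated metadata pool using that rule
  ceph osd pool create cephfs_metadata 128 128 replicated meta-nvme-rack

  # One active MDS per client rack
  ceph fs set cephfs max_mds 4

  # Pin each rack's working tree to "its" MDS rank (0..3)
  setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/rack1
  setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/rack2

Note that with a rack failure domain the metadata replicas still live on OSDs
in several racks, so even an in-rack MDS will generate cross-rack traffic for
its own metadata I/O.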



On Sat, Feb 24, 2024 at 19:56, Özkan Göksu <ozkangksu@xxxxxxxxx> wrote:

> Hello folks!
>
> I'm designing a new Ceph cluster from scratch and I want to increase CephFS
> speed and decrease latency.
> Usually I build with WAL+DB on NVMe in front of SAS/SATA SSDs, and I deploy
> the MDS and MONs on the same servers.
> This time a weird idea came to my mind; with my limited knowledge I think it
> has great potential and should perform better, at least on paper.
>
> I have 5 racks and the 3rd "middle" rack is my storage and management rack.
>
> - At RACK-3 I'm going to place 8x 1U OSD servers (spec: 2x E5-2690V4, 256GB,
> 4x 25G, 2x 1.6TB PCIe NVMe "MZ-PLK3T20", 8x 4TB SATA SSD)
>
> - My CephFS kernel clients are 40x GPU nodes located in RACK-1,2,4,5
>
> With my current workflow, all the clients:
> 1- go through the rack data switch,
> 2- jump to the main VPC switch via 2x 100G,
> 3- talk to the MDS servers,
> 4- get the answer back over the same path,
> 5- and to access data they follow the same hops to the OSDs every time.
>
> If I deploy a separate metadata pool using 4x MDS servers at the top of
> RACK-1,2,4,5 (spec: 2x E5-2690V4, 128GB, 2x 10G (public), 2x 25G (cluster),
> 2x 960GB U.2 NVMe "MZ-PLK3T20"),
> then all the clients will send their requests directly to an in-rack MDS one
> hop away, and if the request is metadata-only, the MDS node doesn't need to
> forward anything to the OSD nodes.
> Also, spreading the MDS servers with a separate metadata pool across all the
> racks will reduce the high load on the main VPC switch in RACK-3.
>
> If I'm not missing anything, only the recovery workload will suffer with
> this topology.
>
> What do you think?
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



