Re: Re: layout help: need chassis local io to minimize net links

I've not been replying to the list, apologies.

> just the write metadata to the mon, with the actual write data
> content not having to cross a physical ethernet cable but instead
> going directly to the chassis-local osds via the 'virtual' internal
> switch?

This is my understanding as well, yes. I've not explored the ceph source
yet though.
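
For what it's worth, you can sanity-check where a given rbd object's
replicas land without reading the source.  A rough sketch, assuming a
per-host pool named "local-serverA" holding an image "vm1" (both names
are placeholders for whatever you actually created):

    # the image's object name prefix (rbd_data.<id>)
    rbd info local-serverA/vm1 | grep block_name_prefix

    # where one such object maps: prints the pg and the up/acting osd sets
    # (ceph osd map only computes placement, the object need not exist)
    ceph osd map local-serverA <block_name_prefix>.0000000000000000

    # confirm those osds all sit under the same host bucket
    ceph osd tree

If the acting set is all chassis-local osds, the write payload goes
from the client straight to the primary osd, which then replicates to
the other local osds; the mons only see map updates and other control
traffic.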

On 2020-06-29 8:37 p.m., Harry G. Coin wrote:
>
> Jeff, thanks for the lead.  When a user-space rbd write has as its
> destination three replica osds in the same chassis, does the whole
> write get shipped out to the mon and then back, or just the write
> metadata to the mon, with the actual write data content not having to
> cross a physical ethernet cable but instead going directly to the
> chassis-local osds via the 'virtual' internal switch?  I thought, when
> I read the layout of how ceph works, that only the control traffic
> goes to the mons and the data goes directly from the generator to the
> osds.  Did I get that wrong?
>
>
> On 6/29/20 10:32 PM, Jeff W wrote:
>> You mentioned setting up pools per host but still hitting network
>> limits; did you try tcpdumping the NIC to see who's talking to whom?
>> Perhaps something isn't configured the way you expect? That may help
>> you narrow down what is using the NIC as well, mon or osd or whatnot.
>> If it's local, I would think the NIC wouldn't be a bottleneck, and if
>> it is a bottleneck I would suspect my own configs, but that's just my
>> 2c.
>>
>> Off the top of my head I'm thinking it's the mon, because even if you
>> set up multiple pools I can't think of a way to have multiple groups
>> of mons maintaining their own shards of consensus. Unless your
>> workload is largely read-only, in which case I'm not sure what the
>> bottleneck would be.
>>
>>
>> On Mon., Jun. 29, 2020, 7:32 p.m. Harry G. Coin, <hgcoin@xxxxxxxxx> wrote:
>>
>>     I need exactly what ceph is for a whole lot of work; that work
>>     just doesn't represent a large fraction of the total local
>>     traffic.  Ceph is the right choice.  Plainly ceph has tremendous
>>     support for replication within a chassis, among chassis and among
>>     racks.  I just need intra-chassis traffic to not hit the net
>>     much.  That seems not such an unreasonable thing given the
>>     intra-chassis crush rules and all.  After all, ceph's name wasn't
>>     chosen for where it can't go....
>>
>>     On 6/29/20 1:57 PM, Marc Roos wrote:
>>     > I wonder if you should not have chosen a different product?
>>     > Ceph is meant to distribute data across nodes, racks, data
>>     > centers, etc.  For a nail use a hammer; for a screw use a
>>     > screwdriver.
>>     > 
>>     >
>>     > -----Original Message-----
>>     > To: ceph-users@xxxxxxx
>>     > Subject: *****SPAM*****  layout help: need chassis local io to
>>     > minimize net links
>>     >
>>     > Hi
>>     >
>>     > I have a few servers, each with 6 or more disks, with a storage
>>     > workload that's around 80% done entirely within each server.  From
>>     > a work-to-be-done perspective there's no need for 80% of the load
>>     > to traverse network interfaces; the rest needs what ceph is all
>>     > about.  So I cooked up a set of crush maps and pools, one map/pool
>>     > for each server and one map/pool for the whole.  Skipping the long
>>     > story, the performance remains network-link-speed bound and has
>>     > got to change.  "Chassis local" io is too slow.  I even tried
>>     > putting a mon within each server.  I'd like to avoid having to
>>     > revert to some other HA filesystem per server with ceph at the
>>     > chassis layer if I can help it.
>>     >
>>     > Any notions that would allow 'chassis local' rbd traffic to avoid
>>     > or mostly avoid leaving the box?
>>     >
>>     > Thanks!
>>     >
>>
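
To make Jeff's tcpdump suggestion above concrete: the mons listen on
6789 (msgr v1) and 3300 (msgr v2), and the osds bind in the 6800-7300
range by default, so something along these lines separates mon traffic
from osd traffic (the interface name is only an example):

    # traffic to or from the mons
    tcpdump -ni eth0 'port 6789 or port 3300'

    # traffic to or from osds (default ms_bind port range)
    tcpdump -ni eth0 'portrange 6800-7300'

And for the per-server pools: the shape I'd expect is a crush rule
rooted at a single host bucket rather than at default.  A minimal
sketch, assuming a host bucket actually named "serverA" (shown as it
would appear in a decompiled crush map; the one-line equivalent is
"ceph osd crush rule create-replicated serverA_local serverA osd"):

    rule serverA_local {
        id 10
        type replicated
        step take serverA
        step chooseleaf firstn 0 type osd
        step emit
    }

A pool created against that rule keeps all of its replicas on serverA:

    ceph osd pool create local-serverA 32 32 replicated serverA_local
    ceph osd pool set local-serverA size 3

Even then the client still connects to the osds' public addresses, so
whether that traffic actually touches the physical NIC depends on how
those addresses route on the box, which is exactly what the tcpdump
above should show.
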
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



