Thanks for the thinking.  By 'traffic' I mean: when a user-space rbd write
has as its destination three replica osds in the same chassis, does the
whole write get shipped out to the mon and then back, or does only the
write metadata go to the mon, with the actual write data never crossing a
physical ethernet cable and instead reaching the chassis-local osds via
the 'virtual' internal switch?

When I read about how ceph is laid out, I understood that only the control
traffic goes to the mons, and the data goes directly from the client that
generates it to the osds.  Did I get that wrong?

All the 'usual suspects' like lossy ethernets, miswirings, etc. have been
checked.  It's actually painful to sit and wait while 'update-initramfs'
takes over a minute, even when the vm is chassis-local to the osds
receiving the writes.  (A rough way to time writes against each pool
directly from such a vm is sketched below, after the quoted thread.)

On 6/29/20 9:55 PM, Anthony D'Atri wrote:
> What does “traffic” mean?  Reads?  Writes will have to hit the net
> regardless of any machinations.
>
>> On Jun 29, 2020, at 7:31 PM, Harry G. Coin <hgcoin@xxxxxxxxx> wrote:
>>
>> For a whole lot of the work I need exactly what ceph is for; that work
>> just doesn't represent a large fraction of the total local traffic.
>> Ceph is the right choice.  Plainly ceph has tremendous support for
>> replication within a chassis, among chassis, and among racks.  I just
>> need intra-chassis traffic to not hit the net much.  Seems not such an
>> unreasonable thing given the intra-chassis crush rules and all.  After
>> all... ceph's name wasn't chosen for where it can't go...
>>
>> On 6/29/20 1:57 PM, Marc Roos wrote:
>>> I wonder if you should not have chosen a different product?  Ceph is
>>> meant to distribute data across nodes, racks, data centers, etc.  For
>>> a nail use a hammer, for a screw use a screwdriver.
>>>
>>> -----Original Message-----
>>> To: ceph-users@xxxxxxx
>>> Subject: layout help: need chassis local io to minimize net links
>>>
>>> Hi
>>>
>>> I have a few servers, each with 6 or more disks, with a storage
>>> workload that's around 80% done entirely within each server.  From a
>>> work-to-be-done perspective there's no need for 80% of the load to
>>> traverse network interfaces; the rest needs what ceph is all about.
>>> So I cooked up a set of crush maps and pools: one map/pool for each
>>> server and one map/pool for the whole.  Skipping the long story, the
>>> performance remains bound by network link speed and has got to
>>> change.  "Chassis local" io is too slow.  I even tried putting a mon
>>> within each server.  I'd like to avoid having to revert to some other
>>> HA filesystem per server with ceph at the chassis layer if I can help
>>> it.
>>>
>>> Any notions that would allow 'chassis local' rbd traffic to avoid or
>>> mostly avoid leaving the box?
>>>
>>> Thanks!
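
For the timing comparison mentioned above, here is a minimal sketch,
assuming the python3-rados and python3-rbd bindings, a readable
/etc/ceph/ceph.conf with a client keyring, and two existing rbd pools.
The pool names 'rbd-chassis-local' and 'rbd-cluster-wide' and the image
name 'latency-test' are placeholders, not anything from this thread;
substitute the real chassis-local and cluster-wide pool names.  It pushes
the same amount of data through librbd against each pool so the two can
be compared from the same chassis-local client:

#!/usr/bin/env python3
# Sketch: compare librbd write throughput against two pools from one client.
# Pool and image names below are placeholders -- substitute your own.
import time

import rados   # python3-rados
import rbd     # python3-rbd

CONF = '/etc/ceph/ceph.conf'
POOLS = ['rbd-chassis-local', 'rbd-cluster-wide']   # placeholder pool names
IMAGE = 'latency-test'                              # placeholder image name
SIZE = 64 * 1024 * 1024                             # 64 MiB throwaway image
CHUNK = 4 * 1024 * 1024                             # 4 MiB per write

cluster = rados.Rados(conffile=CONF)
cluster.connect()
try:
    for pool in POOLS:
        ioctx = cluster.open_ioctx(pool)
        try:
            rbd.RBD().create(ioctx, IMAGE, SIZE)    # create the test image
            try:
                with rbd.Image(ioctx, IMAGE) as img:
                    data = b'\0' * CHUNK
                    start = time.monotonic()
                    for off in range(0, SIZE, CHUNK):
                        img.write(data, off)  # acked by librbd, possibly from its cache
                    img.flush()               # push any cached writes to the osds
                    elapsed = time.monotonic() - start
                    print('%s: %d MiB in %.2fs (%.1f MiB/s)'
                          % (pool, SIZE >> 20, elapsed, SIZE / elapsed / 2**20))
            finally:
                rbd.RBD().remove(ioctx, IMAGE)      # clean up the test image
        finally:
            ioctx.close()
finally:
    cluster.shutdown()

Whichever pool comes out faster, the data path it exercises is the same:
the client pulls cluster maps from the mons, computes placement with
crush, and sends the write straight to the primary osd, which forwards it
to the replica osds and acknowledges only once all replicas have it.  The
mons never see the data itself, only the control traffic.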