I've not been replying to the list, apologies.

> just the write metadata to the mon, with the actual write data content
> not having to cross a physical ethernet cable but directly to the
> chassis-local osds via the 'virtual' internal switch?

This is my understanding as well, yes. I've not explored the ceph source
yet though. I've put a couple of rough sketches below the quoted thread,
for whatever they're worth.

On 2020-06-29 8:37 p.m., Harry G. Coin wrote:
>
> Jeff, thanks for the lead. When a user space rbd write has as a
> destination three replica osds in the same chassis, does the whole
> write get shipped out to the mon and then back, or just the write
> metadata to the mon, with the actual write data content not having to
> cross a physical ethernet cable but directly to the chassis-local osds
> via the 'virtual' internal switch? I thought when I read the layout
> of how ceph works only the control traffic goes to the mons, the data
> directly from the generator to the osds. Did I get that wrong?
>
>
> On 6/29/20 10:32 PM, Jeff W wrote:
>> You mentioned setting up pools per host but still hitting network
>> limits, did you try tcpdumping the NIC to see who's talking to who?
>> Perhaps something isn't configured the way you expect? That may help
>> you narrow down what is using the NIC as well, mon or osd or what
>> not. If it's local, I would think that the NIC wouldn't be a
>> bottleneck, and if it is a bottleneck I would suspect my own configs,
>> but that's just my 2c.
>>
>> Off the top of my head I'm thinking it's the mon, because even if you
>> set up multiple pools I can't think of a way to have multiple groups
>> of mons maintaining their own shards of consensus. Unless your
>> workload is largely read only, then .. I'm not sure what the
>> bottleneck would be.
>>
>>
>> On Mon., Jun. 29, 2020, 7:32 p.m. Harry G. Coin, <hgcoin@xxxxxxxxx> wrote:
>>
>> I need exactly what ceph is for a whole lot of work, that work just
>> doesn't represent a large fraction of the total local traffic. Ceph is
>> the right choice. Plainly ceph has tremendous support for replication
>> within a chassis, among chassis and among racks. I just need
>> intra-chassis traffic to not hit the net much. Seems not such an
>> unreasonable thing given the intra-chassis crush rules and all. After
>> all.. ceph's name wasn't chosen for where it can't go....
>>
>> On 6/29/20 1:57 PM, Marc Roos wrote:
>> > I wonder if you should not have chosen a different product? Ceph is
>> > meant to distribute data across nodes, racks, data centers etc. For a
>> > nail use a hammer, for a screw use a screw driver.
>> >
>> >
>> > -----Original Message-----
>> > To: ceph-users@xxxxxxx
>> > Subject: *****SPAM***** layout help: need chassis local io
>> > to minimize net links
>> >
>> > Hi
>> >
>> > I have a few servers each with 6 or more disks, with a storage workload
>> > that's around 80% done entirely within each server. From a
>> > work-to-be-done perspective there's no need for 80% of the load to
>> > traverse network interfaces; the rest needs what ceph is all about. So
>> > I cooked up a set of crush maps and pools, one map/pool for each server
>> > and one map/pool for the whole. Skipping the long story, the
>> > performance remains network link speed bound and has got to change.
>> > "Chassis local" io is too slow. I even tried putting a mon within each
>> > server. I'd like to avoid having to revert to some other HA
>> > filesystem per server with ceph at the chassis layer if I can help it.
>> >
>> > Any notions that would allow 'chassis local' rbd traffic to avoid or
>> > mostly avoid leaving the box?
>> >
>> > Thanks!
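
Sketch 1: a minimal python-rados check of the write path described above.
The pool name 'server1-rbd' is made up, and the comments are how I read
the docs rather than the source, so treat this as a sanity check and not
gospel.

    import rados

    # Connecting reads the mon addresses from ceph.conf and pulls the
    # cluster maps from a monitor; that part is control traffic, and small.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()

    # Pool name is hypothetical: use one whose crush rule keeps every
    # replica on this chassis.
    ioctx = cluster.open_ioctx('server1-rbd')

    # As I understand it, the client computes placement from the crush map
    # it already holds and ships the payload straight to the acting primary
    # osd, which replicates to the other osds in the set. Nothing below
    # should push payload bytes through a mon.
    ioctx.write_full('write-path-test', b'x' * (4 * 1024 * 1024))

    ioctx.close()
    cluster.shutdown()

Running that in a loop while tcpdump watches the physical NIC, filtered on
the mon ports (3300 and 6789) versus the default osd range (6800-7300),
should make it obvious whether any payload-sized traffic actually leaves
the box, which is basically Jeff's suggestion made concrete.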
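
Sketch 2: the per-chassis rule and pool setup being discussed, again via
the Python bindings so it stays in one language. The bucket name
'server1', the pool name and the pg count are placeholders, and the JSON
argument names are my best reading of the mon command signatures, so
double-check them against the ceph CLI help before leaning on this.

    import json
    import rados

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()

    def mon_cmd(**kwargs):
        # mon_command() takes a JSON-formatted command string and returns
        # (return code, output buffer, status string); raise on failure.
        ret, out, status = cluster.mon_command(json.dumps(kwargs), b'')
        if ret != 0:
            raise RuntimeError(status)
        return out

    # A replicated rule rooted at the (hypothetical) chassis bucket
    # 'server1', spreading replicas across its osds instead of across hosts.
    mon_cmd(prefix='osd crush rule create-replicated',
            name='server1-local', root='server1', type='osd')

    # A pool for that chassis, pointed at the chassis-local rule.
    mon_cmd(prefix='osd pool create', pool='server1-rbd', pg_num=64)
    mon_cmd(prefix='osd pool set', pool='server1-rbd',
            var='crush_rule', val='server1-local')

    cluster.shutdown()

That only pins where data for the pool is allowed to live; whether the
traffic then stays off the wire is exactly what sketch 1 is meant to show.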