> On 18 May 2016 at 7:54, Blair Bethwaite <blair.bethwaite@xxxxxxxxx> wrote:
>
>
> Hi all,
>
> What are the densest node configs out there, and what are your
> experiences with them and the tuning required to make them work? If we
> can gather enough info here then I'll volunteer to propose some
> upstream docs covering this.
>
> At Monash we currently have some 32-OSD nodes (running RHEL7), though
> 8 of those OSDs are not storing or doing much yet (in a quiet EC'd RGW
> pool); the other 24 OSDs are serving RBD and are perhaps 65% full on
> average - these are 4TB drives.
>

I worked on a cluster with 256 OSDs per node (~2500 OSDs in total) and
that didn't work out as hoped. I got into the project when the hardware
was already ordered; it wouldn't have been my choice.

> Aside from the already documented pid_max increases that are typically
> necessary just to start all OSDs, we've also had to up
> nf_conntrack_max. We've hit issues (twice now) that seem (have not

Why enable connection tracking at all? It only slows down Ceph traffic.

> figured out exactly how to confirm this yet) to be related to kernel
> dentry slab cache exhaustion - symptoms were a major slowdown in
> performance and slow requests all over the place on writes; watching
> OSD iostat would show a single drive hitting 90+% util for ~15s with a
> bunch of small reads and no writes. These issues were worked around by
> tuning up the filestore split and merge thresholds, though if we'd
> known about this earlier we'd probably have just bumped up the default
> object size so that we simply had fewer objects (and/or rounded up the
> PG count to the next power of 2). We also set vfs_cache_pressure to 1,
> though this didn't really seem to do much at the time. I've also seen
> recommendations about setting min_free_kbytes to something higher
> (currently 90112 on our hardware) but have not verified this.
>

I eventually ended up doing NUMA pinning of the OSDs and increasing
pid_max, but those were most of the changes. The network didn't really
need that much attention to make this work. Rough sketches of the
tunables mentioned in this thread are at the bottom of this mail.

Wido

> --
> Cheers,
> ~Blairo
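
For the sysctl side, this is roughly what the above boils down to. The
values are either the ones mentioned in this thread or clearly marked
placeholders, not tested recommendations, so verify them against your
own hardware before applying anything:

  # /etc/sysctl.d/90-ceph-dense.conf  (sketch only)
  kernel.pid_max = 4194303        # headroom for the threads of many OSDs
  vm.vfs_cache_pressure = 1       # hold on to dentries/inodes longer
  vm.min_free_kbytes = 1048576    # placeholder 1 GB reserve; the 90112
                                  #  quoted above is an untuned default

  # Rather than raising nf_conntrack_max, skip connection tracking for
  # Ceph traffic altogether (example cluster network 10.0.0.0/16):
  #   iptables -t raw -A PREROUTING -s 10.0.0.0/16 -j NOTRACK
  #   iptables -t raw -A OUTPUT    -d 10.0.0.0/16 -j NOTRACK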
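
The filestore split/merge workaround Blair describes is a ceph.conf
change along these lines. The numbers below are illustrative, not the
values Monash used:

  [osd]
  # Defaults are split multiple = 2, merge threshold = 10. A PG
  # directory splits at roughly split_multiple * merge_threshold * 16
  # objects, so raising these delays the splitting that hammers the
  # dentry cache.
  filestore split multiple = 8
  filestore merge threshold = 40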
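
For the NUMA pinning, one way to do it (not necessarily how I did it at
the time) is a systemd drop-in that wraps ceph-osd in numactl. This
sketch pins every instance to node 0 just to show the mechanism; in
practice you spread the OSDs over the nodes, and you should copy the
ceph-osd arguments from whatever ExecStart your packages actually ship:

  # /etc/systemd/system/ceph-osd@.service.d/numa.conf
  [Service]
  ExecStart=
  ExecStart=/usr/bin/numactl --cpunodebind=0 --membind=0 \
      /usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i \
      --setuser ceph --setgroup ceph

After a 'systemctl daemon-reload' and an OSD restart you can check the
resulting binding with numastat -p <pid> or taskset -cp <pid>.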