On Tue, Mar 31, 2015 at 10:44:51PM +0300, koukou73gr wrote:
> On 03/31/2015 09:23 PM, Sage Weil wrote:
> >
> > It's nothing specific to peering (or ceph). The symptom we've seen is
> > just that bytes stop passing across a TCP connection, usually when there
> > are some largish messages being sent. The ping/heartbeat messages get
> > through because they are small and we disable nagle so they never end up
> > in large frames.
>
> Is there any special route one should take in order to transition a
> live cluster to use jumbo frames and avoid such pitfalls with OSD
> peering?

1. Configure the entire switch infrastructure for jumbo frames.
2. Enable config versioning of the switch infrastructure configurations.
3. Bonus points: monitor config changes of the switch infrastructure.
4. Run a ping test with large frames, using e.g. fping, from each node
   to every other node.
5. Bonus points: set up such a test in some monitoring infrastructure.
6. Once you trust the config (and monitoring), raise the MTU on all
   nodes to jumbo size, simultaneously. This is the critical step, and
   perhaps it could be further perfected. Ideally you would like an
   atomic MTU-upgrade command on the entire cluster.

/M
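For step 4, here is a rough sketch of what the large-frame ping test could look like when run on one node against its peers. It assumes Linux iputils ping (-M do to forbid fragmentation, -s for the payload size), IPv4 without IP options, and a target MTU of 9000. The function names are made up for illustration; substitute fping or whatever your monitoring already uses.

```python
import subprocess

IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def build_ping_command(host, mtu=9000, count=1, timeout_s=2):
    """Build an iputils ping command that fails if the path cannot carry `mtu`.

    -M do  sets the don't-fragment bit, so oversized packets are not split
    -s     sets the ICMP payload size
    -c     number of echo requests
    -W     seconds to wait for a reply
    """
    return [
        "ping", "-M", "do",
        "-s", str(icmp_payload_for_mtu(mtu)),
        "-c", str(count),
        "-W", str(timeout_s),
        host,
    ]


def failing_peers(peers, mtu=9000):
    """Ping every peer with full-size frames and return those that did not answer."""
    failed = []
    for host in peers:
        result = subprocess.run(
            build_ping_command(host, mtu),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failed.append(host)
    return failed


# Example (hypothetical addresses), to be run on every node in turn:
#   print(failing_peers(["10.0.0.2", "10.0.0.3"]))
```

Without the don't-fragment bit the test proves little, since oversized packets may simply be fragmented and still get through. Note that before step 6 the sending node itself is still at the small MTU, so a full-size ping from it will fail locally; the check is mostly useful from hosts that are already at jumbo MTU, and again right after the change.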
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com