On 04/27/2018 01:28 AM, Marcelo Ricardo Leitner wrote:
> On Fri, Apr 27, 2018 at 01:14:56AM +0300, Oleg Babin wrote:
>> Hi Marcelo,
>>
>> On 04/24/2018 12:33 AM, Marcelo Ricardo Leitner wrote:
>>> Hi,
>>>
>>> On Mon, Apr 23, 2018 at 09:41:04PM +0300, Oleg Babin wrote:
>>>> Each SCTP association can have up to 65535 input and output streams.
>>>> For each stream type an array of sctp_stream_in or sctp_stream_out
>>>> structures is allocated using kmalloc_array() function. This function
>>>> allocates physically contiguous memory regions, so this can lead
>>>> to allocation of memory regions of very high order, i.e.:
>>>>
>>>>   sizeof(struct sctp_stream_out) == 24,
>>>>   ((65535 * 24) / 4096) == 383 memory pages (4096 byte per page),
>>>>   which means 9th memory order.
>>>>
>>>> This can lead to a memory allocation failures on the systems
>>>> under a memory stress.
>>>
>>> Did you do performance tests while actually using these 65k streams
>>> and with 256 (so it gets 2 pages)?
>>>
>>> This will introduce another deref on each access to an element, but
>>> I'm not expecting any impact due to it.
>>>
>>
>> No, I didn't do such tests. Could you please tell me what methodology
>> do you usually use to measure performance properly?
>>
>> I'm trying to do measurements with iperf3 on unmodified kernel and get
>> very strange results like this:
> ...
>
> I've been trying to fight this fluctuation for some time now but
> couldn't really fix it yet. One thing that usually helps (quite a lot)
> is increasing the socket buffer sizes and/or using smaller messages,
> so there is more cushion in the buffers.
>
> What I have seen in my tests is that when it floats like this, is
> because socket buffers floats between 0 and full and don't get into a
> steady state. I believe this is because of socket buffer size is used
> for limiting the amount of memory used by the socket, instead of being
> the amount of payload that the buffer can hold.
> This causes some discrepancy, especially because in SCTP we don't
> defrag the buffer (as TCP does, it's the collapse operation), and the
> announced rwnd may turn up being a lie in the end, which triggers rx
> drops, then tx cwnd reduction, and so on. SCTP min_rto of 1s also
> doesn't help much on this situation.
>
> On netperf, you may use -S 200000,200000 -s 200000,200000. That should
> help it.
>

Thank you very much! I'll try this and get back with results later.

-- 
Best regards,
Oleg
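
For reference, the page and order arithmetic from the quoted patch description can be checked with a small script. This is only a sketch: the 24-byte structure size and the 4096-byte page size come from the quoted text, while the helper name alloc_order() is made up for illustration and is not a kernel function. It rounds the page count up rather than down, so the 65535-stream case shows 384 pages instead of 383; the resulting order is the same.

```python
PAGE_SIZE = 4096       # bytes per page, as in the quoted patch description
STREAM_OUT_SIZE = 24   # sizeof(struct sctp_stream_out), as quoted


def alloc_order(nbytes, page_size=PAGE_SIZE):
    """Return (pages, order) for a physically contiguous allocation.

    pages is nbytes rounded up to whole pages; order is the smallest n
    such that 2**n pages are enough. Hypothetical helper, not kernel code.
    """
    pages = -(-nbytes // page_size)   # integer ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return pages, order


for streams in (256, 65535):
    pages, order = alloc_order(streams * STREAM_OUT_SIZE)
    print(f"{streams:5d} streams: {pages:3d} pages, order {order}")
```

With these inputs, 256 streams fit in 2 pages (order 1), and 65535 streams need 384 pages, which rounds up to an order-9 block of 512 pages.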
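
The socket buffer suggestion can also be tried outside netperf. The sketch below requests the same 200000-byte send and receive buffers through SO_SNDBUF and SO_RCVBUF and prints what the kernel reports back. It assumes a Linux host; the function names are invented for this example, and it falls back to a TCP socket when SCTP is not available, so it only shows the buffer accounting, not SCTP throughput. On Linux the value read back is normally double the requested one (and capped by the rmem_max/wmem_max sysctls), because the setting limits memory used by the socket including bookkeeping overhead rather than pure payload, which is the discrepancy described above.

```python
import socket

BUF_SIZE = 200000  # same figure as the netperf options quoted above


def open_test_socket():
    """Try an SCTP socket first; fall back to TCP if SCTP is unavailable."""
    proto = getattr(socket, "IPPROTO_SCTP", None)
    if proto is not None:
        try:
            return socket.socket(socket.AF_INET, socket.SOCK_STREAM, proto)
        except OSError:
            pass  # kernel or sandbox without SCTP support
    return socket.socket(socket.AF_INET, socket.SOCK_STREAM)


def resize_buffers(sock, size=BUF_SIZE):
    """Request send/receive buffers of `size` bytes.

    Returns the (SO_SNDBUF, SO_RCVBUF) values the kernel reports afterwards.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    return (sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
            sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))


with open_test_socket() as s:
    snd, rcv = resize_buffers(s)
    print(f"requested {BUF_SIZE}, kernel reports SO_SNDBUF={snd} SO_RCVBUF={rcv}")
```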