Hello Neil, Vlad,

On 01/22/2014 03:18 PM, ext Vlad Yasevich wrote:
> On 01/22/2014 07:30 AM, Neil Horman wrote:
>> On Fri, Jan 17, 2014 at 08:01:24AM +0100, Matija Glavinic Pecotic wrote:
>>>
>>> Proposed solution simplifies whole algorithm having on mind definition from rfc:
>>>
>>> o  Receiver Window (rwnd): This gives the sender an indication of the space
>>>    available in the receiver's inbound buffer.
>>>
>>> Core of the proposed solution is given with these lines:
>>>
>>> sctp_assoc_rwnd_account:
>>> 	if ((asoc->base.sk->sk_rcvbuf - rx_count) > 0)
>>> 		asoc->rwnd = (asoc->base.sk->sk_rcvbuf - rx_count) >> 1;
>>> 	else
>>> 		asoc->rwnd = 0;
>>>
>>> We advertise to sender (half of) actual space we have. Half is in the braces
>>> depending whether you would like to observe size of socket buffer as SO_RECVBUF
>>> or twice the amount, i.e. size is the one visible from userspace, that is,
>>> from kernelspace.
>>> In this way sender is given with good approximation of our buffer space,
>>> regardless of the buffer policy - we always advertise what we have. Proposed
>>> solution fixes described problems and removes necessity for rwnd restoration
>>> algorithm. Finally, as proposed solution is simplification, some lines of code,
>>> along with some bytes in struct sctp_association are saved.
>>>
>>> Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@xxxxxxx>
>>> Reviewed-by: Alexander Sverdlin <alexander.sverdlin@xxxxxxx>
>>>
>>
>> General comment - While this seems to make sense to me generally speaking,
>> doesn't it currently violate section 6 of the RFC?
>>
>>    A SCTP receiver MUST be able to receive a minimum of 1500 bytes in
>>    one SCTP packet.  This means that a SCTP endpoint MUST NOT indicate
>>    less than 1500 bytes in its Initial a_rwnd sent in the INIT or INIT
>>    ACK.
>>
>> Since we set the initial rwnd value to the larger of sk->sk_rcvbuf/2 or
>> SCTP_MIN_RWND (1500 bytes), won't we potentially advertize half that amount?
>
> Not initially.  Initial window will still be advertized properly.  Once
> we receive the first packet and consumed some space, we'll advertize
> half of available receive buffer.  It is perfectly OK to advertize a
> window smaller the MIN_WINDOW in the middle of the transfer.

I agree with this, although we might be in a gray area here:

   Advertised Receiver Window Credit (a_rwnd): 32 bits (unsigned
   integer)

      This value represents the dedicated buffer space, in number of
      bytes, the sender of the INIT has reserved in association with
      this window.  During the life of the association, this buffer
      space SHOULD NOT be lessened (i.e., dedicated buffers taken away
      from this association); however, an endpoint MAY change the value
      of a_rwnd it sends in SACK chunks.

This might be considered taking away buffer space, although I agree
with the point below about doubling.

This, however, opens another question you should be aware of. This
patch breaks two regression test cases: test_timetolive and
test_sctp_sendrcvmsg. This is simply due to 'honest' rwnd reporting.
Both test cases share code in which the initial rwnd is set very low
and the socket buffer is increased later, relying on rwnd staying as
initially set. The test cases then fetch this later rwnd from the
socket and use it as the message size, which breaks them because the
message to be sent is too big.

The important difference from the current implementation is that
changes to SO_RCVBUF will also change a_rwnd. It would not be a big
problem to add code that keeps the idea but bounds rwnd to the
initially set value, but we haven't found that mandated by the RFC.

Thanks,

Matija

>> It seems we need to double the minimum socket receive buffer here.
>
> Not here specifically, but yes.  It is already broken and this patch
> doesn't change current behavior.  This is something SCTP may need to do
> separately.
>
> -vlad
>
>>
>> Neil
>>
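
For illustration only, the accounting discussed in this thread can be
sketched as a standalone userspace C fragment. This is not the patch
itself: the function names rwnd_from_free_space() and rwnd_bounded() and
their parameters are hypothetical, and plain integers stand in for
asoc->base.sk->sk_rcvbuf and asoc->rwnd. The second function shows one
possible way to bound rwnd to the initially set value, which, as noted
above, we have not found mandated by the RFC.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the accounting in the quoted patch:
 * advertise half of the free receive buffer, or 0 when nothing is free.
 * rcvbuf plays the role of sk_rcvbuf, rx_count the bytes currently held.
 */
uint32_t rwnd_from_free_space(int rcvbuf, int rx_count)
{
	int free_space = rcvbuf - rx_count;

	return free_space > 0 ? (uint32_t)free_space >> 1 : 0;
}

/*
 * Hypothetical variant: same calculation, but never advertise more than
 * the rwnd that was set initially, even if the buffer is enlarged later.
 */
uint32_t rwnd_bounded(int rcvbuf, int rx_count, uint32_t initial_rwnd)
{
	uint32_t rwnd = rwnd_from_free_space(rcvbuf, rx_count);

	return rwnd < initial_rwnd ? rwnd : initial_rwnd;
}
```

With an 8192-byte buffer and 2192 bytes queued, the first function
yields 3000. If the buffer is later enlarged to 65536 bytes while the
initial rwnd was 1500, the first function reports 32768 and the bounded
variant stays at 1500, which is the behavior the two failing test cases
rely on.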