On Fri, 2015-05-01 at 16:14 -0400, Eric B Munson wrote: > On Fri, 01 May 2015, Tom Herbert wrote: > > > On Fri, May 1, 2015 at 11:42 AM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote: > > > On Fri, 2015-05-01 at 13:43 -0400, Eric B Munson wrote: > > >> In order to enable policy decisions in userspace, the data contained in > > >> the SYN packet would be useful for tracking or identifying connections. > > >> Only parts of this data are available to userspace after the hand shake > > >> is completed. This patch exposes a new setsockopt() option that will, > > >> when used with a listening socket, ask the kernel to cache the skb > > >> holding the SYN packet for retrieval later. The SYN skbs will not be > > >> saved while the kernel is in syn cookie mode. > > >> > > >> The same option will ask the kernel for the packet headers when used > > >> with getsockopt() with the socket returned from accept(). The cached > > >> packet will only be available for the first getsockopt() call, the skb > > >> is consumed after the requested data is copied to userspace. Subsequent > > >> calls will return -ENOENT. Because of this behavior, getsockopt() will > > >> return -E2BIG if the caller supplied a buffer that is too small to hold > > >> the skb header. > > >> > > >> Signed-off-by: Eric B Munson <emunson@xxxxxxxxxx> > > >> Cc: Alexey Kuznetsov <kuznet@xxxxxxxxxxxxx> > > >> Cc: James Morris <jmorris@xxxxxxxxx> > > >> Cc: Hideaki YOSHIFUJI <yoshfuji@xxxxxxxxxxxxxx> > > >> Cc: Patrick McHardy <kaber@xxxxxxxxx> > > >> Cc: netdev@xxxxxxxxxxxxxxx > > >> Cc: linux-api@xxxxxxxxxxxxxxx > > >> Cc: linux-kernel@xxxxxxxxxxxxxxx > > >> --- > > > > > > We have a similar patch here at Google, but we do not hold one skb and > > > dst per saved syn. That can be ~4KB for some drivers. > > > > > > Only a kmalloc() with the needed part (headers), usually less than 128 > > > bytes. We store the length in first byte of this allocation. > > > > > > This has a huge difference if you want to have ~4 million request socks. > > > > > +1 on kmalloc solution. I posted a similar patch a couple of years ago > > https://patchwork.ozlabs.org/patch/146034/. There was pushback on > > memory usage and this having to narrow of a use case. > > > > Tom > > > > I cached the skb largely to take advantage of the built in reference > counting and avoid having to manage allocating memory and ownership of > said memory. For V2, how about I keep the skb reference in the request > structure and kmalloc() a buffer, to be owned by the tcp sock structure, > when the new tcp socket is created? This would also simplify the > getsockopt() so that the data was available to all callers until the > socket is closed. Please do not keep a reference on skb. This has a too big cost. Have you read that we plan to have up to 4 or 10 million request socks ? skb also holds a dst. We can upstream our implementation (based on Tom prior patch), we have been using it more than 2 years with success. -- To unsubscribe from this list: send the line "unsubscribe linux-api" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html