Hello, On Wed, 2022-03-23 at 16:37 -0700, Jakub Kicinski wrote: > The comment about shinfo->dataref split is really unhelpful, > at least to me. Rewrite it and render it to skb documentation. > > Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx> > --- > Documentation/networking/index.rst | 1 + > Documentation/networking/skbuff.rst | 6 ++++++ > include/linux/skbuff.h | 33 +++++++++++++++++++---------- > 3 files changed, 29 insertions(+), 11 deletions(-) > > diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst > index ce017136ab05..1b3c45add20d 100644 > --- a/Documentation/networking/index.rst > +++ b/Documentation/networking/index.rst > @@ -96,6 +96,7 @@ Linux Networking Documentation > sctp > secid > seg6-sysctl > + skbuff > smc-sysctl > statistics > strparser > diff --git a/Documentation/networking/skbuff.rst b/Documentation/networking/skbuff.rst > index 7c6be64f486a..581e5561c362 100644 > --- a/Documentation/networking/skbuff.rst > +++ b/Documentation/networking/skbuff.rst > @@ -23,3 +23,9 @@ skb_clone() allows for fast duplication of skbs. None of the data buffers > get copied, but caller gets a new metadata struct (struct sk_buff). > &skb_shared_info.refcount indicates the number of skbs pointing at the same > packet data (i.e. clones). > + > +dataref and headerless skbs > +--------------------------- > + > +.. kernel-doc:: include/linux/skbuff.h > + :doc: dataref and headerless skbs > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h > index 5431be4aa309..5b838350931c 100644 > --- a/include/linux/skbuff.h > +++ b/include/linux/skbuff.h > @@ -691,16 +691,25 @@ struct skb_shared_info { > skb_frag_t frags[MAX_SKB_FRAGS]; > }; > > -/* We divide dataref into two halves. The higher 16 bits hold references > - * to the payload part of skb->data. The lower 16 bits hold references to > - * the entire skb->data. A clone of a headerless skb holds the length of > - * the header in skb->hdr_len. > +/** > + * DOC: dataref and headerless skbs > + * > + * Transport layers send out clones of data skbs they hold for retransmissions. > + * To allow lower layers of the stack to prepend their headers > + * we split &skb_shared_info.dataref into two halves. > + * The lower 16 bits count the overall number of references. > + * The higher 16 bits indicate number of data-only references. > + * skb_header_cloned() checks if skb is allowed to add / write the headers. Thank you very much for the IMHO much needed documentation! Please allow me to do some non-native-english-speaker biased comments;) The previous patch uses the form "payload data" instead of data-only, I think it would be clearer using consistently one or the other. I personally would go for "payload-data-only" (which is probably a good reason to pick a different option). > * > - * All users must obey the rule that the skb->data reference count must be > - * greater than or equal to the payload reference count. > + * The creator of the skb (e.g. TCP) marks its data-only skb as &sk_buff.nohdr > + * (via __skb_header_release()). Any clone created from marked skb will get > + * &sk_buff.hdr_len populated with the available headroom. > + * If it's the only clone in existence it's able to modify the headroom at will. I think it would be great if we explicitly list the expected sequence, e.g. <alloc skb> skb_reserve __skb_header_release skb_clone Thanks! Paolo