From: Jason A. Donenfeld > Sent: 21 April 2017 22:15 > While this may appear as a humdrum one line change, it's actually quite > important. An sk_buff stores data in three places: > > 1. A linear chunk of allocated memory in skb->data. This is the easiest > one to work with, but it precludes using scatterdata since the memory > must be linear. > 2. The array skb_shinfo(skb)->frags, which is of maximum length > MAX_SKB_FRAGS. This is nice for scattergather, since these fragments > can point to different pages. > 3. skb_shinfo(skb)->frag_list, which is a pointer to another sk_buff, > which in turn can have data in either (1) or (2). > > The first two are rather easy to deal with, since they're of a fixed > maximum length, while the third one is not, since there can be > potentially limitless chains of fragments. Fortunately dealing with > frag_list is opt-in for drivers, so drivers don't actually have to deal > with this mess. For whatever reason, macsec decided it wanted pain, and > so it explicitly specified NETIF_F_FRAGLIST. > > Because dealing with (1), (2), and (3) is insane, most users of sk_buff > doing any sort of crypto or paging operation calls a convenient function > called skb_to_sgvec (which happens to be recursive if (3) is in use!). > This takes a sk_buff as input, and writes into its output pointer an > array of scattergather list items. Sometimes people like to declare a > fixed size scattergather list on the stack; othertimes people like to > allocate a fixed size scattergather list on the heap. However, if you're > doing it in a fixed-size fashion, you really shouldn't be using > NETIF_F_FRAGLIST too (unless you're also ensuring the sk_buff and its > frag_list children arent't shared and then you check the number of > fragments in total required.) > > Macsec specifically does this: > > size += sizeof(struct scatterlist) * (MAX_SKB_FRAGS + 1); > tmp = kmalloc(size, GFP_ATOMIC); > *sg = (struct scatterlist *)(tmp + sg_offset); > ... > sg_init_table(sg, MAX_SKB_FRAGS + 1); > skb_to_sgvec(skb, sg, 0, skb->len); > > Specifying MAX_SKB_FRAGS + 1 is the right answer usually, but not if you're > using NETIF_F_FRAGLIST, in which case the call to skb_to_sgvec will > overflow the heap, and disaster ensues. ... Shouldn't skb_to_sgvec() be checking the number of fragments against the size of the sg list? The callers would then all need auditing to allow for failure. David