Re: [PATCH] netvm: check for page == NULL when propogating the skb->pfmemalloc flag

Mel Gorman <mgorman@xxxxxxx> · Tue, 14 Aug 2012 11:18:07 +0100

On Mon, Aug 13, 2012 at 11:56:48AM -0700, Jeremy Fitzhardinge wrote:
> On 08/13/2012 03:47 AM, Mel Gorman wrote:
> > Resending to correct Jeremy's address.
> >
> > On Wed, Aug 08, 2012 at 03:50:46PM -0700, David Miller wrote:
> >> From: Mel Gorman <mgorman@xxxxxxx>
> >> Date: Tue, 7 Aug 2012 09:55:55 +0100
> >>
> >>> Commit [c48a11c7: netvm: propagate page->pfmemalloc to skb] is responsible
> >>> for the following bug triggered by a xen network driver
> >>  ...
> >>> The problem is that the xenfront driver is passing a NULL page to
> >>> __skb_fill_page_desc() which was unexpected. This patch checks that
> >>> there is a page before dereferencing.
> >>>
> >>> Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> >>> Signed-off-by: Mel Gorman <mgorman@xxxxxxx>
> >> That call to __skb_fill_page_desc() in xen-netfront.c looks completely bogus.
> >> It's the only driver passing NULL here.
> >>
> >> That whole song and dance figuring out what to do with the head
> >> fragment page, depending upon whether the length is greater than the
> >> RX_COPY_THRESHOLD, is completely unnecessary.
> >>
> >> Just use something like a call to __pskb_pull_tail(skb, len) and all
> >> that other crap around that area can simply be deleted.
> > I looked at this for a while but I did not see how __pskb_pull_tail()
> > could be used sensibly but I'm simily not familiar with writing network
> > device drivers or Xen.
> >
> > This messing with RX_COPY_THRESHOLD seems to be related to how the frontend
> > and backend communicate (maybe some fixed limitation of the xenbus). The
> > existing code looks like it is trying to take the fragments received and
> > pass them straight to the backend without copying by passing the fragments
> > to the backend without copying. I worry that if I try converting this to
> > __pskb_pull_tail() that it would either hit the limitation of xenbus or
> > introduce copying where it is not wanted.
> >
> > I'm going to have to punt this to Jeremy and the other Xen folk as I'm not
> > sure what the original intention was and I don't have a Xen setup anywhere
> > to test any patch. Jeremy, xen folk? 
> 
> It's been a while since I've looked at that stuff, but as I remember,
> the issue is that since the packet ring memory is shared with another
> domain which may be untrustworthy, we want to make copies of the headers
> before making any decisions based on them so that the other domain can't
> change them after header processing but before they're actually sent. 
> (The packet payload is considered less important, but of course the same
> issue applies if you're using some kind of content-aware packet filter.)
> 
> So that's the rationale for always copying RX_COPY_THRESHOLD, even if
> the packet is larger than that amount.  As far as I know, changing this
> behaviour wouldn't break the ring protocol, but it does introduce a
> potential security issue.
> 

David,

This leaves us somewhat in a pickle. If I'm reading this right (which I may
not be) it means that using __pskb_pull_tail() will work in the ideal case
but potentially introduces a subtle issue in the future. This bug could be
"fixed" in the driver by partially reverting [01c68026: xen: netfront:
convert to SKB paged frag API.]. That could be viewed as sweeping the
problem under the carpet but it does contain the problem to the xen-netfront
driver. A new helper could be created like __skb_clear_page_desc but that
is overkill for one driver and feels as ugly.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>