On Mon, Jul 09, 2012 at 06:57:10PM +0200, Sebastian Andrzej Siewior wrote: > On Mon, Jul 09, 2012 at 11:04:42AM +0100, Mel Gorman wrote: > > > - lets assume your allocation happens with kmalloc() without __GFP_MEMALLOC > > > and current->flags has PF_MEMALLOC ORed and your SLAB pool is empty. This > > > forces SLAB to allocate more pages from the buddy allocator with it will > > > receive more likely (due to ->current->flags + PF_MEMALLOC) but SLAB will > > > drop this extra memory because the page has ->pf_memory (or something like > > > that) set and the GFP_FLAGS do not have __GFP_MEMALLOC set. > > > > > > > It's recorded if the slab page was allocated from PFMEMALLOC reserves (see > > patch 2 from the swap over NBD series). slab will use this page for objects > > but only allocate them to callers that pass a gfp_pfmemalloc_allowed() check. > > kmalloc() users with either __GFP_MEMALLOC or PF_MEMALLOC will get > > the pages they need but they will not "leak" to !_GFP_MEMALLOC users as > > that would potentially deadlock. > > Argh, I missed that gfp_to_alloc_flags() is not only called from > within the buddy allocater but also from slab. So this is fine then :) > Good to hear. I appreciate you taking the time to give it a solid review like this looking for holes. > One thing: > You only get current->flags |= PF_MEMALLOC in softirq _if_ the skb, which is > passed to netif_receive_skb(), was allocated with __GFP_MEMALLOC. That > means if the NIC's RX allocation did not require an allocation from the > emergency pool (without ->pfmemalloc set) then you never use this extra > pool, even if this skb would end up in your swap socket. Also, the other way > around, where you allocate it from the emergency pool but it is a user > socket and you could drop it. > While there is a possibility that packets may get dropped later like this, they still get retransmitted and eventually it'll get through. This is not optimal but optimised swap-over-network was not the primary goal of the series, deadlock avoidance was. > What about extending sk_set_memalloc() to record socket's ips + ports > in a separate list so that skb_pfmemalloc_protocol() might use that > information and decide on per-protocol basis if the skb is worth to > spend more ressource to deliver it. That means you would enable the > extra pool if the currently received skb is part of your swap socket and > not if the skb was allocated from the emergency pool. > > That said, there is nothing wrong with the code as of now and this > optimization could be added later (if at all). > I think it is a good idea but it could also be done later iff a user had a serious problem with the performance and that this made a measurable difference. The series is already quite complex and I'd rather not add to that complexity without strong motivation. -- Mel Gorman SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>