Re: [PATCH net-next] mlx4: support __GFP_MEMALLOC for rx

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 18.01.2017 17:23, Eric Dumazet wrote:
On Wed, 2017-01-18 at 12:31 +0300, Konstantin Khlebnikov wrote:
On 18.01.2017 07:14, Eric Dumazet wrote:
From: Eric Dumazet <edumazet@xxxxxxxxxx>

Commit 04aeb56a1732 ("net/mlx4_en: allocate non 0-order pages for RX
ring with __GFP_NOMEMALLOC") added code that appears to be not needed at
that time, since mlx4 never used __GFP_MEMALLOC allocations anyway.

As using memory reserves is a must in some situations (swap over NFS or
iSCSI), this patch adds this flag.

AFAIK __GFP_MEMALLOC is used for TX, not for RX: for allocations which
are required by memory reclaimer to free some pages.

Allocation RX buffers with __GFP_MEMALLOC is a straight way to
depleting all reserves by flood from network.

You are mistaken.

How do you think a TCP flow can make progress sending data if no ACK
packet can go back in RX ?

Well. Ok. I mistaken.


Take a look at sk_filter_trim_cap(), where the RX packets received on a
socket which does not have SOCK_MEMALLOC is dropped.

        /*
         * If the skb was allocated from pfmemalloc reserves, only
         * allow SOCK_MEMALLOC sockets to use it as this socket is
         * helping free memory
         */
        if (skb_pfmemalloc(skb) && !sock_flag(sk, SOCK_MEMALLOC))
                return -ENOMEM;

I suppose this happens in BH context right after receiving packet?

Potentially any ACK could free memory in TCP send queue,
so using reserves here makes sense.


Also take a look at __dev_alloc_pages()

static inline struct page *__dev_alloc_pages(gfp_t gfp_mask,
                                             unsigned int order)
{
        /* This piece of code contains several assumptions.
         * 1.  This is for device Rx, therefor a cold page is preferred.
         * 2.  The expectation is the user wants a compound page.
         * 3.  If requesting a order 0 page it will not be compound
         *     due to the check to see if order has a value in prep_new_page
         * 4.  __GFP_MEMALLOC is ignored if __GFP_NOMEMALLOC is set due to
         *     code in gfp_to_alloc_flags that should be enforcing this.
         */
        gfp_mask |= __GFP_COLD | __GFP_COMP | __GFP_MEMALLOC;

        return alloc_pages_node(NUMA_NO_NODE, gfp_mask, order);
}


So __GFP_MEMALLOC in RX is absolutely supported.

But drivers have to opt-in, either using __dev_alloc_pages() or
dev_alloc_pages, or explicitely ORing __GFP_MEMALLOC when using
alloc_page[s]()





--
Konstantin

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]