Re: [PATCH] net: use hardware buffer pool to allocate skb

On 10/16/2014 02:40 PM, Eric Dumazet wrote:
On Thu, 2014-10-16 at 11:20 -0700, Alexander Duyck wrote:

My concern would be that we are off by a factor of 2 and collapse TCP
prematurely with this change.
That is the opposite actually. We can consume 4K but we pretend we
consume 2K in some worst cases.

The only case where we consume the full 4K but only list it as 2K should be if we have memory from the wrong node and we want to flush it from the descriptor queue. For all other cases we should be using the page at least twice per buffer. So the first page that was assigned for an Rx descriptor might be flushed, but after that reuse should take hold and stay in place as long as the NAPI poll doesn't change NUMA nodes.

That should be no worse than the case where the remaining space in a large page is not large enough to use as a buffer. You still use the current size as your truesize, you don't include the overhead of the unused space in your calculation.
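As a rough illustration of the large-page case above, here is a minimal sketch in Python. The function name, the 64K page size, and the 3K buffer size are all hypothetical values chosen for the arithmetic; none of this is kernel API.

```python
# Hypothetical sketch: names and sizes are illustrative, not kernel API.

def carve_page(page_size: int, buf_size: int) -> dict:
    """Split one page into equal receive buffers and report the accounting."""
    buffers = page_size // buf_size            # whole buffers that fit in the page
    leftover = page_size - buffers * buf_size  # tail too small to use as a buffer
    return {
        "buffers": buffers,
        "leftover": leftover,
        # Each buffer is charged its own size; the unused tail is charged to nobody.
        "truesize_per_buffer": buf_size,
        "total_charged": buffers * buf_size,
    }

# Assumed example values: a 64K page carved into 3K buffers.
result = carve_page(page_size=65536, buf_size=3072)
print(result)
```

With these assumed sizes the tail of the page goes unused and uncharged, which is the same kind of slack as a flushed half page that is still listed as 2K.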

  For example if you are
looking at a socket that is holding pages for a long period of time
there would be a good chance of it ending up with both halves of the
page.  In this case is it fair to charge it for 8K of memory use when in
reality it is only using 4K?
It's better to collapse too soon than too late.

If you want to avoid collapses because one host has plenty of memory,
all you need to do is increase tcp_rmem.

Why are you referring to 8K? PAGE_SIZE is 4K.

The truesize would be reported as 8K vs 4K for 2 half pages with your change if we were to hand off both halves of a page to the same socket.

The 2K value makes sense and is consistent with how we handle this in other cases where we are partitioning pages for use as network buffers. I think increasing this to 4K is just going to cause performance issues as flows are going to get choked off prematurely for memory usage that they aren't actually getting.
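To make the 8K vs 4K arithmetic concrete, here is a small sketch in Python. It only models the per-buffer charge for the page fragment and ignores every other component of truesize; the helper names and the 256K receive budget are hypothetical and are not taken from the patch.

```python
# Hypothetical sketch: only the page-fragment part of truesize is modeled.

PAGE_SIZE = 4096            # 4K, as stated in the thread
HALF_PAGE = PAGE_SIZE // 2  # 2K half-page buffer

def charged(buffers: int, truesize_per_buffer: int) -> int:
    """Total memory charged to a socket holding `buffers` receive buffers."""
    return buffers * truesize_per_buffer

def buffers_before_limit(budget: int, truesize_per_buffer: int) -> int:
    """How many buffers fit under a receive budget at a given per-buffer charge."""
    return budget // truesize_per_buffer

# One socket ends up holding both halves of the same 4K page.
both_halves_at_2k = charged(2, HALF_PAGE)  # charged for what is actually used
both_halves_at_4k = charged(2, PAGE_SIZE)  # charged twice for the same page

# Assumed receive budget of 256K, purely for illustration.
BUDGET = 256 * 1024
fit_at_2k = buffers_before_limit(BUDGET, HALF_PAGE)
fit_at_4k = buffers_before_limit(BUDGET, PAGE_SIZE)

print(both_halves_at_2k, both_halves_at_4k, fit_at_2k, fit_at_4k)
```

Under these assumptions, charging 4K per half page halves the number of buffers a flow can hold before it hits its budget, even though the underlying pages are shared.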

Part of my hesitation is that I spent the last couple of years explaining to our performance testing team and customers that they need to adjust tcp_rmem with all of the changes that have been made to truesize and the base network drivers, and I would prefer not to go through another round of it. Then again I probably won't have to anyway since I am not doing drivers for Intel anymore, but still my reaction to this kind of change is what it is.

Thanks,

Alex



