Re: [PATCH net v2 0/2] Revert the 'socket_alloc' life cycle change

On Tue, May 5, 2020 at 4:54 AM SeongJae Park <sjpark@xxxxxxxxxx> wrote:
>
> CC-ing stable@xxxxxxxxxxxxxxx and adding some more explanations.
>
> On Tue, 5 May 2020 10:10:33 +0200 SeongJae Park <sjpark@xxxxxxxxxx> wrote:
>
> > From: SeongJae Park <sjpark@xxxxxxxxx>
> >
> > The commit 6d7855c54e1e ("sockfs: switch to ->free_inode()") made the
> > deallocation of 'socket_alloc' asynchronous using RCU, the same as
> > 'sock.wq'.  And the following commit 333f7909a857 ("coallocate
> > socket_wq with socket itself") gave the two the same life cycle.
> >
> > The changes made the code much simpler, but also made 'socket_alloc'
> > live longer than before.  For this reason, user programs that intensively
> > repeat allocations and deallocations of sockets could cause memory
> > pressure on recent kernels.
>
> I found this problem on a production virtual machine with 4GB of memory while
> running lebench[1].  The 'poll big' test of lebench opens 1000 sockets, polls
> them, and closes them.  This test is repeated 10,000 times.  Therefore it should
> consume only 1000 'socket_alloc' objects at once.  As the size of socket_alloc is
> about 800 Bytes, that's only 800 KiB.  However, on recent kernels, it could
> consume up to 10,000,000 objects (about 8 GiB).  On the test machine, I
> confirmed that it consumed about 4GB of system memory and resulted in an OOM.
>
> [1] https://github.com/LinuxPerfStudy/LEBench
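
The figures quoted above can be sanity-checked with a short sketch. It assumes the approximate 800-byte 'socket_alloc' size given in the mail; the constant and function names are illustrative only and come from neither the patch nor LEBench.

```python
# Rough footprint estimate for the 'poll big' scenario described above.
# All names here are hypothetical; the 800-byte size is the approximate
# figure quoted in the mail, not a measured value.

SOCKET_ALLOC_BYTES = 800      # approximate size of one 'socket_alloc'
SOCKETS_PER_ITERATION = 1000  # sockets opened, polled and closed per run
ITERATIONS = 10_000           # number of times the test is repeated


def footprint_bytes(live_objects, object_bytes=SOCKET_ALLOC_BYTES):
    """Memory held by `live_objects` objects that have not been freed yet."""
    return live_objects * object_bytes


# Objects are freed promptly: only one iteration's sockets are alive at once.
expected = footprint_bytes(SOCKETS_PER_ITERATION)

# No object is freed before the test ends: every iteration's sockets pile up.
worst_case = footprint_bytes(SOCKETS_PER_ITERATION * ITERATIONS)

print(f"expected:   {expected} bytes")
print(f"worst case: {worst_case} bytes")
```

The gap between the two values is a factor of 10,000, the iteration count, which is why delayed freeing alone is enough to exhaust a 4GB machine.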

To be fair, I have not backported Al's patches to Google production
kernels, nor have I tried this benchmark.

Why do we have 10,000,000 objects around? Could this be because of
some RCU problem?

Once Al's patches are reverted, do you have 10,000,000 socket_alloc objects around?

Thanks.

>
> >
> > To avoid the problem, this commit reverts the changes.
>
> I also tried to write a fixup rather than reverts, but I couldn't easily find
> a simple one.  As the commits 6d7855c54e1e and 333f7909a857 were for code
> refactoring rather than performance optimization, I thought introducing a complex
> fixup for this problem would make no sense.  Meanwhile, the memory pressure
> regression could affect real machines.  To this end, I decided to quickly
> revert the commits first and consider a better refactoring later.
>
>
> Thanks,
> SeongJae Park
>
> >
> > SeongJae Park (2):
> >   Revert "coallocate socket_wq with socket itself"
> >   Revert "sockfs: switch to ->free_inode()"
> >
> >  drivers/net/tap.c      |  5 +++--
> >  drivers/net/tun.c      |  8 +++++---
> >  include/linux/if_tap.h |  1 +
> >  include/linux/net.h    |  4 ++--
> >  include/net/sock.h     |  4 ++--
> >  net/core/sock.c        |  2 +-
> >  net/socket.c           | 23 ++++++++++++++++-------
> >  7 files changed, 30 insertions(+), 17 deletions(-)
> >
> > --
> > 2.17.1


