Re: [PATCH -next] mm: usercopy: add a debugfs interface to bypass the vmalloc check.


On 2024/12/3 21:10, zuoze wrote:


On 2024/12/3 20:39, Uladzislau Rezki wrote:
On Tue, Dec 03, 2024 at 07:23:44PM +0800, zuoze wrote:
We have implemented host-guest communication based on the TUN device
using XSK[1]. The hardware is a Kunpeng 920 machine (ARM architecture),
and the operating system is based on the 6.6 LTS kernel. The call stack
from the hotspot profile is as follows:

-  100.00%     0.00%  vhost-12384  [unknown]      [k] 0000000000000000
    - ret_from_fork
       - 99.99% vhost_task_fn
          - 99.98% 0xffffdc59f619876c
             - 98.99% handle_rx_kick
                - 98.94% handle_rx
                   - 94.92% tun_recvmsg
                      - 94.76% tun_do_read
                         - 94.62% tun_put_user_xdp_zc
                            - 63.53% __check_object_size
                               - 63.49% __check_object_size.part.0
                                    find_vmap_area
                            - 30.02% _copy_to_iter
                                 __arch_copy_to_user
                   - 2.27% get_rx_bufs
                      - 2.12% vhost_get_vq_desc
                           1.49% __arch_copy_from_user
                   - 0.89% peek_head_len
                        0.54% xsk_tx_peek_desc
                   - 0.68% vhost_add_used_and_signal_n
                      - 0.53% eventfd_signal
                           eventfd_signal_mask
             - 0.94% handle_tx_kick
                - 0.94% handle_tx
                   - handle_tx_copy
                      - 0.59% vhost_tx_batch.constprop.0
                           0.52% tun_sendmsg

It can be observed that most of the overhead is concentrated in the
find_vmap_area function.
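
For readers following the profile, here is a rough userspace sketch of the
kind of work the hardened usercopy check does for each copy out of a
vmalloc'ed buffer: look up the area that contains the address, then verify
that the whole copy fits inside it. This is only a model under stated
assumptions, not the kernel code: struct area, find_area() and
copy_is_allowed() are hypothetical names, and a sorted array with binary
search stands in for the kernel's tree walk. In the kernel the lookup is
done by find_vmap_area() with a lock held, so every packet copied by
tun_put_user_xdp_zc pays for both the search and the lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a vmap area: one [start, end) address range. */
struct area {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Find the area containing addr in a table sorted by start address.
 * Returns NULL when addr falls outside every area. The kernel performs
 * the equivalent lookup under a lock; this model omits the locking.
 */
const struct area *find_area(const struct area *tbl, size_t n, uintptr_t addr)
{
	size_t lo = 0, hi = n;

	assert(tbl != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addr < tbl[mid].start)
			hi = mid;		/* addr is below this area */
		else if (addr >= tbl[mid].end)
			lo = mid + 1;		/* addr is above this area */
		else
			return &tbl[mid];	/* start <= addr < end */
	}
	return NULL;
}

/*
 * Model of the per-copy check: the range [addr, addr + len) must lie
 * entirely within a single area, otherwise the copy is rejected.
 */
bool copy_is_allowed(const struct area *tbl, size_t n,
		     uintptr_t addr, size_t len)
{
	const struct area *a = find_area(tbl, n, addr);

	if (!a)
		return false;
	return len <= a->end - addr;
}
```

The point of the sketch is that the lookup is repeated for every single
copy, which is consistent with __check_object_size costing more than the
copy itself (_copy_to_iter) in the profile above.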

I see. Yes, it is heavily contended, since you run the v6.6 kernel. There
was work done to mitigate the vmap lock contention.
See it here: https://lwn.net/Articles/956590/

The work was merged in the v6.9 kernel:

<snip>
commit 38f6b9af04c4b79f81b3c2a0f76d1de94b78d7bc
Author: Uladzislau Rezki (Sony) <urezki@xxxxxxxxx>
Date:   Tue Jan 2 19:46:23 2024 +0100

     mm: vmalloc: add va_alloc() helper

     Patch series "Mitigate a vmap lock contention", v3.

     1. Motivation
...
<snip>

Could you please try the v6.9 kernel on your setup?

As for how to solve it, the series can probably be back-ported to the
v6.6 kernel.

All the vmalloc-related optimizations have already been merged into 6.6,
including the set of optimization patches you suggested. Thank you very
much for your input.


To clarify: we had already backported the vmalloc optimizations into our
6.6 kernel, so the stack above was captured with those patches applied.
Even with those optimizations, find_vmap_area() is still the hotspot.





