Re: [PATCH v3 0/3] RISC-V: mm: do not treat hint addr on mmap as the upper bound to search

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Aug 28, 2024, at 00:40, Charlie Jenkins <charlie@xxxxxxxxxxxx> wrote:
> 
> On Tue, Aug 27, 2024 at 09:33:11AM -0700, Palmer Dabbelt wrote:
>> On Tue, 27 Aug 2024 01:05:15 PDT (-0700), cyy@xxxxxxxxxxxx wrote:
>>> Previous patch series[1][2] changes a mmap behavior that treats the hint
>>> address as the upper bound of the mmap address range. The motivation of the
>>> previous patch series is that some user space software may assume 48-bit
>>> address space and use higher bits to encode some information, which may
>>> collide with large virtual address space mmap may return. However, to make
>>> sv48 by default, we don't need to change the meaning of the hint address on
>>> mmap as the upper bound of the mmap address range. This behavior breaks
>>> some user space software like Chromium that gets ENOMEM error when the hint
>>> address + size is not big enough, as specified in [3].
>>> 
>>> Other ISAs with larger than 48-bit virtual address space like x86, arm64,
>>> and powerpc do not have this special mmap behavior on hint address. They
>>> all just make 48-bit / 47-bit virtual address space by default, and if a
>>> user space software wants to large virtual address space, it only need to
>>> specify a hint address larger than 48-bit / 47-bit.
>>> 
>>> Thus, this patch series change mmap to use sv48 by default but does not
>>> treat the hint address as the upper bound of the mmap address range. After
>>> this patch, the behavior of mmap will align with existing behavior on other
>>> ISAs with larger than 48-bit virtual address space like x86, arm64, and
>>> powerpc. The user space software will no longer need to rewrite their code
>>> to fit with this special mmap behavior only on RISC-V.
>> 
>> So it actually looks like we just screwed up the original version of this:
>> the reason we went with the more complicated address splits were than we
>> actually started with a defacto 39-bit page table uABI (ie 38-bit user VAs),
>> and moving to even 48-bit page tables (ie, 47-bit user VAs) broke users
>> (here's an ASAN bug, for example:
>> https://github.com/google/android-riscv64/issues/64).
>> 
>> Unless I'm missing something, though, the code doesn't actually do that.  I
>> remember having that discussion at some point, but I must have forgotten to
>> make sure it worked.  As far as I can tell we've just moved to the 48-bit
>> VAs by default, which breaks the whole point of doing the compatibilty
>> stuff.  Probably a good sign I need to pay more attention to this stuff.
>> 
>> So I'm not really sure what to do here: we can just copy the arm64 behavior
>> at tell the other users that's just how things work, but then we're just
>> pushing around breakages.  At a certain point all we can really do with this
>> hint stuff is push around problems, though, and at least if we copy arm64
>> then most of those problems get reported as bugs for us.
> 
> Relying on the hint address in any capacity will push around breakages
> is my perspective as well. I messed this up from the start. I believe
> the only way to have consistent behavior is to mark mmap relying on the
> hint address as a bug, and only rely on the hint address if a flag
> defines the behavior.
> 

I agree with this. However, since we already have this behavior on
x86 and aarch64 for quite a long time, to prevent breaking userspace,
I think we can use this patch and then add a flag like MAP_VA_FULL
to enable full va address in the future.

Thanks,
Yangyu Chen

> There is an awkward window of releases that will have this "buggy"
> behavior. However, since the mmap changes introduced a variety of
> userspace bugs it seems acceptable to revert to the previous behavior
> and to create a consistent path forward.
> 
> - Charlie
> 
>> 
>>> Note: Charlie also created another series [4] to completely remove the
>>> arch_get_mmap_end and arch_get_mmap_base behavior based on the hint address
>>> and size. However, this will cause programs like Go and Java, which need to
>>> store information in the higher bits of the pointer, to fail on Sv57
>>> machines.
>>> 
>>> Changes in v3:
>>> - Rebase to newest master
>>> - Changes some information in cover letter after patchset [2]
>>> - Use patch [5] to patch selftests
>>> - Link to v2: https://lore.kernel.org/linux-riscv/tencent_B2D0435BC011135736262764B511994F4805@xxxxxx/
>>> 
>>> Changes in v2:
>>> - correct arch_get_mmap_end and arch_get_mmap_base
>>> - Add description in documentation about mmap behavior on kernel v6.6-6.7.
>>> - Improve commit message and cover letter
>>> - Rebase to newest riscv/for-next branch
>>> - Link to v1: https://lore.kernel.org/linux-riscv/tencent_F3B3B5AB1C9D704763CA423E1A41F8BE0509@xxxxxx/
>>> 
>>> [1] https://lore.kernel.org/linux-riscv/20230809232218.849726-1-charlie@xxxxxxxxxxxx/
>>> [2] https://lore.kernel.org/linux-riscv/20240130-use_mmap_hint_address-v3-0-8a655cfa8bcb@xxxxxxxxxxxx/
>>> [3] https://lore.kernel.org/linux-riscv/MEYP282MB2312A08FF95D44014AB78411C68D2@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>>> [4] https://lore.kernel.org/linux-riscv/20240826-riscv_mmap-v1-0-cd8962afe47f@xxxxxxxxxxxx/
>>> [5] https://lore.kernel.org/linux-riscv/20240826-riscv_mmap-v1-2-cd8962afe47f@xxxxxxxxxxxx/
>>> 
>>> Charlie Jenkins (1):
>>>  riscv: selftests: Remove mmap hint address checks
>>> 
>>> Yangyu Chen (2):
>>>  RISC-V: mm: not use hint addr as upper bound
>>>  Documentation: riscv: correct sv57 kernel behavior
>>> 
>>> Documentation/arch/riscv/vm-layout.rst        | 43 ++++++++----
>>> arch/riscv/include/asm/processor.h            | 20 ++----
>>> .../selftests/riscv/mm/mmap_bottomup.c        |  2 -
>>> .../testing/selftests/riscv/mm/mmap_default.c |  2 -
>>> tools/testing/selftests/riscv/mm/mmap_test.h  | 67 -------------------
>>> 5 files changed, 36 insertions(+), 98 deletions(-)







[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux