On 05/07/2024 09:59, David Hildenbrand wrote:
> On 05.07.24 10:45, Ryan Roberts wrote:
>> On 05/07/2024 06:47, Baolin Wang wrote:
>>>
>>>
>>> On 2024/7/5 03:49, Matthew Wilcox wrote:
>>>> On Thu, Jul 04, 2024 at 09:19:10PM +0200, David Hildenbrand wrote:
>>>>> On 04.07.24 21:03, David Hildenbrand wrote:
>>>>>>> shmem has two uses:
>>>>>>>
>>>>>>> - MAP_ANONYMOUS | MAP_SHARED (this patch set)
>>>>>>> - tmpfs
>>>>>>>
>>>>>>> For the second use case we don't want controls *at all*, we want
>>>>>>> the same heuristics used for all other filesystems to apply to
>>>>>>> tmpfs.
>>>>>>
>>>>>> As discussed in the MM meeting, Hugh had a different opinion on that.
>>>>>
>>>>> FWIW, I just recalled that I wrote a quick summary:
>>>>>
>>>>> https://lkml.kernel.org/r/f1783ff0-65bd-4b2b-8952-52b6822a0835@xxxxxxxxxx
>>>>>
>>>>> I believe the meetings are recorded as well, but I never looked at
>>>>> the recordings.
>>>>
>>>> That's not what I understood Hugh to mean. To me, it seemed that Hugh
>>>> was expressing an opinion on using shmem as shmem, not on using it as
>>>> tmpfs.
>>>>
>>>> If I misunderstood Hugh, well, I still disagree. We should not have
>>>> separate controls for this. tmpfs is just not that special.
>>
>> I wasn't at the meeting that's being referred to, but I thought we
>> previously agreed that tmpfs *is* special because in some configurations
>> it's not backed by swap and so is locked in RAM?
>
> There are multiple things to that, like:
>
> * Machines only having limited/no swap configured
> * tmpfs can be configured to never go to swap
> * memfd/tmpfs files getting used purely for mmap(): there is no real
>   difference to MAP_ANON|MAP_SHARED besides the processes we share that
>   memory with.
>
> Especially when it comes to memory waste concerns and access behavior,
> in some cases tmpfs behaves much more like anonymous memory. But there
> are for sure other use cases where tmpfs is not that special.
>
> My opinion is that we need to let people configure orders (if you feel
> like it, configure all), but *select* the order to allocate based on
> readahead information -- in contrast to anonymous memory, where we start
> at the highest order and don't have readahead information available.

That approach is exactly what I proposed to start playing with yesterday [1]
for regular pagecache folio allocations too :)

[1] https://lore.kernel.org/linux-mm/bdde4008-60db-4717-a6b5-53d77ab76bdb@xxxxxxx/

> Maybe we need different "order allocation" logic for read/write vs.
> fault, not sure.
>
> But I don't maintain that code, so I can only give stupid suggestions
> and repeat what I understood from the meeting with Hugh and Kirill :)
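To make the order-selection idea above concrete: a minimal userspace sketch
(not kernel code -- pick_order_from_readahead() and the enabled-orders
bitmask are made up for illustration). Given the orders an admin has
enabled and the current readahead window, pick the largest enabled order
whose folio fits inside the window, rather than always starting at the
highest enabled order as the anonymous-memory path does:

    /*
     * Hypothetical sketch (not kernel code): choose a folio allocation
     * order from readahead information, instead of unconditionally
     * starting at the highest enabled order.
     */
    #include <stdio.h>

    #define MAX_ORDER 9 /* e.g. PMD order with 4K pages on x86-64 */

    /*
     * allowed is a bitmask of orders the admin enabled; ra_pages is the
     * current readahead window in pages. Return the largest enabled
     * order whose folio fits inside the readahead window.
     */
    static int pick_order_from_readahead(unsigned int allowed,
                                         unsigned long ra_pages)
    {
        for (int order = MAX_ORDER; order > 0; order--) {
            if (!(allowed & (1u << order)))
                continue;
            if ((1ul << order) <= ra_pages)
                return order;
        }
        return 0; /* fall back to a single page */
    }

    int main(void)
    {
        /* admin enabled orders 0, 4 and 9 (4K, 64K and 2M folios) */
        unsigned int allowed = (1u << 0) | (1u << 4) | (1u << 9);

        printf("ra=8 pages   -> order %d\n",
               pick_order_from_readahead(allowed, 8));
        printf("ra=64 pages  -> order %d\n",
               pick_order_from_readahead(allowed, 64));
        printf("ra=512 pages -> order %d\n",
               pick_order_from_readahead(allowed, 512));
        return 0;
    }

With a cold readahead window of only a few pages this degrades to order-0
allocations; once the window has grown past 512 pages (2M) it selects the
PMD-sized order -- matching the behavior David describes, where the
readahead heuristics, not a static control, drive the allocated order.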