On 2024/10/16 21:45, Kefeng Wang wrote:
On 2024/10/16 17:29, Baolin Wang wrote:
On 2024/10/16 15:49, Kefeng Wang wrote:
On 2024/10/10 17:58, Baolin Wang wrote:
Hi,
This RFC patch series attempts to support large folios for tmpfs.
Since tmpfs already has the 'huge=' option to control THP allocation, we need to
maintain compatibility with the 'huge=' option, as well as respect the 'deny' and
'force' options controlled by '/sys/kernel/mm/transparent_hugepage/shmem_enabled'.
Add a new huge option 'write_size' to support large folio allocation based on the
write size for the tmpfs write and fallocate paths. So the huge page allocation
strategy for tmpfs is: if the 'huge=' option is enabled (huge=always/within_size/advise)
or the 'shmem_enabled' option is 'force', only PMD-sized THP is allowed, to keep
backward compatibility for tmpfs. If the 'huge=' option is disabled (huge=never) or
the 'shmem_enabled' option is 'deny', all large folio allocations remain disabled.
Only when the 'huge=' option is 'write_size' will large folios be allocated based
on the write size.
And I think the 'huge=write_size' option should become the default behavior
for tmpfs in the future.
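(Illustrative only: a minimal userspace-style sketch of the order-selection
policy described above. The names tmpfs_huge_mode and tmpfs_write_order, and
the hard-coded PMD_ORDER, are assumptions for illustration and do not
correspond to the actual patch code.)

/*
 * Hypothetical sketch: pick the largest folio order allowed for a
 * tmpfs write of 'len' bytes under the policy described above.
 */
#include <stdio.h>
#include <stddef.h>

#define BASE_PAGE_SIZE	4096UL
#define PMD_ORDER	9	/* assuming 4K pages, 2M PMD size */

enum tmpfs_huge_mode {
	HUGE_NEVER, HUGE_ALWAYS, HUGE_WITHIN_SIZE, HUGE_ADVISE,
	HUGE_WRITE_SIZE,
};

static int tmpfs_write_order(enum tmpfs_huge_mode mode, size_t len,
			     int sysfs_force, int sysfs_deny)
{
	int order = 0;

	/* 'deny' or huge=never: no large folios at all */
	if (sysfs_deny || mode == HUGE_NEVER)
		return 0;

	/*
	 * 'force' or huge=always/within_size/advise: PMD-only, to keep
	 * backward compatibility.
	 */
	if (sysfs_force || mode != HUGE_WRITE_SIZE)
		return PMD_ORDER;

	/* huge=write_size: derive the order from the write length */
	while ((BASE_PAGE_SIZE << (order + 1)) <= len && order < PMD_ORDER)
		order++;
	return order;
}

int main(void)
{
	printf("64K write   -> order %d\n",
	       tmpfs_write_order(HUGE_WRITE_SIZE, 64 * 1024, 0, 0));
	printf("2M write    -> order %d\n",
	       tmpfs_write_order(HUGE_WRITE_SIZE, 2 * 1024 * 1024, 0, 0));
	printf("huge=always -> order %d\n",
	       tmpfs_write_order(HUGE_ALWAYS, 64 * 1024, 0, 0));
	return 0;
}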
Could we avoid a new huge= option for tmpfs, and maybe support other orders
for read/write/fallocate when mounted with huge?
Um, I am afraid not, as that would break the 'huge=' compatibility.
That is to say, users still want PMD-sized huge pages if 'huge=always'.
Yes, compatibility may be an issue, but supporting large folios only on the
write/fallocate side is a little strange; maybe add a new mode to support
read/write/fallocate together?
Because tmpfs read() will not allocate folios for tmpfs holes and will use the
ZERO_PAGE instead. And if the shmem folios are swapped out, we will currently
always swap in base pages, which is another story...
For tmpfs mmap() read, we do not have a length to indicate how large a folio
should be allocated. Moreover, we decided against adding any mTHP interfaces
for tmpfs in the previous discussion[1].
[1] https://lore.kernel.org/all/ZvVRiJYfaXD645Nh@xxxxxxxxxxxxxxxxxxxx/
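(To make the mmap() point concrete, a small userspace illustration, assuming
/dev/shm is mounted as tmpfs: a pwrite() hands the kernel an explicit length
that could be used to choose a folio order, while a fault on a mapped range
only supplies a single faulting address, so there is no comparable size hint
on the read/fault side.)

/*
 * Illustration only: why the write path carries a size hint while the
 * mmap fault path does not. Assumes /dev/shm is mounted as tmpfs.
 */
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/dev/shm/folio-hint-demo",
		      O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return 1;

	/*
	 * Write path: the kernel sees the full 64K length up front,
	 * so it could pick e.g. an order-4 folio for this range.
	 */
	static char buf[64 * 1024];
	memset(buf, 0xab, sizeof(buf));
	pwrite(fd, buf, sizeof(buf), 0);

	/*
	 * mmap read path: each page fault reports only one faulting
	 * address; there is no per-access length to size a folio by.
	 */
	ftruncate(fd, 2 * 1024 * 1024);
	char *p = mmap(NULL, 2 * 1024 * 1024, PROT_READ, MAP_SHARED, fd, 0);
	if (p != MAP_FAILED) {
		volatile char c = p[1024 * 1024]; /* single-address fault */
		(void)c;
		munmap(p, 2 * 1024 * 1024);
	}

	close(fd);
	unlink("/dev/shm/folio-hint-demo");
	return 0;
}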