Re: [PATCH v2] tmpfs: fault in smaller chunks if large folio allocation not allowed

On 2024/9/30 14:48, Baolin Wang wrote:
On 2024/9/30 11:15, Kefeng Wang wrote:
On 2024/9/30 10:52, Baolin Wang wrote:
On 2024/9/30 10:30, Kefeng Wang wrote:
On 2024/9/30 10:02, Baolin Wang wrote:
On 2024/9/26 21:52, Matthew Wilcox wrote:
On Thu, Sep 26, 2024 at 10:38:34AM +0200, Pankaj Raghav (Samsung) wrote:
So this is why I don't use mapping_set_folio_order_range() here, but
correct me if I am wrong.

Yeah, the inode is active here as the max folio size is decided based on the write size, so probably mapping_set_folio_order_range() will not be
a safe option.
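
(For context: per its kernel-doc in include/linux/pagemap.h, mapping_set_folio_order_range() is non-atomic and is meant to be called from the inode constructor, before the inode is visible. A hypothetical sketch of the intended usage; myfs_new_inode() is invented for illustration:)

static struct inode *myfs_new_inode(struct super_block *sb)
{
	struct inode *inode = new_inode(sb);

	if (inode)
		/* Safe only here: nobody else can see this mapping yet. */
		mapping_set_folio_order_range(inode->i_mapping, 0,
					      MAX_PAGECACHE_ORDER);
	return inode;
}

Calling it on an already-active shmem inode, as would be required here, could race with concurrent pagecache lookups, which is the concern above.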

You really are all making too much of this.  Here's the patch I think we
need:

+++ b/mm/shmem.c
@@ -2831,7 +2831,8 @@ static struct inode *__shmem_get_inode(struct mnt_idmap *idmap,
         cache_no_acl(inode);
         if (sbinfo->noswap)
                 mapping_set_unevictable(inode->i_mapping);
-       mapping_set_large_folios(inode->i_mapping);
+       if (sbinfo->huge)
+               mapping_set_large_folios(inode->i_mapping);

         switch (mode & S_IFMT) {
         default:

IMHO, we no longer need the 'sbinfo->huge' validation after adding support for large folios in the tmpfs write and fallocate paths [1].

Forgot to mention: we still need to check sbinfo->huge. If we mount with
huge=never but fault in large chunks, writes are slower than without
9aac777aaf94; the above change or my patch could fix it.
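
For reference, after 9aac777aaf94 ("filemap: Convert generic_perform_write() to support large folios"), the buffered write loop is shaped roughly like this (simplified from mm/filemap.c; not the exact code):

	size_t chunk = mapping_max_folio_size(mapping);

	do {
		size_t offset = pos & (chunk - 1);
		size_t bytes = min(chunk - offset, iov_iter_count(i));

		/*
		 * Prefault up to 'chunk' bytes of the source buffer.  If the
		 * mapping advertises large folios but shmem falls back to an
		 * order-0 folio, each iteration copies only PAGE_SIZE while
		 * the fault-in keeps covering a PMD-sized window -- hence
		 * the slowdown with huge=never.
		 */
		if (fault_in_iov_iter_readable(i, bytes) == bytes)
			break;	/* the real code returns -EFAULT */

		/* ... ->write_begin(), copy from the iter, ->write_end() ... */
	} while (iov_iter_count(i));

Since chunk comes from the mapping's folio-order flags, either clearing the large-folio flag for huge=never (the diff above) or faulting in smaller chunks (this patch) closes that gap.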

My patch will allow allocating large folios in the tmpfs write and fallocate paths even though the 'huge' option is 'never'.

Yes, indeed, after checking your patch.

The 'Writing intelligently' numbers from 'Bonnie -d /mnt/tmpfs/ -s 1024', based on next-20241008:

1) huge=never
   the base:                                    2016438 K/sec
   my v1/v2 or Matthew's patch:                 2874504 K/sec
   your patch with filemap_get_order() fix:     6330604 K/sec

2) huge=always
   the write performance:                       7168917 K/sec

Since large folios are now supported in the tmpfs write path, we do get better performance, as shown above. That's great.


My initial thought for supporting large folios is that, if the 'huge' option is enabled, to maintain backward compatibility, we only allow 2M PMD-sized allocations. If the 'huge' option is disabled (huge=never), we still allow large folio allocations based on the write length.

Another choice is to allow different-sized large folio allocations based on the write length when the 'huge' option is enabled, rather than just the 2M PMD size, but force the huge orders off if the 'huge' option is disabled.
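
To make the two options concrete, a hypothetical sketch (function names invented for illustration; not from any actual patch):

/* Option 1: keep 2M-only when 'huge' is enabled; size by write length
 * for huge=never. */
static unsigned long shmem_orders_option1(struct shmem_sb_info *sbinfo,
					  size_t len)
{
	if (sbinfo->huge != SHMEM_HUGE_NEVER)
		return BIT(HPAGE_PMD_ORDER);
	return BIT(min_t(unsigned int,
			 ilog2(max_t(size_t, len >> PAGE_SHIFT, 1)),
			 MAX_PAGECACHE_ORDER));
}

/* Option 2: size by write length when 'huge' is enabled; no huge orders
 * at all for huge=never. */
static unsigned long shmem_orders_option2(struct shmem_sb_info *sbinfo,
					  size_t len)
{
	if (sbinfo->huge == SHMEM_HUGE_NEVER)
		return 0;
	return BIT(min_t(unsigned int,
			 ilog2(max_t(size_t, len >> PAGE_SHIFT, 1)),
			 MAX_PAGECACHE_ORDER));
}

Returning a bitmask of allowed orders matches how shmem_allowable_huge_orders() expresses this; the ilog2() clamp is just one way to derive an order from the write length.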


"huge=never  Do not allocate huge pages. This is the default."
From the documentation, it's better not to allocate large folios, but then we
need some special handling for huge=never and the runtime deny/force settings.

We still need some discussion to determine which method is preferable.

Personally, I like your current implementation, but it does not match the documentation.
