On Mon, Sep 18, 2023 at 1:00 AM Daniel Gomez <da.gomez@xxxxxxxxxxx> wrote: > > On Fri, Sep 15, 2023 at 11:26:37AM -0700, Yosry Ahmed wrote: > > On Fri, Sep 15, 2023 at 2:51 AM Daniel Gomez <da.gomez@xxxxxxxxxxx> wrote: > > > > > > Add large folio support for shmem write path matching the same high > > > order preference mechanism used for iomap buffered IO path as used in > > > __filemap_get_folio(). > > > > > > Use the __folio_get_max_order to get a hint for the order of the folio > > > based on file size which takes care of the mapping requirements. > > > > > > Swap does not support high order folios for now, so make it order 0 in > > > case swap is enabled. > > > > I didn't take a close look at the series, but I am not sure I > > understand the rationale here. Reclaim will split high order shmem > > folios anyway, right? > > For context, this is part of the enablement of large block sizes (LBS) > effort [1][2][3], so the assumption here is that the kernel will > reclaim memory with the same (large) block sizes that were written to > the device. > > I'll add more context in the V2. > > [1] https://kernelnewbies.org/KernelProjects/large-block-size > [2] https://docs.google.com/spreadsheets/d/e/2PACX-1vS7sQfw90S00l2rfOKm83Jlg0px8KxMQE4HHp_DKRGbAGcAV-xu6LITHBEc4xzVh9wLH6WM2lR0cZS8/pubhtml# > [3] https://lore.kernel.org/all/ZQfbHloBUpDh+zCg@xxxxxxxxxxxxxxxxxxx/ > > > > It seems like we only enable high order folios if the "noswap" mount > > option is used, which is fairly recent. I doubt it is widely used. > > For now, I skipped the swap path as it currently lacks support for > high order folios. But I'm currently looking into it as part of the LBS > effort (please check spreadsheet at [2] for that). Thanks for the context, but I am not sure I understand. IIUC we are skipping allocating large folios in shmem if swap is enabled in this patch. Swap does not support swapping out large folios as a whole (except THPs), but page reclaim will split those large folios and swap them out as order-0 pages anyway. So I am not sure I understand why we need to skip allocating large folios if swap is enabled. > > > > > > > > Signed-off-by: Daniel Gomez <da.gomez@xxxxxxxxxxx> > > > --- > > > mm/shmem.c | 16 +++++++++++++--- > > > 1 file changed, 13 insertions(+), 3 deletions(-) > > > > > > diff --git a/mm/shmem.c b/mm/shmem.c > > > index adff74751065..26ca555b1669 100644 > > > --- a/mm/shmem.c > > > +++ b/mm/shmem.c > > > @@ -1683,13 +1683,19 @@ static struct folio *shmem_alloc_folio(gfp_t gfp, > > > } > > > > > > static struct folio *shmem_alloc_and_acct_folio(gfp_t gfp, struct inode *inode, > > > - pgoff_t index, bool huge, unsigned int *order) > > > + pgoff_t index, bool huge, unsigned int *order, > > > + struct shmem_sb_info *sbinfo) > > > { > > > struct shmem_inode_info *info = SHMEM_I(inode); > > > struct folio *folio; > > > int nr; > > > int err; > > > > > > + if (!sbinfo->noswap) > > > + *order = 0; > > > + else > > > + *order = (*order == 1) ? 0 : *order; > > > + > > > if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) > > > huge = false; > > > nr = huge ? HPAGE_PMD_NR : 1U << *order; > > > @@ -2032,6 +2038,8 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, > > > return 0; > > > } > > > > > > + order = mapping_size_order(inode->i_mapping, index, len); > > > + > > > if (!shmem_is_huge(inode, index, false, > > > vma ? vma->vm_mm : NULL, vma ? vma->vm_flags : 0)) > > > goto alloc_nohuge; > > > @@ -2039,11 +2047,11 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, > > > huge_gfp = vma_thp_gfp_mask(vma); > > > huge_gfp = limit_gfp_mask(huge_gfp, gfp); > > > folio = shmem_alloc_and_acct_folio(huge_gfp, inode, index, true, > > > - &order); > > > + &order, sbinfo); > > > if (IS_ERR(folio)) { > > > alloc_nohuge: > > > folio = shmem_alloc_and_acct_folio(gfp, inode, index, false, > > > - &order); > > > + &order, sbinfo); > > > } > > > if (IS_ERR(folio)) { > > > int retry = 5; > > > @@ -2147,6 +2155,8 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, > > > if (folio_test_large(folio)) { > > > folio_unlock(folio); > > > folio_put(folio); > > > + if (order > 0) > > > + order--; > > > goto alloc_nohuge; > > > } > > > unlock: > > > -- > > > 2.39.2 > > >