Re: [PATCH 6/6] shmem: add large folios support to the write path

Yosry Ahmed <yosryahmed@xxxxxxxxxx> · Tue, 19 Sep 2023 14:51:40 -0700

On Tue, Sep 19, 2023 at 2:47 PM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote:
>
> On Tue, Sep 19, 2023 at 09:00:16AM -0700, Yosry Ahmed wrote:
> > On Tue, Sep 19, 2023 at 6:27 AM Daniel Gomez <da.gomez@xxxxxxxxxxx> wrote:
> > >
> > > On Mon, Sep 18, 2023 at 11:55:34AM -0700, Yosry Ahmed wrote:
> > > > On Mon, Sep 18, 2023 at 1:00 AM Daniel Gomez <da.gomez@xxxxxxxxxxx> wrote:
> > > > >
> > > > > On Fri, Sep 15, 2023 at 11:26:37AM -0700, Yosry Ahmed wrote:
> > > > > > On Fri, Sep 15, 2023 at 2:51 AM Daniel Gomez <da.gomez@xxxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > Add large folio support for shmem write path matching the same high
> > > > > > > order preference mechanism used for iomap buffered IO path as used in
> > > > > > > __filemap_get_folio().
> > > > > > >
> > > > > > > Use the __folio_get_max_order to get a hint for the order of the folio
> > > > > > > based on file size which takes care of the mapping requirements.
> > > > > > >
> > > > > > > Swap does not support high order folios for now, so make it order 0 in
> > > > > > > case swap is enabled.
> > > > > >
> > > > > > I didn't take a close look at the series, but I am not sure I
> > > > > > understand the rationale here. Reclaim will split high order shmem
> > > > > > folios anyway, right?
> > > > >
> > > > > For context, this is part of the enablement of large block sizes (LBS)
> > > > > effort [1][2][3], so the assumption here is that the kernel will
> > > > > reclaim memory with the same (large) block sizes that were written to
> > > > > the device.
> > > > >
> > > > > I'll add more context in the V2.
> > > > >
> > > > > [1] https://protect2.fireeye.com/v1/url?k=a80aab33-c981be05-a80b207c-000babff9b5d-b656d8860b04562f&q=1&e=46666acf-d70d-4e8d-8d00-b027808ae400&u=https%3A%2F%2Fkernelnewbies.org%2FKernelProjects%2Flarge-block-size
> > > > > [2] https://protect2.fireeye.com/v1/url?k=3f753ca2-5efe2994-3f74b7ed-000babff9b5d-e678f885471555e3&q=1&e=46666acf-d70d-4e8d-8d00-b027808ae400&u=https%3A%2F%2Fdocs.google.com%2Fspreadsheets%2Fd%2Fe%2F2PACX-1vS7sQfw90S00l2rfOKm83Jlg0px8KxMQE4HHp_DKRGbAGcAV-xu6LITHBEc4xzVh9wLH6WM2lR0cZS8%2Fpubhtml%23
> > > > > [3] https://lore.kernel.org/all/ZQfbHloBUpDh+zCg@xxxxxxxxxxxxxxxxxxx/
> > > > > >
> > > > > > It seems like we only enable high order folios if the "noswap" mount
> > > > > > option is used, which is fairly recent. I doubt it is widely used.
> > > > >
> > > > > For now, I skipped the swap path as it currently lacks support for
> > > > > high order folios. But I'm currently looking into it as part of the LBS
> > > > > effort (please check spreadsheet at [2] for that).
> > > >
> > > > Thanks for the context, but I am not sure I understand.
> > > >
> > > > IIUC we are skipping allocating large folios in shmem if swap is
> > > > enabled in this patch. Swap does not support swapping out large folios
> > > > as a whole (except THPs), but page reclaim will split those large
> > > > folios and swap them out as order-0 pages anyway. So I am not sure I
> > > > understand why we need to skip allocating large folios if swap is
> > > > enabled.
> > >
> > > I lifted noswap condition and retested it again on top of 230918 and
> > > there is some regression. So, based on the results I guess the initial
> > > requirement may be the way to go. But what do you think?
> > >
> > > Here the logs:
> > > * shmem-large-folios-swap: https://gitlab.com/-/snippets/3600360
> > > * shmem-baseline-swap : https://gitlab.com/-/snippets/3600362
> > >
> > > -Failures: generic/080 generic/126 generic/193 generic/633 generic/689
> > > -Failed 5 of 730 tests
> > > \ No newline at end of file
> > > +Failures: generic/080 generic/103 generic/126 generic/193 generic/285 generic/436 generic/619 generic/633 generic/689
> > > +Failed 9 of 730 tests
> > > \ No newline at end of file
> > > >
> >
> > I am not really familiar with these tests so I cannot really tell
> > what's going on. I can see "swapfiles are not supported" in the logs
> > though, so it seems like we are seeing extra failures by just lifting
> > "noswap" even without actually swapping. I am curious if this is just
> > hiding a different issue, I would at least try to understand what's
> > happening.
> >
> > Anyway, I don't have enough context here to be useful. I was just
> > making an observation about reclaim splitting shmem folios to swap
> > them out as order-0 pages, and asking why this is needed based on
> > that. I will leave it up to you and the reviewers to decide if there's
> > anything interesting here.
>
> The tests which are failing seem be related to permissions, I could not
> immediate decipher why, because as you suggest we'd just be doing the
> silly thing of splitting large folios on writepage.
>
> I'd prefer we don't require swap until those regressions would be fixed.
>
> Note that part of the rationale to enable this work is to eventually
> also extend swap code to support large order folios, so it is not like
> this would be left as-is. It is just that it may take time to resolve
> the kinks with swap.
>
> So I'd stick to nowap for now.
>
> The above tests also don't stress swap too, and if we do that I would
> imagine we might see some other undesirable failures.
>
>  Luis

I thought we already have some notion of exercising swap with large
shmem folios from THPs, so this shouldn't be new, but perhaps I am
missing something.