Re: [PATCH next 3/3] shmem: Fix "Unused swap" messages

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Tue, 4 Jan 2022 01:41:11 +0000




On Mon, Jan 03, 2022 at 12:10:21PM -0800, Hugh Dickins wrote:
> On Mon, 3 Jan 2022, Matthew Wilcox wrote:
> > On Sun, Jan 02, 2022 at 05:35:50PM -0800, Hugh Dickins wrote:
> > > shmem_swapin_page()'s swap_free() has occasionally been generating
> > > "_swap_info_get: Unused swap offset entry" messages.  Usually that's
> > > no worse than noise; but perhaps it indicates a worse case, when we
> > > might there be freeing swap already reused by others.
> > > 
> > > The multi-index xas_find_conflict() loop in shmem_add_to_page_cache()
> > > did not allow for entry found NULL when expected to be non-NULL, so did
> > > not catch that race when the swap has already been freed.
> > > 
> > > The loop would not actually catch a realistic conflict which the single
> > > check does not catch, so revert it back to the single check.
> > 
> > I think what led to the loop was concern for the xa_state if trying
> > to find a swap entry that's smaller than the size of the folio.
> > So yes, the loop was expected to execute twice, but I didn't consider
> > the case where we were looking for something non-NULL and actually found
> > NULL.
> > 
> > So should we actually call xas_find_conflict() twice (if we're looking
> > for something non-NULL), and check that we get @expected, followed by
> > NULL?
> 
> Sorry, I've no idea.
> 
> You say "twice", and that does not fit the imaginary model I had when I
> said "The loop would not actually catch a realistic conflict which the
> single check does not catch".
> 
> I was imagining it either looking at a single entry, or looking at an
> array of (perhaps sometimes in shmem's case 512) entries, looking for
> conflict with the supplied pointer/value expected there.
> 
> The loop technique was already unable to report on unexpected NULLs,
> and the single test would catch a non-NULL entry different from an
> expected non-NULL entry.  Its only relative weakness appeared to be
> if that array contained (perhaps some NULLs then) a "narrow" instance
> of the same pointer/value that was expected to fill the array; and I
> didn't see any possibility for shmem to be inserting small and large
> folios sharing the same address at the same time.
> 
> That "explanation" may make no sense to you, don't worry about it;
> just as "twice" makes no immediate sense to me - I'd have to go off
> and study multi-index XArray to make sense of it, which I'm not
> about to do.
> 
> I've seen no problems with the proposed patch, but if you see a real
> case that it's failing to cover, yes, please do improve it of course.
> 
> Though now I'm wondering if the "loop" totally misled me; and your
> "twice" just means that we need to test first this and then that and
> we're done - yeah, maybe.

Sorry; I wrote the above in a hurry and early in the morning, so probably
even less coherent than usual.  Also, the multi-index xarray code is
new to everyone (and adds new things to consider), so it's no surprise
we're talking past each other.  It's a bit strange for me to read your
explanations because you're only reading what I actually wrote instead
of what I intended to write.

So let me try again.  My concern was that we might be trying to store
a 2MB entry which had a non-NULL 'expected' entry which was found in a
4k (ie single-index) slot within the 512 entries (as the first non-NULL
entry in that range), and we'd then store the 2MB entry into a
single-entry slot.

Now, maybe that can't happen for higher-level reasons, and I don't need
to worry about it.  But I feel like we should check for that?  Anyway,
I think the right fix is this:

+++ b/mm/shmem.c
@@ -733,11 +733,12 @@ static int shmem_add_to_page_cache(struct page *page,
        cgroup_throttle_swaprate(page, gfp);

        do {
-               void *entry;
                xas_lock_irq(&xas);
-               while ((entry = xas_find_conflict(&xas)) != NULL) {
-                       if (entry == expected)
-                               continue;
+               if (expected != xas_find_conflict(&xas)) {
+                       xas_set_err(&xas, -EEXIST);
+                       goto unlock;
+               }
+               if (expected && xas_find_conflict(&xas)) {
                        xas_set_err(&xas, -EEXIST);
                        goto unlock;
                }

which says what I mean.  I certainly didn't intend to imply that I
was expecting to see 512 consecutive entries which were all identical,
which would be the idiomatic way to read the code that was there before.
I shouldn't've tried to be so concise.

(If you'd rather I write any of this differently, I'm more than happy
to change it)