On 2024/3/23 03:37, Yosry Ahmed wrote:
> On Fri, Mar 22, 2024 at 9:40 AM <chengming.zhou@xxxxxxxxx> wrote:
>>
>> From: Chengming Zhou <chengming.zhou@xxxxxxxxx>
>>
>> There is a report of data corruption caused by double swapin, which is
>> only possible in the skip-swapcache path on SWP_SYNCHRONOUS_IO backends.
>>
>> The root cause is that zswap is unlike other "normal" swap backends:
>> it does not keep a copy of the data after the first swapin. So if the
>> folio loaded on the first swapin can't be installed in the page table
>> successfully, we just free it directly. On the second swapin we then
>> find nothing in zswap and read stale data from the swapfile, which is
>> how this data corruption happens.
>>
>> We can fix it by always adding the folio to the swapcache when we know
>> the pinned swap entry can be found in zswap, so the data won't get
>> freed even if the folio can't be installed successfully on the first
>> swapin.
>
> A concurrent faulting thread could have already checked the swapcache
> before we add the folio to it, right? In this case, that thread will
> go ahead and call swap_read_folio() anyway.

Right, but it has to lock the folio to proceed.

>
> Also, I suspect the zswap lookup might hurt performance. Would it be
> better to add the folio back to zswap upon failure? This should be
> detectable by checking if the folio is dirty, as I mentioned in the bug
> report thread.

Yes, it may hurt performance. As for adding the folio back to zswap upon
failure, the problem is that the add may fail too... and I don't know
how to handle that.

Anyway, I think the fix from Johannes is much better; we should take
that approach.

Thanks.