Re: [PATCH v1] fs/fuse: Fix missing FOLL_PIN for direct-io

Thank you very much, Miklos!

Yes. It is not easy to reproduce the issue in real applications. We only observed it in our own testing tool, which runs multiple tests concurrently. We have not been able to reproduce it with simple code yet.

-lei

On 3/6/24 07:05, Miklos Szeredi wrote:
On Wed, 6 Mar 2024 at 12:16, Bernd Schubert <bernd.schubert@xxxxxxxxxxx> wrote:



On 3/6/24 11:01, Miklos Szeredi wrote:
On Tue, 29 Aug 2023 at 20:37, Lei Huang <lei.huang@xxxxxxxxxxxxxxx> wrote:

Our user-space filesystem relies on fuse to provide a POSIX interface.
In our test, a known string is written into a file and the content
is read back later to verify that the correct data is returned. In rare
cases we observed wrong data in the read buffer, although the correct
data is stored in our filesystem.
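
The write-then-read-back check described above could look roughly like the following user-space sketch. The helper name roundtrip_check() and its return convention are hypothetical and are not taken from the testing tool mentioned in this thread. The sketch uses plain open()/read()/write(); whether a read takes the direct-io path depends on how the file is opened and on the fuse filesystem, which is outside what this example controls.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original test tool): write `len` bytes
 * of `data` to `path`, read them back, and compare.
 * Returns 0 on match, 1 on mismatch, -1 on I/O or allocation error.
 */
static int roundtrip_check(const char *path, const char *data, size_t len)
{
    int fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    /* Write the known pattern, tolerating short writes. */
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, data + done, len - done);
        if (n <= 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }

    char *buf = malloc(len ? len : 1);
    if (!buf || lseek(fd, 0, SEEK_SET) != 0) {
        free(buf);
        close(fd);
        return -1;
    }

    /* Read the content back, tolerating short reads. */
    done = 0;
    while (done < len) {
        ssize_t n = read(fd, buf + done, len - done);
        if (n <= 0) {
            free(buf);
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }

    /* 0 if the buffer matches what was written, 1 otherwise. */
    int rc = memcmp(buf, data, len) != 0;
    free(buf);
    close(fd);
    return rc;
}
```

A test harness would call this repeatedly, and concurrently from several processes, against files on the fuse mount and flag any call that returns 1.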

The fuse kernel module calls iov_iter_get_pages2() to get the physical
pages of the user-space read buffer passed in read(). The pages are
not pinned, so nothing prevents page migration. When page migration
occurs, the consequences are twofold:

1) Applications do not receive correct data in the read buffer.
2) The fuse kernel module writes data to the wrong place.

Using iov_iter_extract_pages() to pin pages fixes the issue in our
test.

An auxiliary variable "struct page **pt_pages" is used in the patch
to prepare the 2nd parameter for iov_iter_extract_pages() since
iov_iter_get_pages2() uses a different type for the 2nd parameter.
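
To illustrate why an auxiliary variable is needed when the 2nd parameter changes type, here is a small user-space model. It is a sketch only: struct page_model, model_extract_pages() and fill_from_slot() are hypothetical stand-ins, not kernel APIs, and the model assumes the caller always supplies the array.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct page; purely illustrative. */
struct page_model { int id; };

/*
 * Models a callee that takes the page array by address (T ***), the
 * parameter shape the patch description attributes to
 * iov_iter_extract_pages(). Assumption: the caller supplies the array.
 */
static size_t model_extract_pages(struct page_model ***pages,
                                  struct page_model *pool, size_t count)
{
    struct page_model **slots = *pages;

    for (size_t i = 0; i < count; i++)
        slots[i] = &pool[i];
    return count;
}

/* Fill `count` entries of `array`, starting at index `first_free`. */
static size_t fill_from_slot(struct page_model **array, size_t first_free,
                             struct page_model *pool, size_t count)
{
    /*
     * &array[first_free] is a value, not an lvalue, so its address cannot
     * be taken directly. An auxiliary variable holds it so that &pt_pages
     * has the T *** type the callee expects.
     */
    struct page_model **pt_pages = &array[first_free];

    return model_extract_pages(&pt_pages, pool, count);
}
```

With the older parameter shape (T **), the expression &array[first_free] could be passed as-is, which is why no such variable was needed before.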

Signed-off-by: Lei Huang <lei.huang@xxxxxxxxxxxxxxx>

Applied, with a modification to only unpin if
iov_iter_extract_will_pin() returns true.
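
The reason for the conditional unpin can be shown with a small user-space model. This is a sketch under stated assumptions, not the applied patch: all types and functions below are hypothetical stand-ins, and the model assumes that extraction takes a pin only for user-backed iterators, which is what iov_iter_extract_will_pin() reports in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct page and struct iov_iter. */
struct page_model { int pin_count; };
struct iter_model { bool user_backed; };

/* Models iov_iter_extract_will_pin(): pins are taken only for user memory. */
static bool model_will_pin(const struct iter_model *it)
{
    return it->user_backed;
}

/* Models extraction: takes a pin on each page only if the iterator pins. */
static void model_extract(const struct iter_model *it,
                          struct page_model **pages, size_t n)
{
    if (!model_will_pin(it))
        return;
    for (size_t i = 0; i < n; i++)
        pages[i]->pin_count++;
}

/* Release path: drop pins only if extraction actually took them. */
static void model_release(struct page_model **pages, size_t n, bool pinned)
{
    if (!pinned)
        return;
    for (size_t i = 0; i < n; i++)
        pages[i]->pin_count--;
}
```

In this model, calling model_release() with pinned set to true for pages that came from a kernel-backed iterator would drive pin_count below zero, which is the imbalance the conditional avoids.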

Hi Miklos,

do you have an idea if this needs to be backported and to which kernel
version?
I had tried to reproduce data corruption with 4.18 - Lei wrote that he
could see issues with older kernels as well, but I never managed to
trigger anything on 4.18-RHEL. Typically I use ql-fstest
(https://github.com/bsbernd/ql-fstest) and even added random DIO as an
option - nothing was reported in weeks of run time. I could try again with
more recent kernels that have folios.

I don't think that corruption will happen in real life.  So I'm not
sure we need to bother with backporting, and definitely not to kernels
from before the infrastructure was introduced.

Thanks,
Miklos



