RE: [PATCH v7 4/4] mm: vmalloc: convert vread() to vread_iter()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Baoquan He
> Sent: 23 March 2023 13:32
...
> > > > If this fails, then we fault in, and try again. We loop because there could
> > > > be some extremely unfortunate timing with a race on e.g. swapping out or
> > > > migrating pages between faulting in and trying to write out again.
> > > >
> > > > This is extremely unlikely, but to avoid any chance of breaking userland we
> > > > repeat the operation until it completes. In nearly all real-world
> > > > situations it'll either work immediately or loop once.
> > >
> > > Thanks a lot for these helpful details with patience. I got it now. I was
> > > mainly confused by the while(true) loop in KCORE_VMALLOC case of read_kcore_iter.
> > >
> > > Now is there any chance that the faulted in memory is swapped out or
> > > migrated again before vread_iter()? fault_in_iov_iter_writeable() will
> > > pin the memory? I didn't find it from code and document. Seems it only
> > > falults in memory. If yes, there's window between faluting in and
> > > copy_to_user_nofault().
> > >
> >
> > See the documentation of fault_in_safe_writeable():
> >
> > "Note that we don't pin or otherwise hold the pages referenced that we fault
> > in.  There's no guarantee that they'll stay in memory for any duration of
> > time."
> 
> Thanks for the info. Then swapping out/migration could happen again, so
> that's why while(true) loop is meaningful.

One of the problems is that is the system is under severe memory
pressure and you try to fault in (say) 20 pages, the first page
might get unmapped in order to map the last one in.

So it is quite likely better to retry 'one page at a time'.

There have also been cases where the instruction to copy data
has faulted for reasons other than 'page fault'.
ISTR an infinite loop being caused by misaligned accesses failing
due to 'bad instruction choice' in the copy code.
While this is rally a bug, an infinite retry in a file read/write
didn't make it easy to spot.

So maybe there are cases where a dropping back to a 'bounce buffer'
may be necessary.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux