Re: [PATCH v6 4/5] mm/migrate: skip migrating folios under writeback with AS_WRITEBACK_INDETERMINATE mappings

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Dec 19, 2024 at 11:14:49AM -0500, Zi Yan wrote:
> On 19 Dec 2024, at 11:09, Bernd Schubert wrote:
> 
> > On 12/19/24 17:02, Zi Yan wrote:
> >> On 19 Dec 2024, at 11:00, Zi Yan wrote:
> >>> On 19 Dec 2024, at 10:56, Bernd Schubert wrote:
> >>>
> >>>> On 12/19/24 16:55, Zi Yan wrote:
> >>>>> On 19 Dec 2024, at 10:53, Shakeel Butt wrote:
> >>>>>
> >>>>>> On Thu, Dec 19, 2024 at 04:47:18PM +0100, David Hildenbrand wrote:
> >>>>>>> On 19.12.24 16:43, Shakeel Butt wrote:
> >>>>>>>> On Thu, Dec 19, 2024 at 02:05:04PM +0100, David Hildenbrand wrote:
> >>>>>>>>> On 23.11.24 00:23, Joanne Koong wrote:
> >>>>>>>>>> For migrations called in MIGRATE_SYNC mode, skip migrating the folio if
> >>>>>>>>>> it is under writeback and has the AS_WRITEBACK_INDETERMINATE flag set on its
> >>>>>>>>>> mapping. If the AS_WRITEBACK_INDETERMINATE flag is set on the mapping, the
> >>>>>>>>>> writeback may take an indeterminate amount of time to complete, and
> >>>>>>>>>> waits may get stuck.
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Joanne Koong <joannelkoong@xxxxxxxxx>
> >>>>>>>>>> Reviewed-by: Shakeel Butt <shakeel.butt@xxxxxxxxx>
> >>>>>>>>>> ---
> >>>>>>>>>>    mm/migrate.c | 5 ++++-
> >>>>>>>>>>    1 file changed, 4 insertions(+), 1 deletion(-)
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/mm/migrate.c b/mm/migrate.c
> >>>>>>>>>> index df91248755e4..fe73284e5246 100644
> >>>>>>>>>> --- a/mm/migrate.c
> >>>>>>>>>> +++ b/mm/migrate.c
> >>>>>>>>>> @@ -1260,7 +1260,10 @@ static int migrate_folio_unmap(new_folio_t get_new_folio,
> >>>>>>>>>>    		 */
> >>>>>>>>>>    		switch (mode) {
> >>>>>>>>>>    		case MIGRATE_SYNC:
> >>>>>>>>>> -			break;
> >>>>>>>>>> +			if (!src->mapping ||
> >>>>>>>>>> +			    !mapping_writeback_indeterminate(src->mapping))
> >>>>>>>>>> +				break;
> >>>>>>>>>> +			fallthrough;
> >>>>>>>>>>    		default:
> >>>>>>>>>>    			rc = -EBUSY;
> >>>>>>>>>>    			goto out;
> >>>>>>>>>
> >>>>>>>>> Ehm, doesn't this mean that any fuse user can essentially completely block
> >>>>>>>>> CMA allocations, memory compaction, memory hotunplug, memory poisoning... ?!
> >>>>>>>>>
> >>>>>>>>> That sounds very bad.
> >>>>>>>>
> >>>>>>>> The page under writeback are already unmovable while they are under
> >>>>>>>> writeback. This patch is only making potentially unrelated tasks to
> >>>>>>>> synchronously wait on writeback completion for such pages which in worst
> >>>>>>>> case can be indefinite. This actually is solving an isolation issue on a
> >>>>>>>> multi-tenant machine.
> >>>>>>>>
> >>>>>>> Are you sure, because I read in the cover letter:
> >>>>>>>
> >>>>>>> "In the current FUSE writeback design (see commit 3be5a52b30aa ("fuse:
> >>>>>>> support writable mmap"))), a temp page is allocated for every dirty
> >>>>>>> page to be written back, the contents of the dirty page are copied over to
> >>>>>>> the temp page, and the temp page gets handed to the server to write back.
> >>>>>>> This is done so that writeback may be immediately cleared on the dirty
> >>>>>>> page,"
> >>>>>>>
> >>>>>>> Which to me means that they are immediately movable again?
> >>>>>>
> >>>>>> Oh sorry, my mistake, yes this will become an isolation issue with the
> >>>>>> removal of the temp page in-between which this series is doing. I think
> >>>>>> the tradeoff is between extra memory plus slow write performance versus
> >>>>>> temporary unmovable memory.
> >>>>>
> >>>>> No, the tradeoff is slow FUSE performance vs whole system slowdown due to
> >>>>> memory fragmentation. AS_WRITEBACK_INDETERMINATE indicates it is not
> >>>>> temporary.
> >>>>
> >>>> Is there is a difference between FUSE TMP page being unmovable and
> >>>> AS_WRITEBACK_INDETERMINATE folios/pages being unmovable?
> >>
> >> (Fix my response location)
> >>
> >> Both are unmovable, but you can control where FUSE TMP page
> >> can come from to avoid spread across the entire memory space. For example,
> >> allocate a contiguous region as a TMP page pool.
> >
> > Wouldn't it make sense to have that for fuse writeback pages as well?
> > Fuse tries to limit dirty pages anyway.
> 
> Can fuse constraint the location of writeback pages? Something like what
> I proposed[1], migrating pages to a location before their writeback? Will
> that be a performance concern?
> 
> In terms of the number of dirty pages, you only need one page out of 512
> pages to prevent 2MB THP from allocation. For CMA allocation, one unmovable
> page can kill one contiguous range. What is the limit of fuse dirty pages?
> 
> [1] https://lore.kernel.org/linux-mm/90C41581-179F-40B6-9801-9C9DBBEB1AF4@xxxxxxxxxx/

I think this whole concern of fuse making system memory unmovable
forever is overblown. Fuse is already using a temp (unmovable) page for
the writeback and is slow and is being removed in this series.




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux