Re: [RFC PATCH 18/18] mm: Add pfn_range_put()

On 01/09/2023 05:27, Matthew Wilcox wrote:
> On Thu, Aug 31, 2023 at 08:03:14PM +0100, Ryan Roberts wrote:
>>> +++ b/include/linux/mm.h
>>> @@ -1515,6 +1515,13 @@ typedef union {
>>>  void release_pages(release_pages_arg, int nr);
>>>  void folios_put(struct folio_batch *folios);
>>>  
>>> +struct pfn_range {
>>> +	unsigned long start;
>>> +	unsigned long end;
>>> +};
>>
>> As mentioned in the other thread, I think we need to better convey that a
>> pfn_range must be fully contained within a single folio. Perhaps it's better to
>> call it `struct folio_region`?
> 
> Yep, I'm not taking a strong position on that.  We can call it whatever
> you like, subject to usual nitpicking later.
> 
>>> +
>>> +void pfn_range_put(struct pfn_range *range, unsigned int nr);
>>
>> If going with `struct folio_region`, we could call this folios_put_refs()?
>>
>> Or if you hate that idea and want to stick with pfn, then at least call it
>> pfn_ranges_put() (with s on ranges).
> 
> folio_regions_put()?  I don't know at this point; nothing's speaking
> to me.

I'm inclined to call it `struct folio_region` and
free_folio_regions_and_swap_cache(). Then it replaces
free_pages_and_swap_cache() (as per suggestion against your patch set). For now,
I'll implement free_folio_regions_and_swap_cache() similarly to release_pages()
+ free_swap_cache(). Then your patch set just has to reimplement
free_folio_regions_and_swap_cache() using the folio_batch approach. Shout if
that sounds problematic. Otherwise I'll post a new version of my patch set later
today.

> 
>>> +void pfn_range_put(struct pfn_range *range, unsigned int nr)
>>> +{
>>> +	struct folio_batch folios;
>>> +	unsigned int i;
>>> +	struct lruvec *lruvec = NULL;
>>> +	unsigned long flags = 0;
>>> +
>>> +	folio_batch_init(&folios);
>>> +	for (i = 0; i < nr; i++) {
>>> +		struct folio *folio = pfn_folio(range[i].start);
>>> +		unsigned int refs = range[i].end - range[i].start;
>>
>> Don't you need to check for huge zero page like in folios_put()?
>>
>> 		if (is_huge_zero_page(&folio->page))
>> 			continue;
> 
> Maybe?  Can we put the huge zero page in here, or would we filter it
> out earlier?

I'm not sure there is any earlier opportunity? The next earliest would be to
filter it out from the mmugather batch, and that feels like confusing
responsibilities too much. So my vote is to put the check here.


> 
>>> +	if (lruvec)
>>> +		unlock_page_lruvec_irqrestore(lruvec, flags);
>>> +
>>> +	if (folios.nr) {
>>> +		mem_cgroup_uncharge_folios(&folios);
>>> +		free_unref_folios(&folios);
>>> +	}
>>> +}
>>
>> This still duplicates a lot of the logic in folios_put(), but I see you have an
>> idea in the other thread for improving this situation - I'll reply in the
>> context of that thread.
>>
>> But overall this looks great - thanks for taking the time to put this together -
>> it will integrate nicely with my mmugather series.
>>
> 
> Thanks!  One thing I was wondering; intended to look at today, but didn't
> have time for it -- can we use mmugather to solve how vmscan wants to
> handle batched TLB IPIs for mapped pages, instead of having its own thing?
> 72b252aed506 is the commit to look at.

Yeah, I thought about that too. mmugather works well for TLBIing a contiguous
range of VAs (it just maintains a range that gets expanded to the lowest and
highest entries). That works well for things like munmap and exit_mmap, where a
virtually contiguous block is being zapped. But I'm guessing that for vmscan,
the pages are usually not contiguous? If so, then I think mmugather would
over-TLBI and likely harm perf.
