From: Petr Tesarik <ptesarik@xxxxxxx> Subject: Re: [PATCH v2 2/3] Generic handling of multi-page exclusions Date: Tue, 8 Apr 2014 08:54:36 +0200 > On Tue, 08 Apr 2014 10:49:07 +0900 (JST) > HATAYAMA Daisuke <d.hatayama at jp.fujitsu.com> wrote: > >> From: Petr Tesarik <ptesarik at suse.cz> >> Subject: [PATCH v2 2/3] Generic handling of multi-page exclusions >> Date: Fri, 4 Apr 2014 19:25:08 +0200 >> >> > When multiple pages are excluded from the dump, store the extents in the >> > mem_map_data structure, and check if anything is still pending on the >> > next invocation of __exclude_unnecessary_pages for the same mem_map. >> > >> > The start PFN of the excluded extent is set to the end of the current >> > cycle (which is equal to the start of the next cycle, see update_cycle), >> > so only the part of the excluded region which falls beyond current cycle >> > buffer is valid. If the excluded region is completely processed in the >> > current cycle, the start PFN is even bigger than the end PFN. That >> > causes nothing to be done at the beginning of the next cycle. >> > >> > There is no check whether the adjusted pfn_start is still within the >> > current cycle. Nothing bad happens if it isn't, because exclude_range() >> > is used again to exclude the remaining part, so even if the excluded >> > region happens to span more than two cycles, the code will still work >> > correctly. >> > >> > Note that clear_bit_on_2nd_bitmap_for_kernel() accepts PFNs outside the >> > current cyclic range. It willreturn FALSE, so such PFNs are not counted. >> > >> > Signed-off-by: Petr Tesarik <ptesarik at suse.cz> >> > --- >> > makedumpfile.c | 47 +++++++++++++++++++++++++++++++++-------------- >> > makedumpfile.h | 7 +++++++ >> > 2 files changed, 40 insertions(+), 14 deletions(-) >> > >> > diff --git a/makedumpfile.c b/makedumpfile.c >> > index 81c687b..9ffb901 100644 >> > --- a/makedumpfile.c >> > +++ b/makedumpfile.c >> > @@ -2385,6 +2385,9 @@ dump_mem_map(unsigned long long pfn_start, >> > mmd->pfn_end = pfn_end; >> > mmd->mem_map = mem_map; >> > >> > + mmd->exclude_pfn_start = 0ULL; >> > + mmd->exclude_pfn_end = 0ULL; >> > + >> > DEBUG_MSG("mem_map (%d)\n", num_mm); >> > DEBUG_MSG(" mem_map : %lx\n", mem_map); >> > DEBUG_MSG(" pfn_start : %llx\n", pfn_start); >> > @@ -4657,6 +4660,21 @@ initialize_2nd_bitmap_cyclic(struct cycle *cycle) >> > return TRUE; >> > } >> > >> > +static void >> > +exclude_range(unsigned long long *counter, struct mem_map_data *mmd, >> > + unsigned long long pfn, unsigned long long endpfn, struct cycle *cycle) >> > +{ >> > + while (pfn < endpfn) { >> > + if (clear_bit_on_2nd_bitmap_for_kernel(pfn, cycle)) >> > + (*counter)++; >> > + ++pfn; >> > + } >> >> Here endpfn is pfn + nr_pages in __exclude_unnecessary_pages(), and >> the pfn could be cycle->end_pfn <= pfn < endpfn. >> >> while (pfn < MIN(endpfn, cycle->end_pfn) { >> if (clear_bit_on_2nd_bitmap_for_kernel(pfn, cycle)) >> (*counter)++; >> ++pfn; >> } > > This is a non-issue: clear_bitmap_cyclic() checks the extents, and I > even mentioned it in the commit message. All right, we can save some > loop iterations by moving the check out of the loop body... > Ah, OK, I interpreted somehow your code like: + mmd->exclude_pfn_start = cycle ? endpfn: ULONGLONG_MAX; So there's no logically wrong point. BTW, hmm, the behaviour of clear_bit_on_2nd_bitmap_for_kernel() is not what I expect, that is, I don't expect sanity check in set_bitmap_cyclic(), which should have been removed by clean up we did some times ago. int set_bitmap_cyclic(char *bitmap, unsigned long long pfn, int val, struct cycle *cycle) { int byte, bit; if (pfn < cycle->start_pfn || cycle->end_pfn <= pfn) <-- this return FALSE; The code some time ago changes region of cycles in many places and it was difficult to figure out. So, in the clean up, we introduced struct cycle and for_each_cycle() to obtain invariance that cycle is not changed under for_each_cycle(). So, could you fix also this if possible? >> > + >> > + mmd->exclude_pfn_start = cycle ? cycle->end_pfn : ULONGLONG_MAX; >> >> When does cycle become NULL? > > When __exclude_unnecessary_pages() is called from > exclude_unnecessary_pages, i.e. non-cyclic. > >> Along with the above point, >> >> mmd->exclude_pfn_start = MIN(endpfn, cycle->end_pfn); >> >> and then we can continue the processing in the next cycle. > > Again, this is a non-issue. These stored extents are validated before > use in __exclude_unnecessary_pages. Why should I check them twice? > And by the way, this is also mentioned in the commit message. > >> > + mmd->exclude_pfn_end = endpfn; >> > + mmd->exclude_pfn_counter = counter; >> > +} >> > + >> > int >> > __exclude_unnecessary_pages(struct mem_map_data *mmd, struct cycle *cycle) >> > { >> > @@ -4671,6 +4689,18 @@ __exclude_unnecessary_pages(struct mem_map_data *mmd, struct cycle *cycle) >> > unsigned long flags, mapping, private = 0; >> > >> > /* >> > + * If a multi-page exclusion is pending, do it first >> > + */ >> > + if (mmd->exclude_pfn_start < mmd->exclude_pfn_end) { >> > + exclude_range(mmd->exclude_pfn_counter, mmd, >> > + mmd->exclude_pfn_start, mmd->exclude_pfn_end, >> > + cycle); >> > + >> > + mem_map += (mmd->exclude_pfn_end - pfn_start) * SIZE(page); >> > + pfn_start = mmd->exclude_pfn_end; >> > + } >> > + >> > + /* >> > * Refresh the buffer of struct page, when changing mem_map. >> > */ >> > pfn_read_start = ULONGLONG_MAX; >> > @@ -4734,21 +4764,10 @@ __exclude_unnecessary_pages(struct mem_map_data *mmd, struct cycle *cycle) >> > if ((info->dump_level & DL_EXCLUDE_FREE) >> > && info->page_is_buddy >> > && info->page_is_buddy(flags, _mapcount, private, _count)) { >> > - int i, nr_pages = 1 << private; >> > + int nr_pages = 1 << private; >> > + >> > + exclude_range(&pfn_free, mmd, pfn, pfn + nr_pages, cycle); >> > >> > - for (i = 0; i < nr_pages; ++i) { >> > - /* >> > - * According to combination of >> > - * MAX_ORDER and size of cyclic >> > - * buffer, this clearing bit operation >> > - * can overrun the cyclic buffer. >> > - * >> > - * See check_cyclic_buffer_overrun() >> > - * for the detail. >> > - */ >> > - if (clear_bit_on_2nd_bitmap_for_kernel((pfn + i), cycle)) >> > - pfn_free++; >> > - } >> > pfn += nr_pages - 1; >> > mem_map += (nr_pages - 1) * SIZE(page); >> > } >> > diff --git a/makedumpfile.h b/makedumpfile.h >> > index 951ed1b..dfad569 100644 >> > --- a/makedumpfile.h >> > +++ b/makedumpfile.h >> > @@ -816,6 +816,13 @@ struct mem_map_data { >> > unsigned long long pfn_start; >> > unsigned long long pfn_end; >> > unsigned long mem_map; >> > + >> > + /* >> > + * for excluding multi-page regions >> > + */ >> > + unsigned long exclude_pfn_start; >> > + unsigned long exclude_pfn_end; >> >> unsigned long long exclude_pfn_start; >> unsigned long long exclude_pfn_end; >> >> The integers representing page frame numbers need to be defined as >> unsigned long long for architectures where physical address can have >> 64-bit length but unsigned long has 32-bit only, such as x86 PAE. > > Ouch. My mistake. I thought I covered all places, but somehow I > missed this one. I'm going to post a fixed series. > >> Kumagai-san, I saw this sometimes in the past. How about introducing >> specific abstract type for page frame number like below? >> >> typedef unsigned long long pfn_t; >> >> maybe with some prefix. I think this also helps code readability >> because unsigned long long is too long. >> >> > + unsigned long long *exclude_pfn_counter; >> > }; >> >> Also, it seems to me better to introduce a new type for this >> processing rather than extending existing code. struct mem_map_data is >> not specific for the excluding processing. > > Kind of agreed. OTOH it will most likely be embedded in struct > mem_map_data anyway, because exactly one such object per mm is needed. > > Petr T I don't understand well. It seems to me a single object is enough. Is it possible to nr_pages cover multiple mm's? Thanks. HATAYAMA, Daisuke