-----Original Message----- > On Mon, Jun 21, 2021 at 06:02:51AM +0000, HAGIO KAZUHITO(萩尾 一仁) wrote: > > -----Original Message----- > > > + > > > + if (read_pd(dd->dfd, offset, &pd)) { > > > + /* > > > + * Truncated page descriptor at most > > > + * references full page. > > > + */ > > > + expected_size += block_size; > > > + goto next; > > > + } > > > + > > > + if (pd.offset == 0) { > > > + if (!first_empty_pd) > > > + first_empty_pd = page_idx; > > > + /* > > > + * Incomplete pages at most use the > > > + * whole page. > > > + */ > > > + expected_size += block_size; > > > + } else if (!pd.flags) { > > > + /* > > > + * Zero page has no compression flags. > > > + */ > > > > Non-compressed page also has no compression flags. > > > > So something like (pd.offset == dd->data_offset) is needed to determine > > whether the page is zero page? although it's not exact without excluding > > zero pages. > > > > Hi Kazu, > > Yes, you're right. Thanks for spotting it. > > I've added a code path for a case when zero pages are not it's excluded. > It's a no brainer. However, I've got some issues figuring out whether a > page descriptor references zero page when we start to differentiate > zero/non-zero uncompressed pages and calculate expected size accordingly. > > dd->data_offset points to the beginning of page descriptors and it > doesn't help to find zero page: > > data_offset -> pd for pfn 0 > pd for pfn 1 > ... > pd for pfn N > ... <-some gap ??? > zero page > pfn 0 Oh, you're right, I misread something. There should be no gap between pd for pfn N and zero page, but anyway we cannot get the number of pds in advance without counting the bitmap.. > > The essense of what I've got so far is the following: > > if (page_idx == 0) > first_page_offset = pd.offset; > > [...] > } else if (!pd.flags) { > /* The page is not compressed */ > if (dd->flags & ZERO_EXCLUDED) { > /* > * Zero page is located just before the first > * dumpable pfn. The following pages are non-zero. > */ > if (pd.offset >= first_page_offset) > expected_size += block_size; > } else { > expected_size += block_size; > } > } else if (pd.flags) { > [...] > > As you can see it would work but only if pd for pfn 0 doesn't reference > zero page. Do you have any idea how can we reliably determine zero page > offset? It can be that pd for pfn0 points to zero page, especially considering makedumpfile --split option. What I thought of are: 1. Two-pass check Count the number of dumpable pages first, calculate the offset of zero page, then check pd.offset. 2. Online adjustment, i.e. if (pd.offset > first_page_offset) // the first page is always regarded as zero page expected_size += block_size; else if (pd.offset < first_page_offset) { first_page_offset = pd.offset // should be zero page expected_size += block_size; // for the first page missed } but this is a bit tricky and confusing. Also, there is need to add one page size for zero page if ZERO_EXCLUDED either way. Personally I think that the former is better, but maybe depending on how much time is consumed with large dumps. Thanks, Kazu -- Crash-utility mailing list Crash-utility@xxxxxxxxxx https://listman.redhat.com/mailman/listinfo/crash-utility