Jay Lan wrote:
> Ken'ichi Ohmichi wrote:
>> Hi Dave, Jay,
>>
>> Dave Anderson wrote:
>>> We just ran into a similar problem using an older version of makedumpfile,
>>> but looking at the latest makedumpfile code, it seems that you could
>>> run into the same problem.
>>>
>>> In exclude_unnecessary_pages(), if a physical page is in a memory
>>> hole, then it skips the page and continues. In our case, that happened,
>>> but when it started up again, the next legitimate pfn was well beyond
>>> the previously-read cache of 512 pages. But since the new legit page
>>> wasn't modulo-512, it didn't refresh the page cache, and it ended up
>>> using stale page data (page->flags) and ended up excluding legitimate
>>> pages:
>>>
>>>         for (; pfn < mmd->pfn_end;
>>>             pfn++, mem_map += SIZE(page),
>>>             paddr += info->page_size) {
>>>
>>>                 /*
>>>                  * Exclude the memory hole.
>>>                  */
>>>                 if (!is_in_segs(paddr))
>>>                         continue;
>>>
>>>                 if ((pfn % PGMM_CACHED) == 0) {
>>>                         if (pfn + PGMM_CACHED < mmd->pfn_end)
>>>                                 pfn_mm = PGMM_CACHED;
>>>                         else
>>>                                 pfn_mm = mmd->pfn_end - pfn;
>>>                         if (!readmem(VADDR, mem_map, page_cache,
>>>                             SIZE(page) * pfn_mm))
>>>                                 goto out;
>>>                 }
>>>
>>> We fixed it by doing something like this:
>>>
>>>                 if (!is_in_segs(paddr)) {
>>>                         reset_cache = 1;
>>>                         continue;
>>>                 }
>>>
>>>                 if (((pfn % PGMM_CACHED) == 0) || reset_cache) {
>>>                         reset_cache = 0;
>>>                         ...
>> Great, you are right.
>> Thank you for fixing it :-)
>>
>> Jay, could you try Dave's fix, as in the attached patch?
>
> Yes. I applied your version of Dave's patch and tried again.
> It failed at a different pfn f600315:
>
>
> a4700rac:/mnt/sda9/diskdump # rm dump.cd31; /var/tmp/jlan/makedumpfile
> -cd31 -e 0xe0000f60031502f0 -x vmlinux.3 vmcore-cp.3 dump.cd31
> Excluding unnecessary pages        : [ 45 %]
> pfn=f600315 flags=3c000000001026c
>
> PAGE(vaddr:e0000f60031502f0, pfn:f600315) is excluded as CACHE PAGE.
>
> Copying data                       : [100 %]
>
> The dumpfile is saved to dump.cd31.
>
> makedumpfile Completed.
>
>
> Note the flags of pfn f600315. Crash checked on the pfn using the
> vmcore-cp.3 and showed different flags:
>
> crash> kmem -p f60031502f0
>       PAGE        PHYSICAL      MAPPING  INDEX  CNT  FLAGS
> a07ffffc45d00498  f6003150000   0        0      1    3c0000000000400
> crash>
>

I ran a test on a 2-cpu machine. The legitimate page that got excluded is:

  PAGE(vaddr:e00000300313fb70, pfn:300313) is excluded as CACHE PAGE

Values of some variables in the routine while processing that page are:

  pfn=300313 flags=3026c, page.flags=0
  page_cache=0x6000000000033f60, pcache=0x6000000000037b88

Hope these data help.

Thanks,
 - jay

>
>>
>> Thanks
>> Ken'ichi Ohmichi
>>
>
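
The stale-cache problem Dave describes can be reproduced in a small standalone model. What follows is only a sketch under stated assumptions, not makedumpfile code: PGMM_CACHED and reset_cache come from the thread, while mem_map_flags, in_hole(), fill_cache(), count_stale_reads(), NUM_PFNS and the hole boundaries are hypothetical stand-ins for the dump's page structs, is_in_segs() and readmem(). The sketch also indexes the cache relative to the pfn at which it was last filled; how makedumpfile itself computes its pointer into page_cache is not shown in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PGMM_CACHED 512   /* pages fetched per cache refill, as in the thread */
#define NUM_PFNS    2048  /* hypothetical size of the toy mem_map */
#define HOLE_START  100   /* hypothetical memory hole: [HOLE_START, HOLE_END) */
#define HOLE_END    700

/* Hypothetical stand-in for the page->flags values stored in the dump. */
static unsigned long mem_map_flags[NUM_PFNS];

/* Hypothetical stand-in for !is_in_segs(paddr). */
static int in_hole(unsigned long pfn)
{
        return pfn >= HOLE_START && pfn < HOLE_END;
}

/* Hypothetical stand-in for readmem(): copy `count` entries starting at pfn. */
static void fill_cache(unsigned long *cache, unsigned long pfn,
                       unsigned long count)
{
        assert(pfn + count <= NUM_PFNS);
        memcpy(cache, &mem_map_flags[pfn], count * sizeof(cache[0]));
}

/*
 * Walk every pfn and count how often the cached flags differ from the
 * real ones. With use_reset_flag == 0 the cache is refilled only when
 * pfn is a multiple of PGMM_CACHED (the original loop); with
 * use_reset_flag != 0 it is also refilled on the first valid pfn after
 * a hole (the change Dave describes).
 */
size_t count_stale_reads(int use_reset_flag)
{
        unsigned long cache[PGMM_CACHED];
        unsigned long pfn, pfn_mm, cache_base = 0;
        size_t stale = 0;
        int reset_cache = 0;

        /* Give every page a unique, nonzero flags value. */
        for (pfn = 0; pfn < NUM_PFNS; pfn++)
                mem_map_flags[pfn] = pfn + 1;
        memset(cache, 0, sizeof(cache));

        for (pfn = 0; pfn < NUM_PFNS; pfn++) {
                if (in_hole(pfn)) {
                        if (use_reset_flag)
                                reset_cache = 1;
                        continue;
                }

                if ((pfn % PGMM_CACHED) == 0 || reset_cache) {
                        reset_cache = 0;
                        if (pfn + PGMM_CACHED < NUM_PFNS)
                                pfn_mm = PGMM_CACHED;
                        else
                                pfn_mm = NUM_PFNS - pfn;
                        fill_cache(cache, pfn, pfn_mm);
                        cache_base = pfn;  /* assumption: index from fill point */
                }

                /* Compare what the cache says with the real flags. */
                if (cache[(pfn - cache_base) % PGMM_CACHED] != mem_map_flags[pfn])
                        stale++;
        }
        return stale;
}
```

In this toy layout the hole covers pfn 512, so the modulo-only loop never refills the cache when it resumes at pfn 700: count_stale_reads(0) reports 324 stale reads (pfns 700 through 1023), while count_stale_reads(1) reports none.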