----- Original Message ----- > On 10/24/2014 10:05 PM, Dave Anderson wrote: > > With the patch I that proposed, you can simplify this part of your patch: > > > > + unsigned int e_flag = (NULL == nd->elf64) ? > > + (nd->elf32)->e_flags : (nd->elf64)->e_flags; > > + int status = e_flag& DUMP_ELF_INCOMPLETE; > > + if (status&& read_ret>= 0) { > > > > to just this: > > if (is_incomplete_dump()&& (read_ret>= 0)) { > > > > But what about compressed kdumps? In the case of DUMP_DH_COMPRESSED_INCOMPLETE > > dumpfiles, will makedumpfile manipulate the bitmaps and the page_desc.offset values > > of all "missing" pages to point to the special "pd_zero" page? Will it be > > impossible to distinguish between legitimate zero-filled pages and > > "missing" pages? > > > > I suppose that if that is true, then the behavior should be the same for > > both ELF and compressed kdumps, which would be to return zero-filled data. > > > > On the other hand... > > > > What's bothersome about automatically returning zero-filled data is that, > > currently, if an ELF vmcore was originally created correctly, but gets truncated > > *after* the fact, for example, when downloading it from a customer site, > > check_dumpfile_size() will report the error. But when data from the truncated > > region gets read, a read error is returned. > > > > If zero-filled data is silently returned, I can imagine that there will be all > > kinds of errors that will occur -- both display/content errors, but it could > > easily cause SIGSEGV's in the crash utility when it tries to use zero-filled > > data thinking that it's legitimate. How would the user even know whether it's > > because the page was truncated or not? > > The patch has been adjusted according to your suggestions. The kdump part has been > fixed. And the warning information is added in both elf part and kdump part. > Thank you for your guiding. > > > The crash utility already has a --zero_excluded flag for returning zero-filled > > memory if by chance a *legitimately* excluded/filtered page is read. It's > > hardly ever used, because those pages should never be required. But I > > wonder whether it could be re-purposed in this case, where by default an error > > would be returned when reading missing pages, and zero-filled only if it is > > explicitly requested with the --zero_excluded command line option? Or would > > that be impossible with DUMP_DH_COMPRESSED_INCOMPLETE compressed kdumps? > > I think it's a good idea and is completed in patch. It can also work in kdump part. > The attachment is the latest patch. > > Thanks > Zhou Wenjian A couple notes and questions re: your latest patch: diff --git a/diskdump.c b/diskdump.c index 1d6f7a5..850b174 100644 --- a/diskdump.c +++ b/diskdump.c @@ -999,6 +999,17 @@ cache_page(physaddr_t paddr) if (FLAT_FORMAT()) { if (!read_flattened_format(dd->dfd, pd.offset, dd->compressed_page, pd.size)) return READ_ERROR; What about FLAT_FORMAT()? Should the (is_incomplete_dump()/pd.offset==0) test below be done before the FLAT_FORMAT() check? + } else if (is_incomplete_dump() && (0 == pd.offset)) { Is this how the makedumpfile patch marks pages that have been truncated, i.e., by creating a page_desc_t that has a pd.offset == 0? + /* + * if --zero_excluded is specified and incomplete flag is set in dump file, + * zero will be used to fill the lost part. + */ + if (!(*diskdump_flags & ZERO_EXCLUDED)) + return READ_ERROR; + error(WARNING, "%s: data may be lost\n" + " pfn:%lld\n\n" + ,pc->dumpfile,pfn); When --zero_excluded is entered on the crash command line, then the user knows exactly what he is doing, and so there should not be any error/warning messages from the read command itself. A message should only be displayed if debug mode is turned on. For example, this is how it is done currently, but only if debug is set to 8: if (CRASHDEBUG(8)) fprintf(fp, "read_diskdump: zero-fill: " "paddr/pfn: %llx/%lx\n", (ulonglong)paddr, pfn); + memset(dd->compressed_page, 0, dd->block_size); } else { if (lseek(dd->dfd, pd.offset, SEEK_SET) == failed) return SEEK_ERROR; diff --git a/netdump.c b/netdump.c index abc85e0..c8f057e 100644 --- a/netdump.c +++ b/netdump.c @@ -608,7 +608,21 @@ read_netdump(int fd, void *bufptr, int cnt, ulong addr, physaddr_t paddr) "offset: %llx\n", (ulonglong)offset); return SEEK_ERROR; } - if (read(nd->ndfd, bufptr, cnt) != cnt) { + /* + * if --zero_excluded is specified and incomplete flag is set in ELF file, + * zero will be used to fill the lost part. + */ + ssize_t read_ret = read(nd->ndfd, bufptr, cnt); + if (read_ret != cnt) { + if (is_incomplete_dump() && (read_ret >= 0 && + *diskdump_flags & ZERO_EXCLUDED)) { + error(WARNING, "%s: data may be lost\n" + " offset:%llx\n\n", + pc->dumpfile, offset); Same thing here... if a user enters --zero_excluded on the command line, he should expect strange behavior as a result. Intermingling a bunch of WARNING messages like the one above will only make it more confusing. + bufptr += read_ret; + bzero(bufptr, cnt - read_ret); + return cnt; + } if (CRASHDEBUG(8)) fprintf(fp, "read_netdump: READ_ERROR: " "offset: %llx\n", (ulonglong)offset); Dave -- Crash-utility mailing list Crash-utility@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/crash-utility