----- Original Message -----
> Hi Dave,
>
> On 12/17/12 11:23, Dave Anderson wrote:
> >>> Right -- I would never expect error() to be called while inside
> >>> an open_tmpfile() operation. Normally the behind-the-scenes data
> >>> is parsed, and if anything is to be displayed while open_tmpfile()
> >>> is still in play, it would be fprint()'ed using pc->saved_fp.
> >>
> >> I think the aesthetically pleasing solution is an "i_am_playing_with_tmpfile()"
> >> call that says it isn't closed and crash functions shouldn't be using it.
> >> Plus a parallel "i_am_done_with_tmpfile()" that gets implied by "close_tmpfile()".
> >> I can supply a patch, if you like. Probably with less verbose function names.
> >
> > If pc->tmpfile is non-NULL, then open_tmpfile() is in use. What would be
> > the purpose of the extra functions?
>
> It would be to allow the client code that is processing that temp file to emit
> warning/info messages without disrupting the reading of that file pointer.
> To me, that doesn't seem unreasonable. You run some code that emits output
> to a temp file and you reprocess those data. You surely do not want such
> messages showing up in the file you are re-processing. And you cannot
> call close_tmpfile() because it calls ftruncate().
>
> So, what is your recommendation for how to reprocess diverted output
> wherein you might occasionally want to say something during that
> reprocessing?
>
> Three solutions come to mind:
>
> 1. Juggle file pointers before and after the __error() function call
>    (please say, "No.")

No.

> 2. Create my own temporary file and fiddle the global "fp" and "pc" state so it
>    gets used while I am gathering data and crash code doesn't know about it later.
>    (I insist the answer must be, "No." because there is too much fiddling with
>    intricate crash state.)

No.

> 3.
> These two functions that I am suggesting:
>
> void
> resume_tmpfile(void)
> {
>     int ret ATTRIBUTE_UNUSED;
>
>     if (pc->tmpfile)
>         error(FATAL, "recursive temporary file usage\n");
>
>     if (!pc->tmp_fp)
>         error(FATAL, "temporary file not ready\n");
>
>     rewind(pc->tmp_fp);
>     pc->tmpfile = pc->tmp_fp;
>     pc->saved_fp = fp;
>     fp = pc->tmpfile;
> }
>
> void
> sequester_tmpfile(void)
> {
>     int ret ATTRIBUTE_UNUSED;
>
>     if (pc->tmpfile) {
>         fflush(pc->tmpfile);
>         rewind(pc->tmpfile);
>         pc->tmpfile = NULL;
>         fp = pc->saved_fp;
>     } else
>         error(FATAL, "trying to sequester an unopened temporary file\n");
> }

And no...

When open_tmpfile() is in play and you want to print something, you
can always use fprintf(pc->saved_fp, ...) as is done everywhere now.

That being said, if you truly desire to use error() during an
open_tmpfile() operation, then that anomaly should be handled in the
error() function.

So, if error() is called during open_tmpfile(), then the message
should be displayed as it is done now, which is to pc->stdpipe (i.e.,
the current more/less scroller if it is in effect), or to stdout if not:

        if (pc->stdpipe) {
                fprintf(pc->stdpipe, "%s%s%s %s%s",
                        new_line ? "\n" : "",
                        type == CONT ? spacebuf : pc->curcmd,
                        type == CONT ? " " : ":",
                        type == WARNING ? "WARNING: " :
                        type == NOTE ? "NOTE: " : "",
                        buf);
                fflush(pc->stdpipe);
        } else {
                fprintf(stdout, "%s%s%s %s%s",
                        new_line || end_of_line ? "\n" : "",
                        type == WARNING ? "WARNING" :
                        type == NOTE ? "NOTE" :
                        type == CONT ? spacebuf : pc->curcmd,
                        type == CONT ? " " : ":",
                        buf, end_of_line ? "\n" : "");
                fflush(stdout);
        }

and if the output is currently being redirected to a file or to a pipe,
then it is also issued to those end-points here:

        if ((fp != stdout) && (fp != pc->stdpipe)) {
                fprintf(fp, "%s%s%s %s", new_line ? "\n" : "",
                        type == WARNING ? "WARNING" :
                        type == NOTE ? "NOTE" :
                        type == CONT ? spacebuf : pc->curcmd,
                        type == CONT ? " " : ":",
                        buf);
                fflush(fp);
        }

It's that "duplication" above that you're seeing.
And I am simply suggesting that the if statement above should be:

        if ((fp != stdout) && (fp != pc->stdpipe) && (fp != pc->tmpfile)) {

because you obviously don't want the message intermingled with your
open_tmpfile() output.

>
> I sequester the file after doing the data gathering and resume it
> after I am done reprocessing it. It might be worth putting in a little jig
> to ensure that open/close_tmpfile work reasonably, too. (I would guess
> that either would cancel the sequestration.)
>
> >>> I'm not sure, other than it doesn't seem to be able to find ffffea001bb1d1e8
> >>
> >> I was able to figure that out. I also printed out the "kmem -v" table and
> >> sorted the result. The result with "kmem -n"
> >>
> >> [...]
> >>     66  ffff88087fffa420  ffffea0000000000  ffffea0007380000  2162688
> >>     67  ffff88087fffa430  ffffea0000000000  ffffea0007540000  2195456
> >> 132608  ffff88083c9bdb98  ffff88083c9bdd98  ffff8840e49bdd98  4345298944
> >> 132609  ffff88083c9bdba8  ffff88083c9796c0  ffff8840e4b396c0  4345331712
> >> [...]
> >>
> >> viz. it ain't there. Which is quite interesting, because if the lustre
> >> cluster file system structure "cfs_trace_data" actually pointed off into
> >> unmapped memory, it would have fallen over long, long before the point
> >> where it did fall over.
> >
> > I don't see the vmemmap range in the "kmem -v" output. It is mapped
> > kernel memory, but AFAIK it's not kept in the kernel's "vmlist" list.
> > Do you see that range in your "kmem -v" output?
>
> Also no. "kmem -v" and "kmem -n" both show the same memory mappings
> (as best as _my_ memory serves, that is. For certain, neither has a mapping
> for 0xffffea001bb1d1e8.)
>
> > OK so you say you cannot get the mappings for it, but what
> > does "vtop 0xffffea001bb1d1e8" show?
>
> This:
>
> > crash> vtop 0xffffea001bb1d1e8
> > VIRTUAL           PHYSICAL
> > ffffea001bb1d1e8  879b1d1e8
> >
> > PML4 DIRECTORY: ffffffff817e7000
> > PAGE DIRECTORY: 87fdf7067
> >    PUD: 87fdf7000 => 87fdf6067
> >    PMD: 87fdf66e8 => 8000000879a001e3
> >   PAGE: 879a00000  (2MB)
> >
> >       PTE         PHYSICAL   FLAGS
> > 8000000879a001e3  879a00000  (PRESENT|RW|ACCESSED|DIRTY|PSE|GLOBAL|NX)
>
> But given:
>
> > Sorry -- that's irrelevant. You want to access the physical
> > memory that the odd vmemmap page address references (not the
> > physical page behind the page structure itself).
>
> Exactly right. I need to be able to see the binary bits for that page so I can
> pull them in and write them back out to a file of just those bits. From there,
> we'll be formatting a text file showing the lustre trace log.
>
> Thank you so much! Regards, Bruce

Right... seems like it should be such a simple thing to do... :-(

I don't understand what's going on, but I'm presuming that even if
the vmemmap-type address doesn't fit into the "advertised" vmemmap
range, that the kernel's __page_to_pfn() macro should still work
to get the pfn represented by the page:

  #elif defined(CONFIG_SPARSEMEM)
  /*
   * Note: section's mem_map is encorded to reflect its start_pfn.
   * section[i].section_mem_map == mem_map's address - start_pfn;
   */
  #define __page_to_pfn(pg)                                       \
  ({      const struct page *__pg = (pg);                         \
          int __sec = page_to_section(__pg);                      \
          (unsigned long)(__pg - __section_mem_map_addr(__nr_to_section(__sec))); \
  })

Maybe you could play around with emulating that macro w/crash, and
see what comes up?

Dave

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility
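
For anyone wanting to try the emulation suggested above, here is a minimal
sketch of the arithmetic behind the CONFIG_SPARSEMEM __page_to_pfn() macro,
done on plain 64-bit values the way they would be read out of a dump. It is
a toy model, not crash or kernel code: the helper names, the 56-byte
struct page size, and the two low flag bits masked off the encoded
section_mem_map value are all assumptions that would need to be checked
against the kernel in question.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: sizeof(struct page) on the target kernel. */
#define PAGE_STRUCT_SIZE  56ULL
/* Assumption: the two low bits of section_mem_map hold flags. */
#define SECTION_MAP_MASK  (~3ULL)

/*
 * Hypothetical helper: build the encoded value the kernel comment
 * describes, i.e. "mem_map's address - start_pfn" in units of
 * struct page, with flag bits OR'ed into the low bits.
 */
static uint64_t encode_section_mem_map(uint64_t mem_map, uint64_t start_pfn,
                                       uint64_t flags)
{
    /* The flag trick only works if the low bits are otherwise zero. */
    assert((mem_map & ~SECTION_MAP_MASK) == 0);
    assert((flags & SECTION_MAP_MASK) == 0);
    return (mem_map - start_pfn * PAGE_STRUCT_SIZE) | flags;
}

/*
 * Hypothetical helper: given a struct page address and the encoded
 * section_mem_map of the section it belongs to, recover the pfn.
 * This mirrors the pointer subtraction in __page_to_pfn().
 */
static uint64_t page_to_pfn(uint64_t page, uint64_t section_mem_map)
{
    uint64_t base = section_mem_map & SECTION_MAP_MASK; /* strip flags */
    return (page - base) / PAGE_STRUCT_SIZE;
}
```

With a made-up section whose mem_map starts at 0xffff880100000000 and whose
first pfn is 0x8000, the sixth page structure in that section decodes back to
pfn 0x8005. Against a real dump, the section_mem_map value and the page's
section number would have to be read from the mem_section data rather than
constructed.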