Hi Dave,

On 12/17/12 11:23, Dave Anderson wrote:
>>> Right -- I would never expect error() to be called while inside
>>> an open_tmpfile() operation. Normally the behind-the-scenes data
>>> is parsed, and if anything is to be displayed while open_tmpfile()
>>> is still in play, it would be fprint()'ed using pc->saved_fp.
>>
>> I think the aesthetically pleasing solution is an "i_am_playing_with_tmpfile()"
>> call that says it isn't closed and crash functions shouldn't be using it.
>> Plus a parallel "i_am_done_with_tmpfile()" that gets implied by "close_tmpfile()".
>> I can supply a patch, if you like. Probably with less verbose function names.
>
> If pc->tmpfile is non-NULL, then open_tmpfile() is in use. What would be
> the purpose of the extra functions?

It would be to allow the client code that is processing that temp file
to emit warning/info messages without disrupting the reading of that
file pointer. To me, that doesn't seem unreasonable. You run some code
that emits output to a temp file and you reprocess those data. You
surely do not want such messages showing up in the file you are
re-processing. And you cannot call close_tmpfile() because it calls
ftruncate().

So, what is your recommendation for how to reprocess diverted output
wherein you might occasionally want to say something during that
reprocessing? Three solutions come to mind:

1. Juggle file pointers before and after the __error() function call
   (please say, "No.")

2. Create my own temporary file and fiddle the global "fp" and "pc"
   state so it gets used while I am gathering data and crash code
   doesn't know about it later. (I insist the answer must be, "No."
   because there is too much fiddling with intricate crash state.)

3.
   These two functions that I am suggesting:

    void
    resume_tmpfile(void)
    {
        int ret ATTRIBUTE_UNUSED;

        if (pc->tmpfile)
            error(FATAL, "recursive temporary file usage\n");
        if (!pc->tmp_fp)
            error(FATAL, "temporary file not ready\n");

        rewind(pc->tmp_fp);
        pc->tmpfile = pc->tmp_fp;
        pc->saved_fp = fp;
        fp = pc->tmpfile;
    }

    void
    sequester_tmpfile(void)
    {
        int ret ATTRIBUTE_UNUSED;

        if (pc->tmpfile) {
            fflush(pc->tmpfile);
            rewind(pc->tmpfile);
            pc->tmpfile = NULL;
            fp = pc->saved_fp;
        } else
            error(FATAL, "trying to sequester an unopened temporary file\n");
    }

I sequester the file after doing the data gathering and resume it
after I am done reprocessing it. It might be worth putting in a
little jig to ensure that open/close_tmpfile work reasonably, too.
(I would guess that either would cancel the sequestration.)

>>> I'm not sure, other than it doesn't seem to be able to find
>>> ffffea001bb1d1e8
>>
>> I was able to figure that out. I also printed out the "kmem -v" table and
>> sorted the result. The result with "kmem -n"
>>
>> [...]
>>     66  ffff88087fffa420  ffffea0000000000  ffffea0007380000  2162688
>>     67  ffff88087fffa430  ffffea0000000000  ffffea0007540000  2195456
>> 132608  ffff88083c9bdb98  ffff88083c9bdd98  ffff8840e49bdd98  4345298944
>> 132609  ffff88083c9bdba8  ffff88083c9796c0  ffff8840e4b396c0  4345331712
>> [...]
>>
>> viz. it ain't there. Which is quite interesting, because if the lustre
>> cluster file system structure "cfs_trace_data" actually pointed off into
>> unmapped memory, it would have fallen over long, long before the point
>> where it did fall over.
>
> I don't see the vmemmap range in the "kmem -v" output. It is mapped
> kernel memory, but AFAIK it's not kept in the kernel's "vmlist" list.
> Do you see that range in your "kmem -v" output?

Also no. "kmem -v" and "kmem -n" both show the same memory mappings
(as best as _my_ memory serves, that is. For certain, neither has
a mapping for 0xffffea001bb1d1e8.)

> OK so you say you cannot get the mappings for it, but what
> does "vtop 0xffffea001bb1d1e8" show?

This:

> crash> vtop 0xffffea001bb1d1e8
> VIRTUAL           PHYSICAL
> ffffea001bb1d1e8  879b1d1e8
>
> PML4 DIRECTORY: ffffffff817e7000
> PAGE DIRECTORY: 87fdf7067
>    PUD: 87fdf7000 => 87fdf6067
>    PMD: 87fdf66e8 => 8000000879a001e3
>   PAGE: 879a00000  (2MB)
>
>       PTE         PHYSICAL   FLAGS
> 8000000879a001e3  879a00000  (PRESENT|RW|ACCESSED|DIRTY|PSE|GLOBAL|NX)

But given:

> Sorry -- that's irrelevant. You want to access the physical
> memory that the odd vmemmap page address references (not the
> physical page behind the page structure itself).

Exactly right. I need to be able to see the binary bits for that page
so I can pull them in and write them back out to a file of just those
bits. From there, we'll be formatting a text file showing the lustre
trace log.

Thank you so much!

Regards, Bruce

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility