Re: kmem: WARNING: cannot find mem_map page for address

----- Original Message -----
> On 12/17/12 09:03, Dave Anderson wrote:
> >> That also removed the mysterious problem of having a duplicate
> >> error
> >> message show up on the console.
> > 
> > OK, if that works for you...
> 
> In the sense of being able to do my day job, yes.
> In the sense of aesthetics, not so much.
> 
> > Right -- I would never expect error() to be called while inside
> > an open_tmpfile() operation.  Normally the behind-the-scenes data
> > is parsed, and if anything is to be displayed while open_tmpfile()
> > is still in play, it would be fprint()'ed using pc->saved_fp.
> 
> I think the aesthetically pleasing solution is an "i_am_playing_with_tmpfile()"
> call that says it isn't closed and crash functions shouldn't be using it.
> Plus a parallel "i_am_done_with_tmpfile()" that gets implied by "close_tmpfile()".
> I can supply a patch, if you like.  Probably with less verbose function names.

If pc->tmpfile is non-NULL, then open_tmpfile() is in use.  What would be
the purpose of the extra functions?
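
That said, if it helps, here is a bare-bones sketch of the check I mean
(the helper name is hypothetical, it's not something in the crash sources):
test pc->tmpfile, and if it's set, write to pc->saved_fp instead of
calling error():

/* Sketch only: if pc->tmpfile is non-NULL an open_tmpfile() is still
 * in progress, so write to the saved file pointer rather than calling
 * error(), which would land in the tmpfile. */
#include "defs.h"    /* crash master header: pc, error(), INFO */

static void
report_safely(const char *msg)
{
        if (pc->tmpfile)                        /* open_tmpfile() in play */
                fprintf(pc->saved_fp, "%s\n", msg);
        else
                error(INFO, "%s\n", msg);
}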

> 
> >>> crash> gdb set $tp = (struct cfs_trace_page *)0xffff8807fb590740
> >>> crash> p $tp->page
> >>> $7 = (cfs_page_t *) 0xffffea001bb1d1e8
> >>> crash> p *$tp->page
> >>> $8 = {
> >>>   flags = 144115188075855872,
> >>>   _count = {
> >>>     counter = 1
> >>>   },
> >>> [...]
> >>>   lru = {
> >>>     next = 0xdead000000100100,
> >>>     prev = 0xdead000000200200
> >>>   }
> >>> }
> >>> crash> kmem 0xffffea001bb1d1e8
> >>> kmem: WARNING: cannot find mem_map page for address:
> >>> ffffea001bb1d1e8
> >>> 879b1d1e8: kernel virtual address not found in mem map
> >>
> >> So I can print out the page_t structure (renamed as cfs_page_t in
> >> Lustre)
> >> at address 0xffff8807fb590740, but when I try to get kmem
> >> information about
> >> it, it cannot find the  page.  What am I missing?
> >>
> >> Thanks for hints/pointers!  Regards, Bruce
> > 
> > I'm not sure, other than it doesn't seem to be able to find
> > ffffea001bb1d1e8
> 
> I was able to figure that out.  I also printed out the "kmem -v" table and
> sorted the result.  The result with "kmem -n"
> 
> [...]
> 66  ffff88087fffa420  ffffea0000000000  ffffea0007380000  2162688
> 67  ffff88087fffa430  ffffea0000000000  ffffea0007540000  2195456
> 132608  ffff88083c9bdb98  ffff88083c9bdd98  ffff8840e49bdd98  4345298944
> 132609  ffff88083c9bdba8  ffff88083c9796c0  ffff8840e4b396c0  4345331712
> ;...]
> 
> viz. it ain't there.  Which is quite interesting, because if the lustre
> cluster file system structure "cfs_trace_data" actually pointed off into
> unmapped memory, it would have fallen over long, long before the point
> where it did fall over.

I don't see the vmemmap range in the "kmem -v" output.  It is mapped
kernel memory, but AFAIK it's not kept in the kernel's "vmlist" list.
Do you see that range in your "kmem -v" output? 
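
For reference, the ffffea0000000000 region in your "kmem -n" output is the
virtual memmap: a flat array of page structures, so a page address in it is
just VMEMMAP_START + pfn * sizeof(struct page).  A quick sketch of that
arithmetic (the struct page size is kernel-dependent, so the 56 below is an
assumption to verify against your vmlinux, e.g. with "struct page" sizing
in crash):

/* Sketch only: relation between a vmemmap address and its pfn.
 * VMEMMAP_START is taken from the kmem -n output above; the struct
 * page size is an assumption and varies by kernel/config. */
#include <stdio.h>
#include <stdint.h>

#define VMEMMAP_START   0xffffea0000000000ULL
#define PAGE_STRUCT_SZ  56ULL                  /* sizeof(struct page), check your kernel */
#define PAGE_SHIFT      12

int main(void)
{
        uint64_t page = 0xffffea001bb1d1e8ULL; /* the address in question */
        uint64_t pfn  = (page - VMEMMAP_START) / PAGE_STRUCT_SZ;

        printf("pfn 0x%llx -> physical 0x%llx\n",
               (unsigned long long)pfn,
               (unsigned long long)(pfn << PAGE_SHIFT));
        return 0;
}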

> >>>> ffff8807fb590740
> >>>> struct cfs_trace_page {
> >>>>  page = 0xffffea001bb1d1e8,  <<<<==== address in question
> >>>>  linkage = {
> >>>>    next = 0xffff8807fb590ee8,
> >>>>    prev = 0xffff880824e3d810
> >>>>  },
> 
> It seems like it is both there and not there, so I am misunderstanding something.
> For sure.
> 
> > crash> whatis $tp->page
> > cfs_page_t *
> > crash> p $tp->page
> > $8 = (cfs_page_t *) 0xffffea001bb1d1e8
> > crash> p *$tp->page
> > $9 = {
> >   flags = 0x200000000000000,
> >   _count = {
> >     counter = 1
> >   },
> >   {
> >     _mapcount = {
> >       counter = -1
> >     },
> >     {
> >       inuse = 65535,
> >       objects = 65535
> >     }
> >   },
> >   {
> >     {
> >       private = 0,
> >       mapping = 0x0
> >     },
> >     ptl = {
> >       {
> >         rlock = {
> >           raw_lock = {
> >             slock = 0
> >           }
> >         }
> >       }
> >     },
> >     slab = 0x0,
> >     first_page = 0x0
> >   },
> >   {
> >     index = 0,
> >     freelist = 0x0,
> >     pfmemalloc = false
> >   },
> >   lru = {
> >     next = 0xdead000000100100,
> >     prev = 0xdead000000200200
> >   }
> > }
> 
> So clearly, I am able to read cfs_page_t data at that address.
> But I cannot get the mappings for it, and neither can my lustre
> extensions.  (We are trying to extract trace data from an in-kernel
> circular buffer that is 5,100 pages in size (20 Meg).)
> 
> Thank you for any help at all!  Regards, Bruce

OK, so you say you cannot get the mappings for it, but what
does "vtop 0xffffea001bb1d1e8" show?

Dave

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility

