Re: [PATCH 00/40] Memory allocation profiling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 3 May 2023 13:14:57 -0700
Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:

> On Wed, May 3, 2023 at 1:00 PM Tejun Heo <tj@xxxxxxxxxx> wrote:
> >
> > Hello,
> >
> > On Wed, May 03, 2023 at 09:48:55AM -1000, Tejun Heo wrote:  
> > > > If so, that's the idea behind the context capture feature so that we
> > > > can enable it on specific allocations only after we determine there is
> > > > something interesting there. So, with low-cost persistent tracking we
> > > > can determine the suspects and then pay some more to investigate those
> > > > suspects in more detail.  
> > >
> > > Yeah, I was wondering whether it'd be useful to have that configurable so
> > > that it'd be possible for a user to say "I'm okay with the cost, please
> > > track more context per allocation". Given that tracking the immediate caller
> > > is already a huge improvement and narrowing it down from there using
> > > existing tools shouldn't be that difficult, I don't think this is a blocker
> > > in any way. It just bothers me a bit that the code is structured so that
> > > source line is the main abstraction.  
> >
> > Another related question. So, the reason for macro'ing stuff is needed is
> > because you want to print the line directly from kernel, right?  
> 
> The main reason is because we want to inject a code tag at the
> location of the call. If we have a code tag injected at every
> allocation call, then finding the allocation counter (code tag) to
> operate takes no time.

Another consequence is that each source code location gets its own tag.
The compiler can no longer apply common subexpression elimination
(because the tag is different). I have some doubts that there are any
places where CSE could be applied to allocation calls, but in general,
this is one more difference to using _RET_IP_.

Petr T

> > Is that
> > really necessary? Values from __builtin_return_address() can easily be
> > printed out as function+offset from kernel which already gives most of the
> > necessary information for triaging and mapping that back to source line from
> > userspace isn't difficult. Wouldn't using __builtin_return_address() make
> > the whole thing a lot simpler?  
> 
> If we do that we have to associate that address with the allocation
> counter at runtime on the first allocation and look it up on all
> following allocations. That introduces the overhead which we are
> trying to avoid by using macros.
> 
> >
> > Thanks.
> >
> > --
> > tejun  
> 





[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux