On Sun, 2006-04-16 at 17:06 +1000, Nick Piggin wrote:
> Steven Rostedt wrote:
> > On Sun, 16 Apr 2006, Nick Piggin wrote:
> >
> > > Why is your module using so much per-cpu memory, anyway?
> >
> > Wasn't my module anyway.  The problem appeared in the -rt patch set,
> > when tracing was turned on.  Some module was affected, and grew its
> > per_cpu size by quite a bit.  In fact we had to increase
> > PERCPU_ENOUGH_ROOM by up to something like 300K.
>
> Well that's easy then, just configure PERCPU_ENOUGH_ROOM to be larger
> when tracing is on in the -rt patchset?  Or use alloc_percpu for the
> tracing data?

Yeah, we already know this.  The -rt patch was what showed the
problem, not the reason I was writing these patches.

> > > I don't think it would have been hard for the original author to
> > > make it robust... just not both fast and robust.
> > > PERCPU_ENOUGH_ROOM seems like an ugly hack at first glance, but
> > > I'm fairly sure it was a result of design choices.
> >
> > Yeah, and I discovered the reasons for those choices as I worked on
> > this.  I've put a little more thought into it and still think
> > there's a solution that doesn't slow things down.
> >
> > Since the per_cpu_offset section is still smaller than
> > PERCPU_ENOUGH_ROOM, and robust, I could still copy it into a
> > per-cpu memory field, and even add the __per_cpu_offset to it.
> > This would still save quite a bit of space.
>
> Well I don't think making it per-cpu would help much (presumably it
> is not going to be written to very frequently) -- I guess it would
> be a small advantage on NUMA.  The main problem is the extra load in
> the fastpath.
>
> You can't start the next load until the results of the first come
> back.

Yep, you're right here, and it bothers me too that this slows down
performance.

> > So now I'm asking for advice on ideas that could work around this
> > and keep both the robustness and the speed.
> >
> > Is there a way (for archs that support it) to allocate memory in a
> > per-cpu manner, so each CPU would have its own variable table in
> > the memory that is best for it?  Then have a field (like the pda on
> > x86_64) to point to this section, and use the linker offsets to
> > index and find the per_cpu variables.
> >
> > So this solution still has one more indirection than the current
> > one (per_cpu_offset__##var -> __per_cpu_offset -> actual_var,
> > whereas the current solution is __per_cpu_offset -> actual_var),
> > but all the loads would be done from memory that is local to the
> > particular CPU.
> >
> > The generic case would still be the same as the patches I already
> > sent, but the archs that can support it can have something like
> > the above.
> >
> > Would something like that be acceptable?
>
> I still don't understand what the justification is for slowing down
> this critical bit of infrastructure for something that is only a
> problem in the -rt patchset, and even then only a problem when
> tracing is enabled.

It's because I'm anal retentive :-)  I noticed that the current
solution is somewhat of a hack, and thought maybe it could be done
more cleanly.  Perhaps I'm wrong and the hack _is_ the best solution,
but it doesn't hurt to try to improve it, or at the very least to
prove that the current solution is the way to go.

I'm not trying to solve an issue with the -rt patch and tracing; I'm
just trying to make Linux a little more efficient at saving space.
And you may be right that we can't do that without hurting
performance, in which case we keep things as they are.
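
Just to make the fastpath argument concrete, here is a rough sketch
of the two addressing schemes as I understand them (plain C;
per_cpu_cur, per_cpu_new and my_counter are made-up names for
illustration, not the real kernel macros):

/* Illustrative only -- not the actual kernel implementation. */
extern unsigned long __per_cpu_offset[];	/* one base per CPU */

/*
 * Current scheme: &var is a link-time constant, so an access is one
 * load for the base, then one dependent load for the data:
 *
 *	base = __per_cpu_offset[cpu];		(load 1)
 *	data = *(&var + base);			(load 2, depends on 1)
 */
#define per_cpu_cur(var, cpu) \
	(*(__typeof__(var) *)((char *)&(var) + __per_cpu_offset[cpu]))

/*
 * My scheme: the variable's offset itself sits in memory (filled in
 * at boot), so the data load has to wait on an extra load:
 *
 *	off  = per_cpu_offset__var;		(extra load)
 *	base = __per_cpu_offset[cpu];
 *	data = *(off + base);			(waits on both)
 */
extern unsigned long per_cpu_offset__my_counter;
#define per_cpu_new(var, cpu) \
	(*(__typeof__(var) *)(per_cpu_offset__##var + __per_cpu_offset[cpu]))

The per-cpu table idea above wouldn't remove that extra load, it would
just have it hit node-local memory, which is the only thing it has
going for it on NUMA.
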
But I don't want to give up without a fight and miss something that
could solve all this and keep Linux the best OS on the market!  (Not
that it isn't already, even with the current solution.)

-- Steve