a case for a destructor for slub: mm_struct

I'm looking for someone(tm) willing to implement a destructor for slub.

Currently SLUB only supports a constructor, a callback to use when
first creating an object, but there is no matching callback for
getting rid of it.

The pair would come in handy when a frequently allocated and freed
object performs the same expensive work each time.
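To illustrate the idea, here is a minimal userspace sketch. It is not SLUB code: the names (struct obj, cache_alloc, cache_free, cache_shrink, run_cycles) are all made up for the example, and the ctor returning an error code is an assumption of the sketch (the existing SLUB ctor is a void (*)(void *) and cannot fail). The point is only that the expensive setup runs when the cache first creates an object and the teardown runs when the cache releases the memory, not on every alloc/free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an object with costly per-object state. */
struct obj {
    int *counters;      /* stands in for percpu memory */
    struct obj *next;   /* free-list link while the object is idle */
};

static int expensive_setups;    /* how many times the costly setup ran */
static int expensive_teardowns; /* how many times the costly teardown ran */
static struct obj *free_list;   /* idle objects, still in constructed state */

/* ctor: runs once, when the cache first creates the object. */
static int obj_ctor(struct obj *o)
{
    o->counters = calloc(4, sizeof(*o->counters));
    if (!o->counters)
        return -1;
    expensive_setups++;
    return 0;
}

/* dtor: runs once, when the cache gives the memory back. */
static void obj_dtor(struct obj *o)
{
    free(o->counters);
    expensive_teardowns++;
}

static struct obj *cache_alloc(void)
{
    struct obj *o = free_list;

    if (o) {                    /* reuse: constructed state is intact */
        free_list = o->next;
        return o;
    }
    o = malloc(sizeof(*o));
    if (!o)
        return NULL;
    if (obj_ctor(o)) {
        free(o);
        return NULL;
    }
    return o;
}

/* The caller must hand the object back in its constructed state. */
static void cache_free(struct obj *o)
{
    o->next = free_list;
    free_list = o;
}

/* Reclaim path: the only place where the dtor is invoked. */
static void cache_shrink(void)
{
    while (free_list) {
        struct obj *o = free_list;

        free_list = o->next;
        obj_dtor(o);
        free(o);
    }
}

/* Allocate and free n times; returns the number of ctor calls so far. */
static int run_cycles(int n)
{
    for (int i = 0; i < n; i++) {
        struct obj *o = cache_alloc();

        if (!o)
            return -1;
        cache_free(o);
    }
    return expensive_setups;
}
```

In this sketch any number of back-to-back alloc/free cycles pays for the setup once, and the teardown cost is deferred until cache_shrink() runs.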

The specific usage I have in mind is mm_struct -- it gets allocated on
each fork and each exec and suffers global serialization several
times.

The primary thing I'm looking to handle this way is cid and percpu
counter allocation, both going down to the percpu allocator, which
only has a global lock. The problem is exacerbated as it happens
back-to-back, so that's 4 acquires per lifetime cycle (alloc and
free).

There is other expensive work which can also be handled this way.

I recognize something like this would pose a tradeoff in terms of
memory usage, but I don't believe it's a big deal. If you have a
mm_struct hanging out, you are going to need to have the percpu memory
up for grabs to make any use of it anyway. Granted, there may be
spurious mm_structs hanging out and eating pcpu resources. Something
can be added to let the pcpu allocator reclaim those.

So that's it for making the case. As for the APIs, I think it would be
best if both dtor and ctor accepted a batch of objects to operate on,
but that's a lot of extra churn due to pre-existing ctor users.
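For the sake of discussion, a batched pair could look roughly like the sketch below. This is userspace code with made-up names (batch_ctor_fn, batch_dtor_fn, struct item, demo_batch), not an existing or proposed SLUB interface, and the counter merely stands in for a contended global lock. It only shows why batching helps: the serialized part is paid once per batch rather than once per object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical batched callback shapes; not an existing SLUB interface. */
typedef int  (*batch_ctor_fn)(void **objs, size_t nr);
typedef void (*batch_dtor_fn)(void **objs, size_t nr);

struct item {
    int ready;
};

static int lock_acquisitions;   /* stands in for a contended global lock */

static int item_ctor_batch(void **objs, size_t nr)
{
    lock_acquisitions++;        /* taken once for the whole batch */
    for (size_t i = 0; i < nr; i++)
        ((struct item *)objs[i])->ready = 1;
    return 0;
}

static void item_dtor_batch(void **objs, size_t nr)
{
    lock_acquisitions++;        /* again once for the whole batch */
    for (size_t i = 0; i < nr; i++)
        ((struct item *)objs[i])->ready = 0;
}

/* Construct and then destroy up to 16 items; returns lock acquisitions. */
static int demo_batch(size_t nr)
{
    struct item items[16];
    void *ptrs[16];
    batch_ctor_fn ctor = item_ctor_batch;
    batch_dtor_fn dtor = item_dtor_batch;

    if (nr > 16)
        nr = 16;
    for (size_t i = 0; i < nr; i++)
        ptrs[i] = &items[i];

    lock_acquisitions = 0;
    if (ctor(ptrs, nr))
        return -1;
    for (size_t i = 0; i < nr; i++)
        if (!items[i].ready)
            return -1;
    dtor(ptrs, nr);
    return lock_acquisitions;
}
```

Here a batch of 16 costs the same two acquisitions as a batch of 1.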

ACHTUNG: I think this particular usage would still want some buy-in
from the mm folk and at least Dennis (the percpu allocator
maintainer), but one has to start somewhere. There were 2 different
patchsets posted to move rss counters away from the current pcpu
scheme, but both had different tradeoffs and ultimately died off.

Should someone(tm) commit to sorting this out, I'll handle the percpu
thing. There are some other tweaks warranted here (e.g., depessimizing
the rss counter validation loop at exit).

So what do you think?

To benchmark this yourself, you can grab code from here:
http://apollo.backplane.com/DFlyMisc/doexec.c

$ cc -static -O2 -o static-doexec doexec.c
$ ./static-doexec $(nproc)

I check spinlock problems with:

$ bpftrace -e 'kprobe:__pv_queued_spin_lock_slowpath { @[kstack()] = count(); }'

-- 
Mateusz Guzik <mjguzik gmail.com>



