[LSF/MM ATTEND] 2016: Requests to attend MM-summit

On Tue, 12 Jan 2016 11:05:45 -0500 "Martin K. Petersen" <martin.petersen@xxxxxxxxxx> wrote:

> The annual Linux Storage, Filesystem and Memory Management Summit for
> 2016 will be held on April 18th and 19th at the Raleigh Marriott City
> center, Raleigh, NC.
> 
[...]
> 
> 2) Requests to attend the summit should be sent to:
> 
> 	lsf-pc@xxxxxxxxxxxxxxxxxxxxxxxxxx
> 
> Please summarise what expertise you will bring to the meeting, and what
> you would like to discuss. Please also tag your email with [LSF/MM
> ATTEND] so there is less chance of it getting lost.

Hi committee,

I would like to participate in LSF/MM.  

Over the last year I have optimized the SLAB and SLUB allocators,
specifically by introducing a bulking API.  This work is almost
complete, but I have some more ideas in the MM area that I would like
to discuss with people.

Specifically I have the following ideas:

1. Speed up *SLUB* by approx. 10-20% by using per-CPU detached
   freelists for all types of allocation/free.
 * I actually have a proof-of-concept implementation that showed a 20% speedup
 * The idea is that every page (used by SLUB) gets a detached freelist
 * The first CPU that allocates the page owns this detached freelist
 * The CPU owning the page can do sync-free operations on this freelist
 * SLUB is already highly biased towards keeping objects on the same CPU
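
The ownership rule in idea 1 can be illustrated with a small userspace
model. This is only a sketch under stated assumptions, not the
proof-of-concept mentioned above: `struct slab_page`, `page_alloc()` and
`page_free()` are hypothetical names, CPUs are plain integers, and the
model is single-threaded, so the "shared" list only marks where a real
implementation would need atomics or locking.

```c
#include <assert.h>
#include <stddef.h>

#define NO_OWNER (-1)

/* A free object stores the link to the next free object in its own memory. */
struct object {
    struct object *next;
};

/* Hypothetical per-page state; this is not the kernel's struct page. */
struct slab_page {
    struct object *shared_freelist;   /* non-owner CPUs: would need sync */
    struct object *detached_freelist; /* owner CPU only: no sync needed */
    int owner_cpu;                    /* first CPU that allocated here */
};

/* Put all n objects on the shared list; the page has no owner yet. */
void page_init(struct slab_page *pg, struct object *objs, size_t n)
{
    pg->shared_freelist = NULL;
    pg->detached_freelist = NULL;
    pg->owner_cpu = NO_OWNER;
    for (size_t i = 0; i < n; i++) {
        objs[i].next = pg->shared_freelist;
        pg->shared_freelist = &objs[i];
    }
}

/* The first allocating CPU becomes owner and detaches the whole freelist. */
struct object *page_alloc(struct slab_page *pg, int cpu)
{
    if (pg->owner_cpu == NO_OWNER) {
        pg->owner_cpu = cpu;
        pg->detached_freelist = pg->shared_freelist;
        pg->shared_freelist = NULL;
    }

    struct object **list = (cpu == pg->owner_cpu)
        ? &pg->detached_freelist : &pg->shared_freelist;
    struct object *obj = *list;

    if (obj)
        *list = obj->next;
    return obj;
}

/* Returns 1 if the free took the owner-only path, 0 otherwise. */
int page_free(struct slab_page *pg, struct object *obj, int cpu)
{
    int owner_path = (cpu == pg->owner_cpu);
    struct object **list = owner_path
        ? &pg->detached_freelist : &pg->shared_freelist;

    obj->next = *list;
    *list = obj;
    return owner_path;
}
```

In this model a free from the owning CPU never touches the shared list,
which is the property the detached freelist is meant to provide.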

2. Bulk alloc without disabling IRQs (SLUB)
 * This is something Real-Time (RT) people will be screaming for,
   once more users of the bulk API start to appear.
 * I think it is doable, but it is also very challenging to maintain performance

3. Faster memset clearing of memory in SLUB
 * Currently the netstack clears SKBs right after alloc (2-3% in perf)
 * In the SLUB allocator we could clear a larger section of memory
   at once, which is significantly faster.
 * Bulk alloc would be the right spot
 * The difficult part is inventing an algorithm for matching contiguous
   memory that is fast enough, as the est. time budget is 15-20 cycles.
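
To make the matching problem in idea 3 concrete, here is a naive
userspace sketch. It assumes the bulk allocation hands back an array of
object pointers and that neighbouring objects sit exactly `obj_size`
bytes apart (no per-object metadata or padding); `clear_bulk()` is a
hypothetical helper. It only merges objects that are adjacent both in
memory and in array order, and makes no claim to fit the 15-20 cycle
budget.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Zero nr objects of obj_size bytes each, using one memset() per run of
 * objects that are contiguous in memory and consecutive in the array.
 * Returns the number of memset() calls issued.
 */
size_t clear_bulk(void **p, size_t nr, size_t obj_size)
{
    size_t calls = 0;
    size_t i = 0;

    while (i < nr) {
        char *start = p[i];
        size_t run = 1;

        /* Extend the run while the next object starts where this run ends. */
        while (i + run < nr &&
               (char *)p[i + run] == start + run * obj_size)
            run++;

        memset(start, 0, run * obj_size);
        calls++;
        i += run;
    }
    return calls;
}
```

For four 16-byte objects of which the first three are adjacent, this
issues two memset() calls instead of four.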

4. Bulk free from RCU context
 * One major slowdown of using RCU free is that the free will always hit
   the SLUB slowpath.  We could change this via the bulk free API.
 * This would be a major benefit for overall kernel performance.
 * The challenge here is getting to know the RCU free code well enough
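
The batching behind idea 4 can be sketched in userspace as follows. This
is a model only: it contains no real RCU, `struct free_batch`,
`batch_add()` and `batch_flush()` are hypothetical names, and
`bulk_free()` is a malloc/free stand-in for a bulk free call such as the
kernel's kmem_cache_free_bulk(). The flush is assumed to run once the
grace period for the queued objects has ended.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BATCH_MAX 16

/* Hypothetical batch of objects waiting for the same grace period. */
struct free_batch {
    size_t nr;
    void *objs[BATCH_MAX];
};

size_t bulk_free_calls; /* counts bulk calls, for illustration only */

/* Stand-in for a bulk free API: one call releases the whole array. */
void bulk_free(size_t nr, void **p)
{
    bulk_free_calls++;
    for (size_t i = 0; i < nr; i++)
        free(p[i]);
}

/* Queue an object for deferred freeing; returns -1 if the batch is full. */
int batch_add(struct free_batch *b, void *obj)
{
    if (b->nr == BATCH_MAX)
        return -1;
    b->objs[b->nr++] = obj;
    return 0;
}

/* Assumed to run after the grace period: free everything in one call. */
void batch_flush(struct free_batch *b)
{
    if (b->nr == 0)
        return;
    bulk_free(b->nr, b->objs);
    b->nr = 0;
}
```

Five queued objects then cost one bulk call at flush time rather than
five individual frees.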

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .


