Re: [RFC][PATCH 0/7] re-shrink 'struct page' when SLUB is on.

Dave Hansen <dave@xxxxxxxx> · Wed, 18 Dec 2013 16:24:15 -0800

On 12/17/2013 07:17 AM, Christoph Lameter wrote:
> On Mon, 16 Dec 2013, Dave Hansen wrote:
> 
>> I'll do some testing and see if I can coax out any delta from the
>> optimization myself.  Christoph went to a lot of trouble to put this
>> together, so I assumed that he had a really good reason, although the
>> changelogs don't really mention any.
> 
> The cmpxchg on the struct page avoids disabling interrupts etc and
> therefore simplifies the code significantly.
> 
>> I honestly can't imagine that a cmpxchg16 is going to be *THAT* much
>> cheaper than a per-page spinlock.  The contended case of the cmpxchg is
>> way more expensive than spinlock contention for sure.
> 
> Make sure slub does not set __CMPXCHG_DOUBLE in the kmem_cache flags
> and it will fall back to spinlocks if you want to do a comparison. Most
> non x86 arches will use that fallback code.

I did four tests.  The first workload allocates a bunch of objects and
then frees them all.  I ran it with both the cmpxchg-enabled 64-byte
'struct page' and the 48-byte one that is supposed to use a spinlock,
and confirmed the 'struct page' size in both cases by looking at dmesg.

Essentially, I see no worthwhile benefit from using the double-cmpxchg
over the spinlock.  In fact, the increased cache footprint makes it
*substantially* worse when doing a tight loop.
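
To put rough numbers on the footprint difference, here is a minimal
sketch.  It assumes 4 KiB base pages and 64-byte cache lines, neither
of which is stated above, and the helper names are made up for
illustration:

```c
#include <assert.h>

/* Assumptions for illustration only: 4 KiB pages, 64-byte cache lines. */
#define PAGE_SIZE_BYTES   4096ULL
#define CACHE_LINE_BYTES  64ULL

/* Bytes of 'struct page' metadata needed to describe mem_bytes of RAM. */
static unsigned long long memmap_bytes(unsigned long long mem_bytes,
				       unsigned long long struct_page_size)
{
	assert(struct_page_size > 0);
	return (mem_bytes / PAGE_SIZE_BYTES) * struct_page_size;
}

/*
 * Cache lines covered by n consecutive 'struct page' entries, assuming
 * the first entry starts on a cache-line boundary.
 */
static unsigned long long cache_lines_for(unsigned long long n,
					  unsigned long long struct_page_size)
{
	assert(struct_page_size > 0);
	return (n * struct_page_size + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}
```

Under those assumptions, 1 GiB of RAM needs 16 MiB of 'struct page'
at 64 bytes each versus 12 MiB at 48 bytes, and four consecutive
entries span four cache lines instead of three.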

Unless somebody can find some holes in this, I think we have no choice
but to unset the HAVE_ALIGNED_STRUCT_PAGE config option and revert the
use of the cmpxchg, at least for now.

Kernel config:
https://www.sr71.net/~dave/intel/config-20131218-structpagesize
System was an 80-core "Westmere" Xeon

I suspect that the original data:

> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8a5ec0b

are invalid because those measurements were not taken with the
increased 'struct page' padding.

---------------------------

First test:

	for (i = 0; i < kmalloc_iterations; i++)
		gunk[i] = kmalloc(kmalloc_size, GFP_KERNEL);
	for (i = 0; i < kmalloc_iterations; i++)
		kfree(gunk[i]);

All units are in nanoseconds; lower is better.

		size of 'struct page':
kmalloc size	64-byte	48-byte
8		98.2	105.7
32		123.7	125.8
128		293.9	289.9
256		572.4	577.9
1024		621.0	639.3
4096		733.3	746.7
8192		968.3	948.6

As you can see, it's mostly a wash.  The 64-byte one looks to have a
~8ns advantage at the 8-byte kmalloc size, but any advantage disappears
into the noise on the other sizes.
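
For anyone who wants to check the deltas, here is a small sketch that
computes them from the table above.  The struct and function names are
invented for illustration; only the numbers come from my runs:

```c
#include <assert.h>
#include <stddef.h>

/* One row of the first-test table; times are in nanoseconds. */
struct row {
	int kmalloc_size;
	double ns_64;	/* 64-byte 'struct page' (cmpxchg) */
	double ns_48;	/* 48-byte 'struct page' (spinlock) */
};

static const struct row test1[] = {
	{    8,  98.2, 105.7 },
	{   32, 123.7, 125.8 },
	{  128, 293.9, 289.9 },
	{  256, 572.4, 577.9 },
	{ 1024, 621.0, 639.3 },
	{ 4096, 733.3, 746.7 },
	{ 8192, 968.3, 948.6 },
};

/* Positive result: the 48-byte configuration was slower by that many ns. */
static double delta_ns(const struct row *r)
{
	assert(r != NULL);
	return r->ns_48 - r->ns_64;
}

/* Same difference, as a percentage of the 64-byte time. */
static double delta_pct(const struct row *r)
{
	assert(r != NULL);
	return 100.0 * (r->ns_48 - r->ns_64) / r->ns_64;
}
```

The 8-byte row gives a delta of about 7.5ns in favor of the 64-byte
'struct page', while the 128-byte and 8192-byte rows come out negative,
meaning the 48-byte one was faster there.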

---------------------------

The second test used the same 'struct page' sizes, but instead did a
kmalloc() immediately followed by a kfree():

	for (i = 0; i < kmalloc_iterations; i++) {
		gunk[i] = kmalloc(kmalloc_size, GFP_KERNEL);
		kfree(gunk[i]);
	}

		size of 'struct page':
kmalloc size	64-byte	48-byte
8		58.6	43.0
32		59.3	43.0
128		59.4	43.2
256		57.4	42.8
1024		80.4	43.0
4096		76.0	43.8
8192		79.9	43.0
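
For reference, the two loop shapes can be sketched in user space as
below.  This is only an analogue and not the kernel test code: it times
malloc()/free() rather than kmalloc()/kfree(), so its numbers say
nothing about SLUB, and the helper names (now_ns, bench_batch,
bench_pairs) are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Keeps the compiler from optimizing away malloc()/free() pairs. */
static void *volatile sink;

static double now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (double)ts.tv_sec * 1e9 + (double)ts.tv_nsec;
}

/*
 * First pattern: allocate everything, then free everything.
 * Returns average ns per alloc+free pair, or -1.0 if the pointer
 * array itself could not be allocated.
 */
static double bench_batch(size_t iterations, size_t size)
{
	void **gunk;
	double start, end;
	size_t i;

	assert(iterations > 0);
	gunk = calloc(iterations, sizeof(*gunk));
	if (!gunk)
		return -1.0;

	start = now_ns();
	for (i = 0; i < iterations; i++)
		gunk[i] = malloc(size);
	for (i = 0; i < iterations; i++)
		free(gunk[i]);	/* free(NULL) is a no-op */
	end = now_ns();

	free(gunk);
	return (end - start) / (double)iterations;
}

/* Second pattern: each allocation is freed immediately. */
static double bench_pairs(size_t iterations, size_t size)
{
	double start, end;
	size_t i;

	assert(iterations > 0);
	start = now_ns();
	for (i = 0; i < iterations; i++) {
		void *p = malloc(size);

		sink = p;
		free(p);
	}
	end = now_ns();

	return (end - start) / (double)iterations;
}
```

Calling, say, bench_batch(100000, 128) and bench_pairs(100000, 128)
gives an average cost per alloc/free pair for each pattern.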
