Re: [PATCH 0/4] zsmalloc improvements

Konrad Rzeszutek Wilk <konrad@xxxxxxxxxx> · Wed, 11 Jul 2012 15:42:06 -0400

>>> Which architecture was this under? It sounds x86-ish? Is this on
>>> Westmere and more modern machines? What about Core2 architecture?
>>>
>>> Oh how did it work on AMD Phenom boxes?
>>
>> I don't have a Phenom box but I have an Athlon X2 I can try out.
>> I'll get this information next Monday.
>
> Actually, I'm running some production stuff on that box, so
> I rather not put testing stuff on it.  Is there any
> particular reason that you wanted this information? Do you
> have a reason to believe that mapping will be faster than
> copy for AMD procs?

Sorry for the late response. I have been working on an ugly bug that
is taking more time than anticipated.

My thought was that these findings depend on the hardware memory
prefetcher. The Intel machines, especially starting with Nehalem, have
a pretty impressive prefetcher, to the point where even doing a
'prefetch' on the next node while walking a linked list is not
beneficial.
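
To make the linked-list remark concrete, here is a rough userspace
sketch of the pattern I mean. The struct and function names are made up
for illustration, and __builtin_prefetch is the GCC/Clang builtin, not
the kernel's own helper. Whether the hint helps or hurts has to be
measured on each CPU.

```c
#include <stddef.h>

/* Hypothetical node type, only for illustration. */
struct node {
	struct node *next;
	long value;
};

/*
 * Walk the list and sum the values, issuing a software prefetch hint
 * for the next node before touching the current one. On CPUs with a
 * strong hardware prefetcher this hint may buy nothing.
 */
static long sum_list(const struct node *head)
{
	long total = 0;
	const struct node *n;

	for (n = head; n != NULL; n = n->next) {
		if (n->next != NULL)
			__builtin_prefetch(n->next);
		total += n->value;
	}
	return total;
}
```

Dropping the __builtin_prefetch line gives the plain walk to compare
against.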

Perhaps the way to leverage this is to use different modes depending
on the amount of data? When there is a huge amount, use the old method,
but for small amounts use copy (as it would in theory stay in the cache
longer).
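
Something along these lines is what I have in mind. This is only a
sketch: the enum, the function name and the 1024-byte cutoff are all
invented here, and I am assuming "old method" means the mapping path.
The real cutoff would have to come from measurements on each class of
machine.

```c
#include <stddef.h>

/* Hypothetical mode names; not taken from the zsmalloc code. */
enum access_mode {
	ACCESS_COPY,	/* copy the object into a local buffer */
	ACCESS_MAP	/* map the underlying pages instead */
};

/* Made-up cutoff; a real value would need benchmarking. */
#define COPY_THRESHOLD_BYTES 1024

/*
 * Pick copy for small amounts of data (more likely to stay in the
 * cache) and mapping for anything above the cutoff.
 */
static enum access_mode pick_access_mode(size_t bytes)
{
	return bytes <= COPY_THRESHOLD_BYTES ? ACCESS_COPY : ACCESS_MAP;
}
```

The cutoff could also be picked at boot based on the CPU model rather
than hard-coded.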