I was wrong about the 2GB size and memset: neither the size nor the use of memset is the important factor here. The slowness really comes from any access to memory returned by an ioremap_cache() call whose mapping crosses a physical 4GB memory boundary... I can map 4GB at offset 4GB and memset that, and it is relatively quick, but if I map 2GB at offset 3GB, then memsetting even just the first 1GB of that is extremely slow. Mapping 2GB at 2GB is fast again. Any ideas? Obviously the magical 4GB is the limit of a 32-bit value, but why should that matter on a 64-bit CPU and 64-bit kernel? This shouldn't trigger PAE, should it?

This is on a stock 64-bit Ubuntu 10.04.1 LTS kernel:

# uname -srvmpio
Linux 2.6.32-21-generic #32-Ubuntu SMP Fri Apr 16 08:09:38 UTC 2010 x86_64 unknown unknown GNU/Linux

from /proc/cpuinfo:
model name : Intel(R) Xeon(R) CPU E5645 @ 2.40GHz

On 01 Feb 2011, at 12:23 PM, Jason Nymble wrote:

> I was using memset on a reserved area of memory (64-bit x86 kernel and system), and noticed that as soon as I exceed a size of 2GB, the function becomes extremely slow, e.g. just below 2GB it typically takes about 0.3s, and just above 2GB it takes about 39s to complete...
>
> I tried tracing the eventual function that is called in the kernel, and I think it resolves to the below (even on x86_64, if I'm not mistaken):
>
> static inline void *__memset_generic(void *s, char c, size_t count)
> {
>	int d0, d1;
>	asm volatile("rep\n\t"
>		     "stosb"
>		     : "=&c" (d0), "=&D" (d1)
>		     : "a" (c), "1" (s), "0" (count)
>		     : "memory");
>	return s;
> }
>
> size_t is defined as unsigned long on my platform, but I suspect the d0 and d1 variables above cause problems because they are int... Is this a kernel bug, a known limitation, or something else?
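
For reference, below is a rough sketch of the kind of test module I'm running, so it's clear what I mean by "map 2GB at offset 3GB and memset the first 1GB". It's paraphrased, not my exact code: the physical base, size, and module name are placeholders for my reserved region.

/* boundary_test.c: time a memset over a cached mapping of reserved RAM */
#include <linux/module.h>
#include <linux/io.h>
#include <linux/string.h>
#include <linux/ktime.h>

static int __init boundary_test_init(void)
{
	/* Placeholder region: 2GB window starting at physical 3GB,
	 * so the mapping straddles the 4GB physical boundary. */
	resource_size_t phys = 3ULL << 30;
	unsigned long len = 2UL << 30;
	void __iomem *virt;
	ktime_t t0, t1;

	virt = ioremap_cache(phys, len);
	if (!virt)
		return -ENOMEM;

	t0 = ktime_get();
	/* memset only the first 1GB of the mapping */
	memset((void __force *)virt, 0, 1UL << 30);
	t1 = ktime_get();

	pr_info("memset of 1GB took %lld ms\n",
		ktime_to_ms(ktime_sub(t1, t0)));

	iounmap(virt);
	return 0;
}

static void __exit boundary_test_exit(void)
{
}

module_init(boundary_test_init);
module_exit(boundary_test_exit);
MODULE_LICENSE("GPL");

With phys at 2GB or 4GB the reported time is a fraction of a second; with phys at 3GB (crossing 4GB) the same 1GB memset takes tens of seconds.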