On Wed, 21 Mar 2018, Christopher Lameter wrote:

> On Wed, 21 Mar 2018, Mikulas Patocka wrote:
>
> > So, what would you recommend for allocating 640KB objects while minimizing
> > wasted space?
> > * alloc_pages - rounds up to the next power of two
> > * kmalloc - rounds up to the next power of two
> > * alloc_pages_exact - O(n*log n) complexity; and causes memory
> >   fragmentation if used excessively
> > * vmalloc - horrible performance (modifies page tables and that causes
> >   synchronization across all CPUs)
> >
> > anything else?
>
> Need to find it but there is a way to allocate N pages in sequence
> somewhere. Otherwise mempools are something that would work.

There's also the contiguous memory allocator (CMA), but it needs its memory
to be reserved at boot time. It is intended for misdesigned hardware devices
that need contiguous memory for DMA. As it's intended for one-time
allocations when loading drivers, it lacks the performance and scalability
of the slab cache and alloc_pages.

> > > > > What kind of problem could be caused here?
> > > >
> > > > Unlocked accesses are generally considered bad. For example, see this
> > > > piece of code in calculate_sizes:
> > > >         s->allocflags = 0;
> > > >         if (order)
> > > >                 s->allocflags |= __GFP_COMP;
> > > >
> > > >         if (s->flags & SLAB_CACHE_DMA)
> > > >                 s->allocflags |= GFP_DMA;
> > > >
> > > >         if (s->flags & SLAB_RECLAIM_ACCOUNT)
> > > >                 s->allocflags |= __GFP_RECLAIMABLE;
> > > >
> > > > If you are running this while the cache is in use (i.e. when the user
> > > > writes /sys/kernel/slab/<cache>/order), then other processes will see
> > > > invalid s->allocflags for a short time.
> > >
> > > Calculating sizes is done when the slab has only a single accessor. Thus
> > > no locking is needed.
> >
> > The calculation is done whenever someone writes to
> > "/sys/kernel/slab/*/order"
>
> But the flags you are mentioning do not change and the size of the object
> does not change.
> What changes is the number of objects in the slab page.

See this code again:
> > >         s->allocflags = 0;
> > >         if (order)
> > >                 s->allocflags |= __GFP_COMP;
> > >
> > >         if (s->flags & SLAB_CACHE_DMA)
> > >                 s->allocflags |= GFP_DMA;
> > >
> > >         if (s->flags & SLAB_RECLAIM_ACCOUNT)
> > >                 s->allocflags |= __GFP_RECLAIMABLE;

When this function is called, the value s->allocflags does change. At the
end, s->allocflags holds the same value as before, but it changes
temporarily.

For example, if someone creates a slab cache with the flag SLAB_CACHE_DMA,
and he allocates an object from this cache and this allocation races with
the user writing to /sys/kernel/slab/cache/order, then the allocator can
for a small period of time see "s->allocflags == 0" and allocate a non-DMA
page. That is a bug.

Mikulas
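
To illustrate the point about s->allocflags being temporarily zero, here is a
minimal userspace sketch of one possible way to avoid the window: build the
value in a local variable and publish it with a single store. This is not a
proposed kernel patch. The struct, the DEMO_* constants and both function
names are hypothetical stand-ins with made-up values; in the kernel the
single store would presumably be something like WRITE_ONCE() paired with
READ_ONCE() on the allocation path.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins; the bit values are illustrative, not the kernel's. */
#define DEMO_GFP_COMP             0x1u
#define DEMO_GFP_DMA              0x2u
#define DEMO_GFP_RECLAIMABLE      0x4u
#define DEMO_SLAB_CACHE_DMA       0x1u
#define DEMO_SLAB_RECLAIM_ACCOUNT 0x2u

struct demo_cache {
	unsigned int flags;              /* fixed when the cache is created */
	_Atomic unsigned int allocflags; /* read concurrently by allocators */
};

/* Compute the flag word locally; readers cannot observe this variable. */
static unsigned int demo_compute_allocflags(unsigned int flags, int order)
{
	unsigned int v = 0;

	if (order)
		v |= DEMO_GFP_COMP;
	if (flags & DEMO_SLAB_CACHE_DMA)
		v |= DEMO_GFP_DMA;
	if (flags & DEMO_SLAB_RECLAIM_ACCOUNT)
		v |= DEMO_GFP_RECLAIMABLE;
	return v;
}

/*
 * Publish the result with one store, so a concurrent reader sees either
 * the old value or the new one, never the intermediate 0.
 */
static void demo_update_allocflags(struct demo_cache *s, int order)
{
	unsigned int v = demo_compute_allocflags(s->flags, order);

	atomic_store_explicit(&s->allocflags, v, memory_order_relaxed);
}
```

With this shape, a cache created with DEMO_SLAB_CACHE_DMA keeps the DMA bit
visible to readers for the whole duration of the order update.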