On Wed, 2009-01-21 at 18:58 -0500, Christoph Lameter wrote:
> On Tue, 20 Jan 2009, Zhang, Yanmin wrote:
>
> > kmem_cache skbuff_head_cache's object size is just 256, so it shares the kmem_cache
> > with :0000256. Their order is 1 which means every slab consists of 2 physical pages.
>
> That order can be changed. Try specifying slub_max_order=0 on the kernel
> command line to force an order 0 alloc.
I tried slub_max_order=0 and there is no improvement on this UDP-U-4k issue.
The CPU time of both get_page_from_freelist and __free_pages_ok is still very high.

I checked my instrumentation in the kernel and found that it is caused by allocation/free
of large objects whose size is more than PAGE_SIZE. Here the order is 1.

The right free callchain is __kfree_skb => skb_release_all => skb_release_data.

So this case is not the issue where a batch of allocation/free might defeat the
partial page functionality.

'#slabinfo -AD' can't show statistics for large object allocation/free. Can we add
such info? That would be more helpful.

In addition, I didn't find this issue with TCP stream testing.

>
> The queues of the page allocator are of limited use due to their overhead.
> Order-1 allocations can actually be 5% faster than order-0. order-0 makes
> sense if pages are pushed rapidly to the page allocator and are then
> reissues elsewhere. If there is a linear consumption then the page
> allocator queues are just overhead.
>
> > Page allocator has an array at zone_pcp(zone, cpu)->pcp to keep a page buffer for page order 0.
> > But here skbuff_head_cache's order is 1, so UDP-U-4k couldn't benefit from the page buffer.
>
> That usually does not matter because of partial list avoiding page
> allocator actions.
>
> > SLQB has no such issue, because:
> > 1) SLQB has a percpu freelist. Free objects are put to the list firstly and can be picked up
> > later on quickly without lock. A batch parameter to control the free object recollection is mostly
> > 1024.
> > 2) SLQB slab order mostly is 0, so although sometimes it calls alloc_pages/free_pages, it can
> > benefit from zone_pcp(zone, cpu)->pcp page buffer.
> >
> > So SLUB need resolve such issues that one process allocates a batch of objects and another process
> > frees them batchly.
>
> SLUB has a percpu freelist but its bounded by the basic allocation unit.
> You can increase that by modifying the allocation order. Writing a 3 or 5
> into the order value in /sys/kernel/slab/xxx/order would do the trick.
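
To illustrate why an object larger than PAGE_SIZE ends up as an order-1 page
allocation, and so misses the order-0 per-cpu page buffer, here is a minimal
userspace sketch in C. It is not kernel code: order_for_size() is a hypothetical
helper that only mimics the round-up-to-a-power-of-two-pages arithmetic, and the
4 KiB page size is an assumption.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages. Real values depend on the architecture. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical helper: return the smallest order such that
 * (SKETCH_PAGE_SIZE << order) >= size. size must be greater than 0.
 */
static unsigned int order_for_size(size_t size)
{
    /* Round the byte count up to a whole number of pages. */
    size_t pages = (size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
    unsigned int order = 0;

    /* Grow the order until 2^order pages cover the request. */
    while (((size_t)1 << order) < pages)
        order++;

    return order;
}
```

With these assumptions, order_for_size(4096) returns 0 while order_for_size(4097)
returns 1, so a buffer holding a 4k payload plus any extra bytes no longer fits in
a single page.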