On Mon, Nov 5, 2018 at 2:41 PM Bart Van Assche <bvanassche@xxxxxxx> wrote: > > On Mon, 2018-11-05 at 23:14 +0100, Rasmus Villemoes wrote: > > Won't that pessimize the cases where gfp is a constant to actually do > > the table lookup, and add 16 bytes to every translation unit? > > > > Another option is to add a fake KMALLOC_DMA_RECLAIM so the > > kmalloc_caches[] array has size 4, then assign the same dma > > kmalloc_cache pointer to [2][i] and [3][i] (so that costs perhaps a > > dozen pointers in .data), and then just compute kmalloc_type() as > > > > ((flags & __GFP_RECLAIMABLE) >> someshift) | ((flags & __GFP_DMA) >> > > someothershift). > > > > Perhaps one could even shuffle the GFP flags so the two shifts are the same. > > How about this version, still untested? My compiler is able to evaluate > the switch expression if the argument is constant. > > static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags) > { > - int is_dma = 0; > - int type_dma = 0; > - int is_reclaimable; > + unsigned int dr = !!(flags & __GFP_RECLAIMABLE); > > #ifdef CONFIG_ZONE_DMA > - is_dma = !!(flags & __GFP_DMA); > - type_dma = is_dma * KMALLOC_DMA; > + dr |= !!(flags & __GFP_DMA) << 1; > #endif > > - is_reclaimable = !!(flags & __GFP_RECLAIMABLE); > - > /* > * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return > * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE > */ > - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM; > + switch (dr) { > + default: > + case 0: > + return 0; > + case 1: > + return KMALLOC_RECLAIM; > + case 2: > + case 3: > + return KMALLOC_DMA; > + } > } > > Bart. Doesn't this defeat the whole point of the code which I thought was to avoid conditional jumps and branches? Also why would you bother with the "dr" value when you could just mask the flags value and switch on that directly?