* Huang Ying (ying.huang@xxxxxxxxx) wrote:
> On 04/29/2011 09:11 AM, Mathieu Desnoyers wrote:
> > * Huang Ying (ying.huang@xxxxxxxxx) wrote:
> >> Hi, Mathieu,
> >>
> >> Thanks for your comments.
> >>
> >> On 04/28/2011 10:37 PM, Mathieu Desnoyers wrote:
> >>> * Huang Ying (ying.huang@xxxxxxxxx) wrote:
> >> [snip]
> >>>>
> >>>> +/**
> >>>> + * gen_pool_for_each_chunk - iterate over chunks of generic memory pool
> >>>> + * @chunk: the struct gen_pool_chunk * to use as a loop cursor
> >>>> + * @pool: the generic memory pool
> >>>> + *
> >>>> + * Not lockless, proper mutual exclusion is needed to use this macro
> >>>> + * with other gen_pool function simultaneously.
> >>>> + */
> >>>> +#define gen_pool_for_each_chunk(chunk, pool)			\
> >>>> +	list_for_each_entry_rcu(chunk, &(pool)->chunks, next_chunk)
> >>>
> >>> Is it just me, or is this macro never used? Maybe you should consider
> >>> removing it.
> >>
> >> This macro is not used in this patch. But it is used in 4/4 of the
> >> patchset to free the backing pages before destroying the pool.
> >
> > Depending on how frequently you want to use it, you might want to use
> > list_for_each_entry_rcu directly rather than a macro wrapper. E.g. for
> > 2-3 uses, adding a macro just obfuscates the code IMHO (e.g. you don't
> > know it iterates on an RCU list by looking at the caller code).
>
> Yes. gen_pool_for_each_chunk() is not a good wrapper. I just don't want
> to expose too much implementation detail to users; after all, we are
> working on library code. Maybe something like below is better?
>
> void gen_pool_for_each_chunk(struct gen_pool *pool, void (*func)(struct
> gen_pool *pool, struct gen_pool_chunk *chunk))
> {
> 	rcu_read_lock();
> 	list_for_each_entry_rcu(chunk, &pool->chunks, next_chunk)
> 		func(pool, chunk);
> 	rcu_read_unlock();
> }

If it is expected to be exposed to other parts of the kernel, then
indeed we should not expect callers to magically know they must hold
the RCU read-side lock. I'm not sure whether this iterator is necessary
though; just a comment could suffice.
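For reference, a self-contained version of the iterator Huang sketches
above: the chunk cursor declaration, which the snippet omits, is added,
and the callback's execution context is spelled out. The signature
follows the mail's proposal and may differ from whatever is eventually
merged:

#include <linux/genalloc.h>
#include <linux/rculist.h>

/*
 * Call func() on each chunk of the pool.  func() runs under the RCU
 * read-side lock, so it must not sleep.  Mutual exclusion against
 * gen_pool_add()/gen_pool_destroy() is still the caller's
 * responsibility, as with the macro form above.
 */
void gen_pool_for_each_chunk(struct gen_pool *pool,
		void (*func)(struct gen_pool *pool,
			     struct gen_pool_chunk *chunk))
{
	struct gen_pool_chunk *chunk;

	rcu_read_lock();
	list_for_each_entry_rcu(chunk, &pool->chunks, next_chunk)
		func(pool, chunk);
	rcu_read_unlock();
}

A 4/4-style caller would pass a callback that releases each chunk's
backing pages just before calling gen_pool_destroy().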
>
> >>
> >> [snip]
> >>>> @@ -108,43 +226,50 @@ EXPORT_SYMBOL(gen_pool_destroy);
> >>>>   * @size: number of bytes to allocate from the pool
> >>>>   *
> >>>>   * Allocate the requested number of bytes from the specified pool.
> >>>> - * Uses a first-fit algorithm.
> >>>> + * Uses a first-fit algorithm. Can not be used in NMI handler on
> >>>> + * architectures without NMI-safe cmpxchg implementation.
> >>>>   */
> >>>>  unsigned long gen_pool_alloc(struct gen_pool *pool, size_t size)
> >>>>  {
> >>>> -	struct list_head *_chunk;
> >>>>  	struct gen_pool_chunk *chunk;
> >>>> -	unsigned long addr, flags;
> >>>> +	unsigned long addr;
> >>>>  	int order = pool->min_alloc_order;
> >>>> -	int nbits, start_bit, end_bit;
> >>>> +	int nbits, start_bit = 0, end_bit, remain;
> >>>> +
> >>>> +#ifndef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
> >>>> +	BUG_ON(in_nmi());
> >>>> +#endif
> >>>>
> >>>>  	if (size == 0)
> >>>>  		return 0;
> >>>>
> >>>>  	nbits = (size + (1UL << order) - 1) >> order;
> >>>> -
> >>>> -	read_lock(&pool->lock);
> >>>> -	list_for_each(_chunk, &pool->chunks) {
> >>>> -		chunk = list_entry(_chunk, struct gen_pool_chunk, next_chunk);
> >>>> +	rcu_read_lock();
> >>>> +	list_for_each_entry_rcu(chunk, &pool->chunks, next_chunk) {
> >>>> +		if (size > atomic_read(&chunk->avail))
> >>>> +			continue;
> >>>>
> >>>>  		end_bit = (chunk->end_addr - chunk->start_addr) >> order;
> >>>> -
> >>>> -		spin_lock_irqsave(&chunk->lock, flags);
> >>>> -		start_bit = bitmap_find_next_zero_area(chunk->bits, end_bit, 0,
> >>>> -						nbits, 0);
> >>>> -		if (start_bit >= end_bit) {
> >>>> -			spin_unlock_irqrestore(&chunk->lock, flags);
> >>>> +retry:
> >>>> +		start_bit = bitmap_find_next_zero_area(chunk->bits, end_bit,
> >>>> +						       start_bit, nbits, 0);
> >>>> +		if (start_bit >= end_bit)
> >>>>  			continue;
> >>>> +		remain = bitmap_set_ll(chunk->bits, start_bit, nbits);
> >>>> +		if (remain) {
> >>>> +			remain = bitmap_clear_ll(chunk->bits, start_bit,
> >>>> +						 nbits - remain);
> >>>> +			BUG_ON(remain);
> >>>> +			goto retry;
> >>>>  		}
> >>>>
> >>>>  		addr = chunk->start_addr + ((unsigned long)start_bit << order);
> >>>> -
> >>>> -		bitmap_set(chunk->bits, start_bit, nbits);
> >>>> -		spin_unlock_irqrestore(&chunk->lock, flags);
> >>>> -		read_unlock(&pool->lock);
> >>>> +		size = nbits << order;
> >>>> +		atomic_sub(size, &chunk->avail);
> >>>> +		rcu_read_unlock();
> >>>
> >>> I don't really like seeing a rcu_read_unlock() within an RCU list
> >>> iteration (even if it comes right before a "return"). Doing:
> >>>
> >>> unsigned long addr = 0;
> >>>
> >>> rcu_read_lock();
> >>> list_for_each_entry_rcu(chunk, &pool->chunks, next_chunk) {
> >>> 	if (...)
> >>> 		continue;
> >>> 	...
> >>> 	addr = ...;
> >>> 	break;
> >>> }
> >>> rcu_read_unlock();
> >>> return addr;
> >>>
> >>> would be more symmetric, and would remove one return path, which makes
> >>> the code easier to modify in the future.
> >>
> >> Unlocking inside a loop is common in the Linux kernel. Sometimes it
> >> makes the code cleaner (but not always). Yes, for this case we can
> >> avoid the unlock in the loop easily, but for the next case it is not
> >> so clean.
> >
> > See comment below,
> >
> >>
> >>>>  		return addr;
> >>>>  	}
> >>>> -	read_unlock(&pool->lock);
> >>>> +	rcu_read_unlock();
> >>>>  	return 0;
> >>>>  }
> >>>>  EXPORT_SYMBOL(gen_pool_alloc);
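For reference, bitmap_set_ll() and bitmap_clear_ll() are not quoted in
this excerpt. A sketch of what the cmpxchg-based set side presumably
looks like, reconstructed here to make the retry/rollback logic above
easier to follow (an assumption about the patch's helpers, not quoted
from it):

#include <linux/bitops.h>
#include <linux/bitmap.h>

/*
 * Atomically set the bits of mask_to_set in *addr, or fail without
 * side effects if any of them is already set.
 */
static int set_bits_ll(unsigned long *addr, unsigned long mask_to_set)
{
	unsigned long val, nval;

	nval = *addr;
	do {
		val = nval;
		if (val & mask_to_set)
			return -EBUSY;	/* lost the race on these bits */
		cpu_relax();
	} while ((nval = cmpxchg(addr, val, val | mask_to_set)) != val);

	return 0;
}

/*
 * Set nr bits starting at start.  On a word-level race, returns the
 * number of bits not yet set; the caller (gen_pool_alloc() above)
 * rolls back the nbits - remain bits already set via bitmap_clear_ll()
 * and retries the search from start_bit.
 */
static int bitmap_set_ll(unsigned long *map, int start, int nr)
{
	unsigned long *p = map + BIT_WORD(start);
	const int size = start + nr;
	int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG);
	unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);

	while (nr - bits_to_set >= 0) {
		if (set_bits_ll(p, mask_to_set))
			return nr;
		nr -= bits_to_set;
		bits_to_set = BITS_PER_LONG;
		mask_to_set = ~0UL;
		p++;
	}
	if (nr) {
		mask_to_set &= BITMAP_LAST_WORD_MASK(size);
		if (set_bits_ll(p, mask_to_set))
			return nr;
	}

	return 0;
}

This is also where the NMI restriction comes from: on architectures
that emulate cmpxchg with locks, an NMI arriving while the emulation
lock is held could deadlock, hence the BUG_ON(in_nmi()) guards above.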
> >>>> @@ -155,33 +280,73 @@ EXPORT_SYMBOL(gen_pool_alloc);
> >>>>   * @addr: starting address of memory to free back to pool
> >>>>   * @size: size in bytes of memory to free
> >>>>   *
> >>>> - * Free previously allocated special memory back to the specified pool.
> >>>> + * Free previously allocated special memory back to the specified
> >>>> + * pool. Can not be used in NMI handler on architectures without
> >>>> + * NMI-safe cmpxchg implementation.
> >>>>   */
> >>>>  void gen_pool_free(struct gen_pool *pool, unsigned long addr, size_t size)
> >>>>  {
> >>>> -	struct list_head *_chunk;
> >>>>  	struct gen_pool_chunk *chunk;
> >>>> -	unsigned long flags;
> >>>>  	int order = pool->min_alloc_order;
> >>>> -	int bit, nbits;
> >>>> +	int start_bit, nbits, remain;
> >>>>
> >>>> -	nbits = (size + (1UL << order) - 1) >> order;
> >>>> -
> >>>> -	read_lock(&pool->lock);
> >>>> -	list_for_each(_chunk, &pool->chunks) {
> >>>> -		chunk = list_entry(_chunk, struct gen_pool_chunk, next_chunk);
> >>>> +#ifndef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
> >>>> +	BUG_ON(in_nmi());
> >>>> +#endif
> >>>>
> >>>> +	nbits = (size + (1UL << order) - 1) >> order;
> >
> > you could add:
> >
> > 	remain = nbits;
> >
> >>>> +	rcu_read_lock();
> >>>> +	list_for_each_entry_rcu(chunk, &pool->chunks, next_chunk) {
> >>>>  		if (addr >= chunk->start_addr && addr < chunk->end_addr) {
> >>>>  			BUG_ON(addr + size > chunk->end_addr);
> >>>> -			spin_lock_irqsave(&chunk->lock, flags);
> >>>> -			bit = (addr - chunk->start_addr) >> order;
> >>>> -			while (nbits--)
> >>>> -				__clear_bit(bit++, chunk->bits);
> >>>> -			spin_unlock_irqrestore(&chunk->lock, flags);
> >>>> -			break;
> >>>> +			start_bit = (addr - chunk->start_addr) >> order;
> >
> > You could turn this:
> >
> >>>> +			remain = bitmap_clear_ll(chunk->bits, start_bit, nbits);
> >>>> +			BUG_ON(remain);
> >>>> +			size = nbits << order;
> >>>> +			atomic_add(size, &chunk->avail);
> >
> > into:
> >
> > 	remain = bitmap_clear_ll(chunk->bits, start_bit, nbits);
> > 	size = nbits << order;
> > 	atomic_add(size, &chunk->avail);
> > 	break;
> >
> >
> >>>> +			rcu_read_unlock();
> >>>
> >>> Same comment as above applies here.
> >>
> >> It is harder to remove the unlock from the loop here. An extra
> >> variable would be needed to indicate that something was freed from
> >> the pool. Do you think it is cleaner to just keep the unlock in the
> >> loop here?
> >>
> >> Best Regards,
> >> Huang Ying
> >>
> >>> +			return;
> >>> 		}
> >>> 	}
> >
> > And turn this:
> >
> >>> -	BUG_ON(nbits > 0);
> >>> -	read_unlock(&pool->lock);
> >>> +	rcu_read_unlock();
> >>> +	BUG();
> >
> > into:
> >
> > 	BUG_ON(remain);
> > 	rcu_read_unlock();
> >
> > Does that look OK to you? On the plus side, you end up having a single
> > BUG_ON() in the function.
>
> I am afraid this makes the code a little harder to understand. Why do
> you hate unlocking in a loop so much? It is common in the kernel and I
> think most kernel developers are familiar with it.

I'm fine either way for this function, no strong opinion on this one.

Thanks,

Mathieu

>
> Best Regards,
> Huang Ying

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
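For reference, gen_pool_free() with all of the suggestions above
applied would look roughly like this; a sketch assembled from the
quoted fragments, not taken from any posted patch:

void gen_pool_free(struct gen_pool *pool, unsigned long addr, size_t size)
{
	struct gen_pool_chunk *chunk;
	int order = pool->min_alloc_order;
	int start_bit, nbits, remain;

#ifndef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
	BUG_ON(in_nmi());
#endif

	nbits = (size + (1UL << order) - 1) >> order;
	remain = nbits;	/* stays non-zero if no chunk matches addr */
	rcu_read_lock();
	list_for_each_entry_rcu(chunk, &pool->chunks, next_chunk) {
		if (addr >= chunk->start_addr && addr < chunk->end_addr) {
			BUG_ON(addr + size > chunk->end_addr);
			start_bit = (addr - chunk->start_addr) >> order;
			remain = bitmap_clear_ll(chunk->bits, start_bit, nbits);
			size = nbits << order;
			atomic_add(size, &chunk->avail);
			break;
		}
	}
	/* one BUG_ON() covers both "no chunk found" and a failed clear */
	BUG_ON(remain);
	rcu_read_unlock();
}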