Junio,

Sorry for the repeat, accidentally sent as HTML, rejected by the list.

On Wednesday, June 22, 2011, Junio C Hamano wrote:

David Barr <davidbarr@xxxxxxxxxx> writes:

> +void *pool_alloc(struct mem_pool *pool, size_t len, size_t *id_out)
> +{
> +	static size_t id = 1;

Does this mean that even though you can have more than one mem_pool
object active in the system, you won't have id collision throughout
the system?  That is a nice property (e.g. given an ID that does not
belong to a pool, you won't risk returning a wrong chunk of <mem,len>
pair from pool_ptr()), but is there a downside to using a
function-scope static like this, I wonder?

This is an artifact that I missed in refactoring my experimental code.
This ought to be a field in struct mem_pool.  My intent was for ids to
be unique and contiguous within a pool.

For example, if I have two pools A and B, and call pool_alloc on A and
then on B and then on A again, A's space[0] will have the first and
the third object, with id=1 and id=3.  How does this interact with
your implementation of pool_ptr(), which seems to assume that ids are
consecutive within a single pool->space[]?

It would interact very poorly.

> +	size_t n;
> +	void *r;
> +
> +	if ((pool->end - pool->next_free >= len) &&
> +	    (pool->len_free >= sizeof_varint(len)))
> +		n = pool->nr - 1;

Help me make sure I understand what is going on here.  A mem-pool has
pool->nr chunks (and it can grow as more memory is asked from this
pool).  A new memory request is satisfied by carving out of
pool->space[n] (where n is the "newest chunk" in the pool).
pool->len[n] is a fixed-size byte array and stores the length of each
memory block carved out of pool->space[n] as a sequence of varints.
If pool->space[n] has enough space to fit "len" bytes, and if
pool->len[n] still has enough space to record the length, you use the
current chunk; otherwise (i.e. the else clause) you add a new chunk.
Am I with you so far?

Absolutely.
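For concreteness, here is a toy sketch of the length bookkeeping,
assuming a plain base-128 varint (7 payload bits per byte, high bit as
a continuation flag).  The patch's actual sizeof_varint() may use a
different encoding, and encode_varint() here is a hypothetical helper:

```c
#include <stddef.h>

/* Bytes needed to store "len" as a base-128 varint:
 * 7 payload bits per byte, high bit set on all but the last byte. */
static size_t sizeof_varint(size_t len)
{
	size_t n = 1;
	while (len >= 0x80) {
		len >>= 7;
		n++;
	}
	return n;
}

/* Append "len" to a length buffer, returning bytes written.
 * This mirrors how pool->len[n] would record each carved-out block. */
static size_t encode_varint(unsigned char *buf, size_t len)
{
	size_t n = 0;
	while (len >= 0x80) {
		buf[n++] = 0x80 | (len & 0x7f);
		len >>= 7;
	}
	buf[n++] = (unsigned char)len;
	return n;
}
```

Under this assumption a 128-byte allocation needs
sizeof_varint(128) == 2 bytes of pool->len[] space, which is where the
16k * 2 bytes = 32kB per 2MB chunk arithmetic below comes from.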
With the chunk_size of 2MB, you would fit roughly 16k allocation
requests for 128-byte memory, and you would need two bytes to express
the length of one piece of memory in your varint encoding, i.e. you
would need to size an element of pool->len[] to 32kB if you wanted to
store 16k allocations of 128 bytes for a chunk.  This would all depend
on the expected distribution of request sizes, but it somehow feels
wasteful to be limited by both space[] and len[].  If you chose a
sizeof(*pool->len) that is too small for the workload, wouldn't you
end up allocating many 2MB space[], only to use potentially very
initial parts of them before len[] fills up?

Yes, this is a weakness of this iteration of the design.

> +	if (id_out)
> +		*id_out = id;
> +	id++;
> +
> +	char *t = &pool->len[n][sizeof(*pool->len) - pool->len_free];

Please avoid decl-after-statement.

Thanks for the reminder, will clean up some more.

> +	size_t n = pool->nr * id / pool->first_id[pool->nr - 1];
> +	if (n >= pool->nr - 1)
> +		n = pool->nr - 1;
> +	while (n && pool->first_id[n] > id)
> +		n--;
> +	while (n + 1 < pool->nr && pool->first_id[n + 1] <= id)
> +		n++;

I was about to say "bsearch?", but perhaps it is not worth it.

A linear guesstimate is typically off by a small amount, so linear
search is fine.  Also, the index of ids is contiguous, so linear
search has good cache locality.  If I address the next concern in this
review, this search will no longer be necessary.

> +struct mem_pool {
> +	size_t *first_id;
> +	char **space;
> +	char (*len)[sizeof(size_t) + sizeof(char*)];

Each element of pool->len[] is just a byte array that stores varints,
no?  It is very misleading to specify its size as
sizeof(size_t) + sizeof(char*), as if you would store a
"struct { size_t some; char *thing; }" here.

Instead of having two independently depleted byte buffers (space[] and
len[]), I wonder if it would be more space efficient (without being
less processing efficient) to use a single buffer space.
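To illustrate that single-buffer idea, here is a hedged sketch of the
layout and lookup: each record is a varint length immediately followed
by its payload, and finding an object means walking records from the
start of the chunk.  decode_varint() and chunk_nth() are hypothetical
names, and plain base-128 varints are assumed:

```c
#include <stddef.h>

/* Decode a base-128 varint at *p and advance *p past it. */
static size_t decode_varint(const unsigned char **p)
{
	const unsigned char *q = *p;
	size_t val = 0;
	int shift = 0;
	while (*q & 0x80) {
		val |= (size_t)(*q++ & 0x7f) << shift;
		shift += 7;
	}
	val |= (size_t)*q++ << shift;
	*p = q;
	return val;
}

/* Return the idx-th payload in a chunk laid out as
 * [len][payload][len][payload]...  */
static const unsigned char *chunk_nth(const unsigned char *chunk,
				      size_t idx)
{
	const unsigned char *p = chunk;
	while (idx--) {
		size_t len = decode_varint(&p);
		p += len; /* skip payload to reach the next length */
	}
	decode_varint(&p); /* step over the target's own length */
	return p;
}
```

Note that every lookup re-reads each earlier length, and each skip over
a large payload lands on a different cache line; that is the locality
trade-off at stake.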
Your pool_ptr() would start at the beginning of pool->space[n], decode
a varint and take it as a length; if that is not the object you are
looking for, skip that many bytes (i.e. the payload immediately
follows the length) to the next object, and so on.

I have already investigated this arrangement; it has very poor
locality of access.  For objects <32 bytes long it's not too bad,
since typically 2 bytes of a 64-byte cache line would be read
consecutively.  For larger objects this is pathological cache
behavior.  On the other hand, the current design means that the entire
sequence of lengths will fit on a single >=16-byte cache line.

Also, what kind of alignment guarantee would we _want_ to give the
callers?  As far as I can tell, this implementation does not guarantee
any alignment.

No alignment guarantee is provided.  An interesting possibility is to
provide a guarantee and use the padding bytes for metadata.

Although on second reading I think I must have misread: I've been
investigating fixing the number of objects per chunk rather than
fixing the space for lengths.  I have an inkling that I already
considered this and found that it introduced a nasty corner case.
This investigation is ongoing.

-- 
David Barr.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html