On Sunday 22 June 2014 at 00:15:49, Ivan Shapovalov wrote:
> On Saturday 21 June 2014 at 13:20:18, Edward Shishkin wrote:
> > On 06/21/2014 12:35 AM, Ivan Shapovalov wrote:
> > > [...]
> > > Also, I would like this code to be given a review. :)
> >
> > Great! Looks nice to me, thanks!
> > There are 2 issues, though...
> >
> > 1) kmalloc/kfree of a huge number of 32-byte chunks (blocknr_list entries)
> > is suboptimal. There is a special low-level memory allocator for such
> > purposes. Take a look at how we initialize the so-called "slab caches" for
> > jnodes (_jnode_slab), atoms (_atom_slab), etc., and allocate memory from
> > them (kmem_cache_alloc()).
> >
> > 2) A lot of blocknr_list entries are allocated at flush time, when the
> > high-level allocator (txmod.c) makes "relocation decisions" (especially
> > when txmod=wa). The problem is that the flush (with the following commit)
> > is usually the file system's response to memory pressure notifications,
> > when additional memory allocation is not desirable.
> >
> > I think that with (1) fixed we'll include the discard support (if
> > everything is OK in the next 1-2 weeks).
> >
> > As to (2): that is a common problem of all Linux subsystems which need
> > memory in order to free memory. It is unresolvable; however, we can
> > improve the situation. It would be nice to implement a per-atom pool of
> > memory (as a list of kmalloc-ed buffers with "cursors"), with an optional
> > possibility to pre-allocate 1-2 such buffers at atom initialization time.
> > But this is for the future...
> >
> > I don't see other urgent improvements. Yes, the overall scalability of
> > rb-trees is better, as we found; however, merging rb-trees is more
> > expensive, and atom fusion is not a background process, so it can lead to
> > a performance drop. There are rb-trees with fingers, but I haven't seen an
> > implementation of them in C (it may not be so simple).
> >
> > Thanks!
> > Edward.
>
> Thanks for the review!
>
> (1) surely seems trivial. Do we need something similar for blocknr_set as
> well?

...I suppose yes, because there is at least one blocknr_set_entry per
txn_atom, while the latter is already allocated with the slab allocator.

--
Ivan Shapovalov / intelfx /

> For (2) I don't quite understand you. How should such a pool be organized?
>
> Do you mean `blocknr_list_entry **pool` of size N * sizeof(void*) filled
> with kmem_cache_alloc()'d pointers, or just `blocknr_list_entry *pool` of
> size N * sizeof(blocknr_list_entry)?
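
Going back to point (1), here is a minimal sketch of what a dedicated slab
cache for blocknr_list entries could look like, following the _jnode_slab /
_atom_slab pattern Edward mentions. The entry layout and function names are
illustrative only, not the actual reiser4 definitions (two block numbers plus
a list head happen to add up to the 32 bytes mentioned above):

    #include <linux/types.h>
    #include <linux/slab.h>
    #include <linux/list.h>

    /* Illustrative entry layout: 8 + 8 + 16 = 32 bytes on 64-bit. */
    struct blocknr_list_entry {
            u64 start;                      /* first block of the extent */
            u64 len;                        /* number of blocks */
            struct list_head link;          /* linkage into the per-atom list */
    };

    static struct kmem_cache *blocknr_list_slab;

    /* Called once at filesystem module init. */
    int blocknr_list_init_static(void)
    {
            blocknr_list_slab = kmem_cache_create("blocknr_list_entry",
                                                  sizeof(struct blocknr_list_entry),
                                                  0, SLAB_HWCACHE_ALIGN, NULL);
            return blocknr_list_slab != NULL ? 0 : -ENOMEM;
    }

    /* Called once at filesystem module exit (only after a successful init). */
    void blocknr_list_done_static(void)
    {
            kmem_cache_destroy(blocknr_list_slab);
    }

    /* The kmalloc(sizeof(...)) / kfree() pairs then become: */
    static struct blocknr_list_entry *blocknr_list_entry_alloc(gfp_t gfp)
    {
            return kmem_cache_alloc(blocknr_list_slab, gfp);
    }

    static void blocknr_list_entry_free(struct blocknr_list_entry *entry)
    {
            kmem_cache_free(blocknr_list_slab, entry);
    }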
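
On the pool question quoted above, the two layouts spelled out in code, purely
for illustration (the function and `n` are placeholders, and the entry type is
the sketch from the previous block):

    /* Contrast of the two layouts asked about above. */
    static int pool_layouts_example(size_t n)
    {
            /* (a) n * sizeof(void *) bytes of pointers; each entry would
             *     still be allocated separately, e.g. kmem_cache_alloc(): */
            struct blocknr_list_entry **pool_a =
                    kcalloc(n, sizeof(*pool_a), GFP_KERNEL);

            /* (b) n * sizeof(struct blocknr_list_entry) bytes, i.e. one
             *     contiguous array of entries handed out by the pool itself: */
            struct blocknr_list_entry *pool_b =
                    kcalloc(n, sizeof(*pool_b), GFP_KERNEL);

            if (!pool_a || !pool_b) {
                    kfree(pool_a);
                    kfree(pool_b);
                    return -ENOMEM;
            }

            /* ... use one or the other ... */

            kfree(pool_a);
            kfree(pool_b);
            return 0;
    }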
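
One possible reading of Edward's "list of kmalloc-ed buffers with cursors" is
neither of those two layouts, but variable-size buffers carved up sequentially,
with 1-2 of them pre-allocated at atom initialization so that flush-time
allocations usually hit pre-reserved space. This is only a sketch under that
assumption; all names are hypothetical:

    #include <linux/kernel.h>
    #include <linux/slab.h>
    #include <linux/list.h>

    #define ATOM_POOL_BUF_SIZE      PAGE_SIZE

    struct atom_pool_buf {
            struct list_head link;  /* linkage into the atom's buffer list */
            size_t cursor;          /* offset of the first free byte in data[] */
            char data[];
    };

    struct atom_pool {
            struct list_head bufs;  /* all buffers owned by this atom */
    };

    /* Pre-allocate one buffer at atom initialization time, before any
     * memory-pressure-driven flush needs it. */
    static int atom_pool_init(struct atom_pool *pool, gfp_t gfp)
    {
            struct atom_pool_buf *buf = kmalloc(ATOM_POOL_BUF_SIZE, gfp);

            INIT_LIST_HEAD(&pool->bufs);
            if (!buf)
                    return -ENOMEM;
            buf->cursor = 0;
            list_add(&buf->link, &pool->bufs);
            return 0;
    }

    /* Carve 'size' bytes off the current buffer (size is assumed to be much
     * smaller than a buffer); only when the pre-allocated space runs out do
     * we fall back to a fresh kmalloc(). */
    static void *atom_pool_alloc(struct atom_pool *pool, size_t size, gfp_t gfp)
    {
            struct atom_pool_buf *buf =
                    list_first_entry(&pool->bufs, struct atom_pool_buf, link);
            size_t avail = ATOM_POOL_BUF_SIZE - sizeof(*buf);
            void *p;

            size = ALIGN(size, sizeof(long));
            if (buf->cursor + size > avail) {
                    buf = kmalloc(ATOM_POOL_BUF_SIZE, gfp);
                    if (!buf)
                            return NULL;
                    buf->cursor = 0;
                    list_add(&buf->link, &pool->bufs);
            }
            p = buf->data + buf->cursor;
            buf->cursor += size;
            return p;
    }

    /* Everything is released wholesale when the atom commits. */
    static void atom_pool_done(struct atom_pool *pool)
    {
            struct atom_pool_buf *buf, *tmp;

            list_for_each_entry_safe(buf, tmp, &pool->bufs, link)
                    kfree(buf);
    }

The wholesale free at commit time matches the atom's lifetime and avoids
per-entry kfree(); whether this or a pre-filled slab-backed pool is preferable
is exactly the open question here.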