On Wed, 10 Jun 2020 22:36:00 +0200 <vrd@xxxxxxxxxx> wrote:

> On 6/8/20 1:40 PM, SeongJae Park wrote:
> > From: SeongJae Park <sjpark@xxxxxxxxx>
> >
> > This commit implements DAMON's basic access check and region based
> > sampling mechanisms.  This change alone might seem to make no sense,
> > mainly because it is only a part of DAMON's logic.  The following two
> > commits will make more sense.
> >
> > [...]
> > +
> > +/*
> > + * Find three regions separated by two biggest unmapped regions
> > + *
> > + * vma		the head vma of the target address space
> > + * regions	an array of three 'struct region's in which results will be
> > + *		saved
> > + *
> > + * This function receives an address space and finds three regions in it
> > + * which are separated by the two biggest unmapped regions in the space.
> > + * Please refer to the below comments of 'damon_init_regions_of()' to know
> > + * why this is necessary.
> > + *
> > + * Returns 0 on success, or a negative error code otherwise.
> > + */
> > +static int damon_three_regions_in_vmas(struct vm_area_struct *vma,
> > +		struct region regions[3])
> > +{
> > +	struct region gap = {0}, first_gap = {0}, second_gap = {0};
> > +	struct vm_area_struct *last_vma = NULL;
> > +	unsigned long start = 0;
> > +
> > +	/* Find two biggest gaps so that first_gap > second_gap > others */
> > +	for (; vma; vma = vma->vm_next) {
>
> Since vm_area_struct already maintains information about the largest gap
> below this vma in the mm_rb rbtree, walking the vmas via mm_rb instead of
> the linked list, and skipping the ones which don't fit the gap requirement
> via vma->rb_subtree_gap, helps avoid the extra comparisons in this
> function.

Thanks for the idea!

> I measured the following implementation to be considerably faster as the
> number of vmas grows for a process damon would attach to:
>
> -static int damon_three_regions_in_vmas(struct vm_area_struct *vma,
> +static int damon_three_regions_in_vmas(struct rb_root *root,
>  		struct region regions[3])
>  {
> +	struct rb_node *nd = NULL;
>  	struct region gap = {0}, first_gap = {0}, second_gap = {0};
> -	struct vm_area_struct *last_vma = NULL;

I like this cleanup.  I wonder how I forgot to use '->vm_prev'. :)

> +	struct vm_area_struct *vma = NULL;
>  	unsigned long start = 0;
>
>  	/* Find two biggest gaps so that first_gap > second_gap > others */
> -	for (; vma; vma = vma->vm_next) {
> -		if (!last_vma) {
> -			start = vma->vm_start;
> -			last_vma = vma;
> +	for (nd = rb_first(root); nd; nd = rb_next(nd)) {
> +		vma = rb_entry(nd, struct vm_area_struct, vm_rb);

This part seems meaningless to me.  It will iterate the vma tree in address
order, same as the old code.  Moreover, 'rb_next()' and 'rb_entry()' might
be slower than the direct dereference of '->vm_next'.

> +
> +		if (vma->rb_subtree_gap < sz_region(&second_gap)) {
> +			/*
> +			 * Skip this vma if the largest gap at this vma is
> +			 * still smaller than what we have encountered so
> +			 * far.
> +			 */
>  			continue;

This skips this node only, so it would make no big difference from the old
code; we still iterate all the nodes.  Rather than that, by the definition
of '->rb_subtree_gap', we could skip all vmas in the subtree rooted at this
vma.  For example:

		while (nd->rb_right)
			nd = nd->rb_right;
		vma = rb_entry(nd, struct vm_area_struct, vm_rb);
		continue;

Because the left part of the subtree was already visited by the in-order
walk, 'rb_next()' would then continue from outside of this subtree.
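For clarity, the whole walk I have in mind would be something like below.
Please note that this is only a rough, untested sketch on top of your
version: the stand-in definitions of 'struct region' and 'sz_region()' are
merely my abbreviation of what this patch defines, and the gap bookkeeping
outside of the skip logic is only my guess at the final form, so please
take the idea rather than the code.

	#include <linux/kernel.h>
	#include <linux/mm.h>
	#include <linux/rbtree.h>

	/* Stand-ins for this patch's definitions; only start/end are used */
	struct region {
		unsigned long start;
		unsigned long end;
	};

	static unsigned long sz_region(struct region *r)
	{
		return r->end - r->start;
	}

	static int damon_three_regions_in_vmas(struct rb_root *root,
			struct region regions[3])
	{
		struct rb_node *nd;
		struct region gap = {0}, first_gap = {0}, second_gap = {0};
		struct vm_area_struct *vma = NULL;
		unsigned long start = 0;

		/* Find two biggest gaps: first_gap > second_gap > others */
		for (nd = rb_first(root); nd; nd = rb_next(nd)) {
			vma = rb_entry(nd, struct vm_area_struct, vm_rb);

			if (vma->rb_subtree_gap < sz_region(&second_gap)) {
				/*
				 * No gap of this vma or of any vma in its
				 * subtree can beat second_gap.  The left
				 * part of the subtree was already visited,
				 * so descend to the rightmost node and let
				 * rb_next() step out of the whole subtree.
				 */
				while (nd->rb_right)
					nd = nd->rb_right;
				/* keep 'vma' pointing at the last vma */
				vma = rb_entry(nd, struct vm_area_struct,
						vm_rb);
				continue;
			}
			if (!vma->vm_prev) {
				/* lowest vma; just remember the start */
				start = vma->vm_start;
				continue;
			}
			gap.start = vma->vm_prev->vm_end;
			gap.end = vma->vm_start;
			if (sz_region(&gap) > sz_region(&second_gap)) {
				swap(gap, second_gap);
				if (sz_region(&second_gap) >
						sz_region(&first_gap))
					swap(second_gap, first_gap);
			}
		}

		if (!sz_region(&first_gap) || !sz_region(&second_gap))
			return -EINVAL;

		/* Sort the two biggest gaps by address */
		if (first_gap.start > second_gap.start)
			swap(first_gap, second_gap);

		/* Store the result */
		regions[0].start = start;
		regions[0].end = first_gap.start;
		regions[1].start = first_gap.end;
		regions[1].end = second_gap.start;
		regions[2].start = second_gap.end;
		regions[2].end = vma->vm_end;

		return 0;
	}

The skip keeps the walk correct because 'rb_subtree_gap' is an upper bound
on every gap in the subtree, so a subtree whose bound already loses to
second_gap cannot contain a better candidate.  And since 'vma' is updated
before each skip, the final 'regions[2].end = vma->vm_end' still sees the
last vma even when the loop ends inside a skipped subtree.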
Nevertheless, this function is not a performance critical point.  In this
patch it is called only once, at the initialization time, and the followup
patches will make it be called once per each 'regions update interval',
which defaults to 1 second.  The followup patches will also allow users to
set the interval large enough, and even to configure their own optimized
version of this function.  For that reason, I care about simplicity rather
than performance here.

That said, your fundamental idea obviously makes sense, and the changes for
it would be small.  I will update this patch in the way modified above and
do some tests.  If I missed something, please let me know.


Thanks,
SeongJae Park