Re: [PATCH 2/3] commit-slab: avoid large realloc

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Apr 13, 2013 at 11:04:48PM -0700, Junio C Hamano wrote:

> Instead of using a single "slab" and keep reallocating it as we find
> that we need to deal with commits with larger values of commit->index,
> make a "slab" an array of many "slab_piece"s. Each access may need
> two levels of indirections, but we only need to reallocate the first
> level array of pointers when we have to grow the table this way.

I don't know if shrinking the size of the realloc is all that big a
deal. We are doubling, so the allocation cost is already amortized
constant time.

Whereas the cost of the extra level of indirection is paid for every
lookup. So for some workloads, I can imagine this actually being slower
(I haven't yet measured it for --topo-order, though).

An interesting side effect of this patch is that the pointers returned
by slab_at() are stable until slab_clear(). It doesn't matter for
--topo-order, but if, for example, you had a recursive function, you
would not have to worry about invalidation in sub-functions. I.e.,
before this patch, this would be wrong:

  static struct commit_slab generation_cache;

  int commit_generation(struct commit *c)
  {
          int *gp = slab_at(&generation_cache, c);

          if (!*gp) {
                  int max = 0;
                  struct commit_list *p;

                  parse_commit(c);
                  for (p = c->parents; p; p = p->next) {
                          int g = commit_generation(p->item);
                          if (max < g)
                                  max = g;
                  }

                  *gp = max + 1;
          }

          return *gp;
  }

because the recursive calls might invalidate the "gp" pointer (you would
have to call slab_at repeatedly instead). Whereas with your patch, it's
fine.

>  struct commit_slab {
> -	int *buf;
> -	int alloc;
> +	int piece_size;
> +	int piece_count;
> +	struct commit_slab_piece **piece;
>  };

Is there a reason to make piece_size a struct member? It must be
constant after any pieces are allocated. Is the intent that callers
could do:

  slab_init(&s);
  /* I know ahead of time we are going to need N of these. */
  s.piece_size = n;

? I think that is OK (though they do not need slab_* in that case at
all), but we should probably have a warning in the struct that
piece_size should never be touched except in that circumstance.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]