On Tue, Dec 15, 2020 at 6:29 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Tue, Dec 15, 2020 at 5:55 PM Cong Wang <xiyou.wangcong@xxxxxxxxx> wrote:
> >
> > On Tue, Dec 15, 2020 at 5:45 PM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Tue, Dec 15, 2020 at 1:44 PM Cong Wang <xiyou.wangcong@xxxxxxxxx> wrote:
> > > >
> > > > Hello,
> > > >
> > > > Any reason why we allocate at least max_entries buckets for a hash map?
> > > >
> > > > 466
> > > > 467         /* hash table size must be power of 2 */
> > > > 468         htab->n_buckets = roundup_pow_of_two(htab->map.max_entries);
> > >
> > > Because hashmap performance matters a lot.
> >
> > Unless you believe the hash is perfect, there are always collisions
> > no matter how many buckets we have.
> >
> > Other hashmap implementations also care about performance, but none
> > of them allocates buckets this aggressively. In some cases, for
> > instance max_entries=1025, we end up with almost twice as many
> > buckets as max_entries.
>
> Do you have any numbers to prove that max_entries > 1024 with
> n_buckets == 1024 would still provide the same level of performance?

No, I assume you had such numbers when you added this implementation?

Also, it depends on what performance you are talking about: the lookup
path is lockless, so it has nothing to do with the number of buckets.
The update path does contend for bucket locks, but that is arguably
the slow path.

Thanks.
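
For concreteness, here is a minimal userspace sketch of the bucket-count
rounding under discussion. The roundup_pow_of_two() below is a simplified
stand-in written for this email, not the kernel helper from
include/linux/log2.h (which is built on fls_long()), but it produces the
same results for the inputs shown:

#include <stdio.h>

/* Simplified stand-in for the kernel's roundup_pow_of_two():
 * round n up to the next power of 2 (undefined for n == 0 in the
 * kernel; this toy version just returns 1).
 */
static unsigned long roundup_pow_of_two(unsigned long n)
{
        unsigned long p = 1;

        while (p < n)
                p <<= 1;
        return p;
}

int main(void)
{
        unsigned long max_entries[] = { 1024, 1025, 4096, 4097 };
        int i;

        for (i = 0; i < 4; i++) {
                /* Mirrors htab->n_buckets = roundup_pow_of_two(max_entries) */
                unsigned long n_buckets = roundup_pow_of_two(max_entries[i]);

                printf("max_entries=%lu -> n_buckets=%lu\n",
                       max_entries[i], n_buckets);
        }
        return 0;
}

Run, this prints n_buckets=2048 for max_entries=1025, which is the
"almost twice as many buckets" case mentioned above.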
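
And a sketch of why the power-of-2 constraint exists at all: with a
power-of-2 bucket count, selecting a bucket is a single mask rather than
a division. select_bucket() below is a hypothetical illustration for
this email, though it follows the same hash & (n_buckets - 1) idiom that
kernel/bpf/hashtab.c uses in __select_bucket():

#include <stdio.h>
#include <stdint.h>

/* Hypothetical illustration, not the kernel function. Only correct
 * when n_buckets is a power of 2: the mask keeps the low bits of the
 * hash, which is equivalent to hash % n_buckets without the division.
 */
static uint32_t select_bucket(uint32_t hash, uint32_t n_buckets)
{
        return hash & (n_buckets - 1);
}

int main(void)
{
        uint32_t hash = 0xdeadbeef;
        uint32_t n_buckets = 2048;

        /* 0xdeadbeef & 0x7ff == 0x6ef == 1775 */
        printf("hash 0x%x -> bucket %u of %u\n", hash,
               select_bucket(hash, n_buckets), n_buckets);
        return 0;
}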