On 3/26/21 4:13 PM, Andrii Nakryiko wrote:
On Wed, Mar 24, 2021 at 11:53 PM Yonghong Song <yhs@xxxxxx> wrote:
Currently, types/tags hash table has fixed HASHTAGS__BITS = 15.
That means the number of buckets will be 1UL << 15 = 32768.
In my experiments, a thin-LTO built vmlinux has roughly 9M entries
in types table and 5.2M entries in tags table. So the number
of buckets is too small for an efficient lookup. This patch
refactors the code to allow the number of buckets to be changed.
In addition, the return value of hashtags__fn(key) is currently
assigned to a uint16_t. Change it to uint32_t, since a later patch
allows the number of hashtag bits to be increased beyond 16.
Signed-off-by: Yonghong Song <yhs@xxxxxx>
---
dwarf_loader.c | 48 +++++++++++++++++++++++++++++++++++++-----------
1 file changed, 37 insertions(+), 11 deletions(-)
diff --git a/dwarf_loader.c b/dwarf_loader.c
index c106919..a02ef23 100644
--- a/dwarf_loader.c
+++ b/dwarf_loader.c
@@ -50,7 +50,12 @@ struct strings *strings;
#define DW_FORM_implicit_const 0x21
#endif
-#define hashtags__fn(key) hash_64(key, HASHTAGS__BITS)
+static uint32_t hashtags__bits = 15;
+
+uint32_t hashtags__fn(Dwarf_Off key)
+{
+ return hash_64(key, hashtags__bits);
I vaguely remember a pahole patch that updated the hash function to
use the same one that libbpf's hashmap uses. Arnaldo, wasn't that
patch accepted?
But more to the point, I think hashtags__fn() should probably preserve
all 64 bits of the hash?
I don't know the context. If the purpose is to avoid future changes
in case hashtags__bits ever exceeds 32, then yes, the change may
make sense.
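To make the width concern concrete, here is a minimal sketch of why a
16-bit bucket index stops working once the bit count goes above 16.
sketch_hash_64() is a hypothetical stand-in multiplicative hash (the
constant is the 64-bit golden-ratio value used by the kernel's hash.h);
it is an assumption for illustration and may not match the hash_64()
that pahole actually ships.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-in for hash_64(): multiply by a large odd constant and keep the
 * top `bits` bits. Assumption for illustration only, not pahole's code.
 */
static uint32_t sketch_hash_64(uint64_t key, uint32_t bits)
{
	assert(bits >= 1 && bits <= 32);
	return (uint32_t)((key * 0x61C8864680B583EBull) >> (64 - bits));
}

/* The same bucket index, stored in the old 16-bit type. */
static uint16_t sketch_truncated_bucket(uint64_t key, uint32_t bits)
{
	return (uint16_t)sketch_hash_64(key, bits);
}
```

With 15 bits both functions agree for every key. With 20 bits, key 1
hashes to 0x61C88, and the 16-bit version silently turns that into
0x1C88, so only the first 65536 buckets would ever be used.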
+}
bool no_bitfield_type_recode = true;
@@ -102,9 +107,6 @@ static void dwarf_tag__set_spec(struct dwarf_tag *dtag, dwarf_off_ref spec)
*(dwarf_off_ref *)(dtag + 1) = spec;
}
-#define HASHTAGS__BITS 15
-#define HASHTAGS__SIZE (1UL << HASHTAGS__BITS)
-
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free
@@ -118,22 +120,41 @@ static void *obstack_zalloc(struct obstack *obstack, size_t size)
}
struct dwarf_cu {
- struct hlist_head hash_tags[HASHTAGS__SIZE];
- struct hlist_head hash_types[HASHTAGS__SIZE];
+ struct hlist_head *hash_tags;
+ struct hlist_head *hash_types;
struct obstack obstack;
struct cu *cu;
struct dwarf_cu *type_unit;
};
-static void dwarf_cu__init(struct dwarf_cu *dcu)
+static int dwarf_cu__init(struct dwarf_cu *dcu)
{
+ uint64_t hashtags_size = 1UL << hashtags__bits;
I wish pahole could just use libbpf's dynamically resized hashmap,
instead of hard-coding maximum size like this :(
Arnaldo, libbpf is not going to expose its hashmap as public API, but
if you'd like to use it, feel free to just copy/paste the code. It
hasn't changed for a while and is unlikely to change (unless some day
we decide to make a more efficient open-addressing implementation).
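For reference, the idea of a dynamically resized table can be sketched
roughly as below. This is not libbpf's hashmap and not pahole code:
every sketch_* name is hypothetical, the hash is a stand-in
multiplicative hash, and the growth policy (double when the entry count
reaches the bucket count) is just one plausible choice.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* All names here are hypothetical; this is an illustration only. */
struct sketch_entry {
	uint64_t key;
	void *value;
	struct sketch_entry *next;
};

struct sketch_map {
	struct sketch_entry **buckets;
	uint32_t bits;		/* number of buckets is 1 << bits */
	size_t nr_entries;
};

static size_t sketch_bucket(uint64_t key, uint32_t bits)
{
	/* Stand-in hash: keep the top `bits` bits of the product. */
	return (size_t)((key * 0x61C8864680B583EBull) >> (64 - bits));
}

static int sketch_map__init(struct sketch_map *map, uint32_t bits)
{
	assert(bits >= 1 && bits <= 31);
	map->buckets = calloc((size_t)1 << bits, sizeof(*map->buckets));
	if (!map->buckets)
		return -1;
	map->bits = bits;
	map->nr_entries = 0;
	return 0;
}

/* Double the bucket count and relink every entry into its new chain. */
static int sketch_map__grow(struct sketch_map *map)
{
	size_t i, old_size = (size_t)1 << map->bits;
	uint32_t new_bits = map->bits + 1;
	struct sketch_entry **new_buckets;

	if (map->bits >= 31)	/* cap reached: accept longer chains */
		return 0;
	new_buckets = calloc((size_t)1 << new_bits, sizeof(*new_buckets));
	if (!new_buckets)
		return -1;
	for (i = 0; i < old_size; i++) {
		struct sketch_entry *e = map->buckets[i], *next;

		for (; e; e = next) {
			size_t b = sketch_bucket(e->key, new_bits);

			next = e->next;
			e->next = new_buckets[b];
			new_buckets[b] = e;
		}
	}
	free(map->buckets);
	map->buckets = new_buckets;
	map->bits = new_bits;
	return 0;
}

static int sketch_map__add(struct sketch_map *map, uint64_t key, void *value)
{
	struct sketch_entry *e;
	size_t b;

	/* Grow before the average chain length exceeds one entry. */
	if (map->nr_entries >= ((size_t)1 << map->bits) &&
	    sketch_map__grow(map))
		return -1;
	e = malloc(sizeof(*e));
	if (!e)
		return -1;
	b = sketch_bucket(key, map->bits);
	e->key = key;
	e->value = value;
	e->next = map->buckets[b];
	map->buckets[b] = e;
	map->nr_entries++;
	return 0;
}

static void *sketch_map__find(const struct sketch_map *map, uint64_t key)
{
	struct sketch_entry *e = map->buckets[sketch_bucket(key, map->bits)];

	for (; e; e = e->next)
		if (e->key == key)
			return e->value;
	return NULL;
}
```

Starting from a small table and letting it double keeps chains short
without having to guess the entry count up front. The cost is the
occasional full relink, plus one allocation per entry in this
simplified version.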
+ dcu->hash_tags = malloc(sizeof(struct hlist_head) * hashtags_size);
+ if (!dcu->hash_tags)
+ return -ENOMEM;
+
+ dcu->hash_types = malloc(sizeof(struct hlist_head) * hashtags_size);
+ if (!dcu->hash_types) {
+ free(dcu->hash_tags);
+ return -ENOMEM;
+ }
+
[...]