On Thu, Apr 02, 2020 at 08:40:58PM +0200, René Scharfe wrote: > >> Storing the objects directly in the set and getting rid of new_object() > >> could improve performance due to reduced dereferencing overhead -- or > >> make it slower because more data has to be copied when the hashmap needs > >> to grow. Worth a shot. Later. > > > > Yeah. I tried that, too, but it caused tons of test failures. Quite > > possibly just a bug in my patch, which I did as quickly as possible. But > > I think it performed about the same. Here it is for reference (though > > you may be better off to start from scratch). > > Tried that earlier, ran into failures as well. I think the pointer > returned from insert_object() is stored somewhere and still used after > the next call, which could have invalidated it by a rehash. E.g. > insert_mark() seems to do that. That doesn't surprise me. I didn't look very hard, but mostly just did the minimum to see if it would work (and it didn't). It could perhaps be overcome, and I certainly don't mind if you want to dig further, but I'm happy enough with the hashmap solution. > > Note the "this is OK to cast from oid to object_entry" comment is > > leftover from the earlier attempt, but it probably _isn't_ OK. Even > > though we don't do anything actionable on the non-oid bytes, they do get > > passed by value, which could mean reading random bytes. > > "Value" meaning pointer value when KHASH_INIT is give a pointer type, > as was the case in the patch with that comment (unlike in the patch > below). So it should be OK there. Right, I meant the comment no longer applies in the patch below, because we're not using a pointer type anymore. > > - for (o = blocks; o; o = o->next_pool) > > - for (e = o->next_free; e-- != o->entries;) > > + for (iter = kh_begin(&object_table); iter != kh_end(&object_table); iter++) { > > + if (kh_exist(&object_table, iter)) { > > + e = &kh_key(&object_table, iter); > > if (pack_id == e->pack_id) > > *c++ = &e->idx; > > + } > > + } > > The original code writes the objects in reverse order of their creation > (LIFO), right? Is that relevant? Yeah, agreed this is another weakness of this approach. > But anyway, we need stable object locations anyway if their addresses are > stored in other structs, as mentioned above. Those pointers would have to > be replaced by object_ids and pointer derefs by hashmap lookups. Not a > promising direction. Agreed. -Peff