On Tue, Sep 13, 2022 at 8:03 PM Elijah Conners <business@xxxxxxxxxxxxxx> wrote: > > Han-Wen Nienhuys <hanwen@xxxxxxxxxx> writes: > > it might be a bit slower, but "dangerous"? How so? > In this context, dangerous is the wrong word, but in some cases large objects on the stack can cause stack overflows. In this case, slower is the right word here. I'll let you paint this bikeshed, but do note that the priority queue isn't actually optimal here, in a much bigger way. In the typical case, you'd have 1. large base reftable (created by GC) 2. small updates (created by individual ref updates) When you're iterating, most of the iteration entries will come from the large base reftable, and only occasionally, you have to get entries from the small tables. In this scenario, the current code will insert entries from the large table into the priority queue, have it filter up to the top at cost log(number-of-tables), for each of the entries to be read. With the current online compaction, number-of-tables = log(number-of-refs), so at log(log(number-of-refs)) it's not a huge cost, but certainly larger than the cost of copying the entry once in the function. JGit has an optimization here where it tries to get the next entry from the table that previously provided the minimum entry. I didn't implement it for simplicity's sake, but if you care about performance, you might want to try your hand at that. -- Han-Wen Nienhuys - Google Munich I work 80%. Don't expect answers from me on Fridays. -- Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg Geschäftsführer: Paul Manicle, Liana Sebastian