On Fri, Jul 15, 2022 at 7:52 AM Taylor Blau <me@xxxxxxxxxxxx> wrote: > > By the end of this loop, we have an array `commit_positions` which maps > the ith selected commit to its lexical position among all objects in the > bitmap. IOW, `commit_positions[i] = j` means the `i`th selected commit > can be found at index `j` among all objects in the pack/MIDX in their > lexical order. Right. > > + /* > > + * At the end of this sort table[j] = i means that the i'th > > + * bitmap corresponds to j'th bitmapped commit in lex order of > > + * OIDs. > > + */ > > + QSORT_S(table, writer.selected_nr, table_cmp, commit_positions); > > And then we sort table by treating its values as indexes into > `commit_positions`. Here's where I'm not sure that I follow what's going > on. You say above that `table[j] = i`, where `i` corresponds to the > order of selected commits, and `j` is in lexical order. Correct. > If that's the case, then I'd expect that printing `index[table[j]]` for > increasing `j` would output OIDs in increasing lexical order. But that > doesn't quite seem to be the case. From a debugger session that has a > breakpoint after computing and sorting table, along with building > `table_inv`: > > (gdb) p oid_to_hex(&index[table[0]]->oid) > $17 = 0x555555983ea0 <hexbuffer> "0006763074748d43b539c1c8e8882c08034ab178" > (gdb) p oid_to_hex(&index[table[1]]->oid) > $18 = 0x555555983ee1 <hexbuffer+65> "001ce83dd43f03dcfc67f29d38922e4a9682aab0" > (gdb) p oid_to_hex(&index[table[2]]->oid) > $19 = 0x555555983f22 <hexbuffer+130> "002db882ece2ab6a240e495a169c6e06422289c8" > (gdb) p oid_to_hex(&index[table[3]]->oid) > $20 = 0x555555983f63 <hexbuffer+195> "0007a5feb040e1ff704f3ad636619ddca3e7382b" > > that doesn't look like the OIDs are increasing in lexical order. > > I'm not quite sure if I'm even looking at the right thing, or if this is > to be expected, or if the comment isn't quite accurate. If you could > help clarify what's going on here, that would be great. I think you're not looking at the right thing. you should look at `writer.selected[table[i]].commit->object.oid` instead. I think the order of `index[]` is not the same as the pack index (or midx). I am saying this because if we use the `pos` variable (that we get from `commit_bitmap_writer_pos(&writer.selected[table[i]].commit->object.oid, index, index_nr)`) in `fprintf(stderr, "commit hex: %s\n", &index[pos]->oid);`, you'll see that `&index[pos]->oid` and `&writer.selected[table[i]].commit->object.oid` are not same. So, If you do - int spos = commit_bitmap_writer_pos(&index[pos]->oid, index, index_nr); you'll see `spos` is not equal to `pos`. > > + /* table_inv helps us discover that relationship (i'th bitmap > > + * to j'th commit by j = table_inv[i]) > > + */ > > + for (i = 0; i < writer.selected_nr; i++) > > + table_inv[table[i]] = i; > > This part makes sense, as does the rest of the implementation. > > Thanks, > Taylor