Re: [PATCH v2 05/10] split-index.c: dump "link" extension as json

On 6/27/2019 6:48 AM, Duy Nguyen wrote:
On Tue, Jun 25, 2019 at 7:40 PM Derrick Stolee <stolee@xxxxxxxxx> wrote:

On 6/25/2019 6:29 AM, Duy Nguyen wrote:
On Tue, Jun 25, 2019 at 3:06 AM Jeff Hostetler <git@xxxxxxxxxxxxxxxxx> wrote:
I'm curious how big these EWAHs will be in practice and
how useful an array of integers will be (especially as the
pretty format will be one integer per line).  Perhaps it
would be helpful to have an extended example in one of the
tests.

It's one integer per updated entry. So if you have a giant index and
updated every single one of them, the EWAH bitmap contains that many
integers.

If it was easy to just merge these bitmaps back to the entry (e.g. in
this example, add "replaced": true to entry zero) I would have done
it. But we dump as we stream and it's already too late to do it.

Would it be better to have the caller of ewah_each_bit()
build a hex or bit string in a strbuf and then write it
as a single string?

I don't think the current EWAH representation is easy to read in the
first place. You'll probably have to run it through a script that
updates the main entries part to get a much better view, but that's
pretty quick. If it's for scripts, then it's probably best to keep it
as an array of integers, not a string. Less post-processing.
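
A minimal sketch of the kind of post-processing script mentioned above, assuming the dump is one JSON object with an "entries" array and a bitmap array of entry positions. The key names "entries" and "replace_bitmap", the sample data, and the helper mark_entries() are hypothetical and only illustrate the idea; they are not taken from the patch.

```python
import json


def mark_entries(dump, bitmap_key, flag):
    """Set dump["entries"][pos][flag] = True for every position in the bitmap array."""
    for pos in dump.get(bitmap_key, []):
        dump["entries"][pos][flag] = True
    return dump


# Hypothetical dump: three entries, with entries 0 and 2 marked in the bitmap.
sample = json.loads(
    '{"entries": [{"name": "a"}, {"name": "b"}, {"name": "c"}],'
    ' "replace_bitmap": [0, 2]}'
)
marked = mark_entries(sample, "replace_bitmap", "replaced")
print(json.dumps(marked["entries"], indent=2))
```

Because the array already holds positions, the script indexes straight into the entries without decoding anything first.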

I don't think the intent is to dump the EWAH directly, but instead to
dump a string of the uncompressed bitmap. Something like:

         "delete_bitmap" : "011011011101"

instead of

         "delete_bitmap" : [ 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1 ]

I get this part. But the numbers in the array are the positions of the
set bits. The array is not the bitmap itself.

The same bitmap would be currently displayed as

  "delete_bitmap": [ 1, 2, 4, 5, 7, 8, 9, 11 ]

And that maps back to entry[1], entry[2], entry[4]... being deleted
from the base index. So displaying it as a real bitmap actually adds
more work for both the reader and the tool, because you have to
calculate the position either way. And it gets harder if the bit
you're interested in is on the far right.
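
To illustrate the two representations being compared, here is a small sketch that converts between a list of set-bit positions and an uncompressed bit string. The helper names and the bitmap size of 12 are assumptions for the example only; git itself walks the bitmap in C with ewah_each_bit().

```python
def bitmap_to_positions(bits):
    """Return the indices of the '1' characters in a bit string."""
    return [i for i, bit in enumerate(bits) if bit == "1"]


def positions_to_bitmap(positions, size):
    """Build a bit string of length `size` with '1' at each given position."""
    bits = ["0"] * size
    for pos in positions:
        bits[pos] = "1"
    return "".join(bits)


positions = [1, 2, 4, 5, 7, 8, 9, 11]
bitmap = positions_to_bitmap(positions, 12)
print(bitmap)                       # 011011011101
print(bitmap_to_positions(bitmap))  # [1, 2, 4, 5, 7, 8, 9, 11]
```

With the positions form, a reader can look up entry[11] directly, while the string form requires counting characters from the left.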


Thanks for the clarification.  That helps.

Jeff


