Re: How DELTA objects values work and are calculated

On Sun, Jan 6, 2019 at 5:32 AM Farhan Khan <khanzf@xxxxxxxxx> wrote:
> Hi Duy,
>
> Thanks for explaining the Delta objects.
>
> What does an OBJ_REF_DELTA object itself consist of?

From pack-format.txt:

     (deltified representation)
     n-byte type and length (3-bit type, (n-1)*7+4-bit length)
     20-byte base object name if OBJ_REF_DELTA or a negative relative
         offset from the delta object's position in the pack if this
         is an OBJ_OFS_DELTA object
     compressed delta data
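
As a rough illustration of that layout (this is not git's own C code),
here is a small Python sketch that decodes the uncompressed part of an
entry. The name parse_entry_header() and its return shape are made up
for this example. It assumes 20-byte SHA-1 object names and does no
bounds checking.

```python
OBJ_OFS_DELTA = 6  # type numbers as listed in pack-format.txt
OBJ_REF_DELTA = 7


def parse_entry_header(buf, pos=0):
    """Decode one pack entry header (hypothetical helper, not git's API).

    Returns (obj_type, size, base, next_pos). `base` is the 20-byte base
    name for OBJ_REF_DELTA, the backward offset for OBJ_OFS_DELTA, and
    None for non-delta objects. `next_pos` is where the zlib data starts.
    """
    byte = buf[pos]
    pos += 1
    obj_type = (byte >> 4) & 0x7       # 3-bit type
    size = byte & 0x0F                 # low 4 bits of the length
    shift = 4
    while byte & 0x80:                 # MSB set: another length byte follows
        byte = buf[pos]
        pos += 1
        size |= (byte & 0x7F) << shift
        shift += 7

    base = None
    if obj_type == OBJ_REF_DELTA:
        base = bytes(buf[pos:pos + 20])  # raw (not hex) base object name
        pos += 20
    elif obj_type == OBJ_OFS_DELTA:
        byte = buf[pos]
        pos += 1
        offset = byte & 0x7F
        while byte & 0x80:
            byte = buf[pos]
            pos += 1
            # Each continuation adds 1 before shifting, per pack-format.txt.
            offset = ((offset + 1) << 7) | (byte & 0x7F)
        base = offset                  # subtract from this entry's position
    return obj_type, size, base, pos
```

For an OBJ_REF_DELTA entry, next_pos points just past the 20-byte base
name, which is where the compressed delta data begins.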


> Do you have to uncompress it to parse its values?

The delta part is compressed, so yes. The type/length header and the
"base object name" are not.

> How do you get its size?

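The length in the header is the size of the uncompressed delta data.
To find where the compressed data ends, uncompress the delta until the
end of its stream. A zlib stream has some sort of "end-of-stream"
marker, so it knows when to stop.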
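
To make the "inflate until the stream ends" idea concrete, here is a
minimal sketch using Python's zlib module rather than git's C code.
The helper name inflate_one() is invented for this example; it relies
on the decompressor reporting the end of the stream and handing back
any bytes that came after it.

```python
import zlib


def inflate_one(buf, pos=0):
    """Inflate the single zlib stream starting at buf[pos].

    Returns (data, next_pos), where next_pos is the offset of the first
    byte after the compressed stream (i.e. the next pack entry).
    """
    d = zlib.decompressobj()
    data = d.decompress(buf[pos:])
    if not d.eof:
        raise ValueError("zlib stream is truncated")
    # Everything after the end-of-stream marker lands in unused_data,
    # so the compressed length is what was given minus what was left.
    consumed = len(buf) - pos - len(d.unused_data)
    return data, pos + consumed
```

The compressed size of an entry is then next_pos minus the offset where
its zlib data started.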

> I read through resolve deltas which leads to threaded_second_pass, where
> you suggested to start, but I do not understand what is happening at a
> high level and get confused while reading the code.
>
> From threaded_second_pass, execution goes into a for-loop that runs
> resolve_base(), which runs find_unresolved_deltas(). Is this
> finding the unresolved deltas of the current object (The current
> OBJ_REF_DELTA we are going through)? This then runs
> find_unresolved_deltas() and shortly afterwards
> find_unresolved_deltas_1(). It seems that find_unresolved_deltas_1() is
> applying deltas, but I am not certain.

Ah, I forgot how "fun" these functions were :) The obvious way to
resolve a delta object is to resolve (recursively) its base object
first, then apply the delta on top, and you are done. However, that
implies recursion and is not really cache friendly. So what
find_unresolved_deltas_1() does is go backward. It starts at an
(already resolved, e.g. non-delta) base object, applies the deltas of
all delta objects that immediately depend on it, then continues with
the delta objects depending on these children... The
find_*_delta_children() functions find these deltas, then
find_unresolved_deltas_1() calls resolve_delta() to do the real
work:

- the delta type (OBJ_REF_.. or OBJ_OFS_...) is already known at this
point. I believe we know it from the first pass
- the delta is uncompressed here, with get_data_from_pack()
- the base object is obtained via get_base_data(), which is recursive,
but since we go backwards from parent to child, base->data should
already be valid and get_base_data() becomes a no-op
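
The parent-to-child order described above can be sketched roughly like
this in Python. This is an illustration of the idea, not a port of
index-pack.c: resolve_all(), apply_delta() and read_varint() are names
invented here, objects are kept in plain dicts, and the copy
instructions are not bounds-checked. The delta encoding follows the
deltified representation described in pack-format.txt.

```python
from collections import defaultdict


def read_varint(buf, pos):
    """Read a little-endian base-128 size from the delta header."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


def apply_delta(base, delta):
    """Rebuild a target object from its base and an uncompressed delta."""
    base_size, pos = read_varint(delta, 0)
    result_size, pos = read_varint(delta, pos)
    if base_size != len(base):
        raise ValueError("delta does not match this base")
    out = bytearray()
    while pos < len(delta):
        op = delta[pos]
        pos += 1
        if op & 0x80:                  # copy a range out of the base
            offset = size = 0
            for bit in range(4):       # optional offset bytes, low first
                if op & (1 << bit):
                    offset |= delta[pos] << (8 * bit)
                    pos += 1
            for bit in range(3):       # optional size bytes, low first
                if op & (0x10 << bit):
                    size |= delta[pos] << (8 * bit)
                    pos += 1
            if size == 0:
                size = 0x10000
            out += base[offset:offset + size]
        elif op:                       # insert the next `op` literal bytes
            out += delta[pos:pos + op]
            pos += op
        else:
            raise ValueError("reserved delta opcode 0")
    if len(out) != result_size:
        raise ValueError("unexpected result size")
    return bytes(out)


def resolve_all(bases, deltas):
    """bases: {name: data} for non-delta objects.
    deltas: iterable of (name, base_name, uncompressed_delta).
    Walks from each resolved base down to its children, so a base is
    always resolved before any delta is applied on top of it."""
    children = defaultdict(list)
    for name, base_name, delta in deltas:
        children[base_name].append((name, delta))
    resolved = dict(bases)
    stack = list(bases)                # explicit work list, no recursion
    while stack:
        parent = stack.pop()
        for name, delta in children.get(parent, ()):
            resolved[name] = apply_delta(resolved[parent], delta)
            stack.append(name)
    return resolved
```

Because every child is pushed only after its parent has been resolved,
the lookup of the parent's data never has to recurse, which is the same
reason get_base_data() ends up doing nothing in the common case.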

> I do not understand what is happening in any of these functions. There
> are some comments on builtin/index-pack.c:883-904
>
> Overall, I do not understand this entire process, what values to capture
> along the way, and how they are consumed. Please provide some guidance
> on how this process works.

An easier way to understand this is to actually run it through a
debugger (in single-threaded mode). Create a small repo with a handful
of deltas. Use "git verify-pack -v" to see which objects are deltas
and where they are... then you have something to double check while
you step through the code.
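
If it helps while stepping through, the verify-pack output can be
turned into a lookup table. The sketch below assumes the line layout
given in the git-verify-pack documentation (name, type, size, size in
pack, offset, and for deltified objects also depth and base name) and
40-character SHA-1 names; parse_verify_pack() is a made-up helper.

```python
def parse_verify_pack(text):
    """Map object name -> (offset, depth, base_name) from the output of
    `git verify-pack -v`. Non-delta objects get depth 0 and base None."""
    objects = {}
    for line in text.splitlines():
        fields = line.split()
        # Object lines have 5 fields (plain) or 7 fields (deltified) and
        # start with a 40-hex name; summary lines are skipped.
        if len(fields) not in (5, 7) or len(fields[0]) != 40:
            continue
        name, offset = fields[0], int(fields[4])
        if len(fields) == 7:
            depth, base = int(fields[5]), fields[6]
        else:
            depth, base = 0, None
        objects[name] = (offset, depth, base)
    return objects
```

Sorting the entries by depth gives the order in which a parent-to-child
walk would be expected to reach them.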

>
> Thank you!
> Farhan
-- 
Duy


