Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx> writes: > The role of this comment block becomes more important after we shuffle > fields around to shrink this struct. It will be much harder to see what > field is related to what. > > Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx> > --- > pack-objects.h | 44 ++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 44 insertions(+) > > diff --git a/pack-objects.h b/pack-objects.h > index 03f1191659..85345a4af1 100644 > --- a/pack-objects.h > +++ b/pack-objects.h > @@ -1,6 +1,50 @@ > #ifndef PACK_OBJECTS_H > #define PACK_OBJECTS_H > > +/* > + * basic object info > + * ----------------- > + * idx.oid is filled up before delta searching starts. idx.crc32 and > + * is only valid after the object is written out and will be used for "and is"? > + * generating the index. idx.offset will be both gradually set and > + * used in writing phase (base objects get offset first, then deltas > + * refer to them) > + * > + * "size" is the uncompressed object size. Compressed size is not > + * cached (ie. raw data in a pack) but available via revindex. I am having a hard time understanding what "ie. raw data in a pack" is doing in that sentence. It is correct that compressed size is not cached; it does not even exist and the only way to know it is to compute it by reversing the .idx file (or actually uncompressing the compressed stream). Perhaps: Compressed size of the raw data for an object in a pack is not stored anywhere but is computed and made available when reverse .idx is made. > + * "hash" contains a path name hash which is used for sorting the > + * delta list and also during delta searching. Once prepare_pack() > + * returns it's no longer needed. Hmm, that suggests an interesting optimization opportunity ;-) > + * source pack info > + * ---------------- > + * The (in_pack, in_pack_offset, in_pack_header_size) tuple contains > + * the location of the object in the source pack, with or without > + * header. "with or without", meaning...? An object in the source pack may or may not have any in_pack_header, in which case in_pack_header_size is zero, or something? Not suggesting to rephrase (at least not yet), but trying to understand. > + * "type" and "in_pack_type" both describe object type. in_pack_type > + * may contain a delta type, while type is always the canonical type. > + * > + * deltas > + * ------ > + * Delta links (delta, delta_child and delta_sibling) are created > + * reflect that delta graph from the source pack then updated or added > + * during delta searching phase when we find better deltas. Isn't anything missing after "are created"? Perhaps "to"? > + * > + * delta_child and delta_sibling are last needed in > + * compute_write_order(). "delta" and "delta_size" must remain valid > + * at object writing phase in case the delta is not cached. True. I thought child and sibling are only needed during write order computing, so there may be an optimization opportunity there. > + * If a delta is cached in memory and is compressed delta_data points s/compressed delta_data/compressed, delta_data/; > + * to the data and z_delta_size contains the compressed size. If it's > + * uncompressed [1], z_delta_size must be zero. delta_size is always > + * the uncompressed size and must be valid even if the delta is not > + * cached. > + * > + * [1] during try_delta phase we don't bother with compressing because > + * the delta could be quickly replaced with a better one. > + */ > struct object_entry { > struct pack_idx_entry idx; > unsigned long size; /* uncompressed size */