Re: [PATCH v2] Perform cheaper connectivity check when pack is used as medium

On Sat, Mar 3, 2012 at 1:59 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>>> I also suspect that more than trivial amount of computation is needed to
>>> determine if a given object exists only in a single pack, so the end
>>> result may not be that much cheaper than the current --verify-object code.
>>
>> Objects can exist in multiple packs right now if they are base
>> objects. I'm not sure why you need to check for object existence in a
>> single pack.
>
> What I meant to say was not "it is in this pack and nowhere else", but
> about a check like this:
>
>        static void finish_object(struct object *obj, ...)
>        {
>                struct packed_git *fetched_pack = cb_data->fetched_pack;
>
>                if (obj->type == OBJ_BLOB && !has_sha1_file(obj->sha1))
>                        die("missing");
>                if (!info->revs->verify_objects)
>                        return;
>                if (find_pack_entry_one(obj->sha1, fetched_pack))
>                        return; /* we just fetched and ran index-pack on it */
>                if (!obj->parsed && obj->type != OBJ_COMMIT)
>                        parse_object(obj->sha1);
>        }
>
> I think this is the kind of "passing identity down the callchain" both of
> us have in mind.  I was trying to say that find_pack_entry() may not be
> trivially cheap.  But probably I am being worried too much.

This happens after index-pack has run and the .idx file has been created,
so I think determining an object's storage type should be relatively cheap
compared to rehashing. We'll know when I update the patch.
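
A minimal sketch of why a lookup in an existing .idx is expected to be
cheap, assuming an index that holds the object names sorted and a 256-entry
fanout table keyed on the first byte. This is not git's code: toy_idx,
toy_idx_init() and toy_idx_find() are hypothetical names used only for
illustration. The point is that the lookup is a bounded binary search over
20-byte names and never touches object contents, whereas rehashing has to
read and hash every byte of the object.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ID_LEN 20 /* SHA-1 length in bytes */

/* Hypothetical in-memory view of a pack index (not git's struct). */
struct toy_idx {
	const unsigned char (*ids)[ID_LEN]; /* sorted ascending by memcmp() */
	uint32_t fanout[256]; /* fanout[b] = number of ids with first byte <= b */
};

/* Build the fanout table from an already sorted array of nr ids. */
static void toy_idx_init(struct toy_idx *idx,
			 const unsigned char (*ids)[ID_LEN], uint32_t nr)
{
	uint32_t i = 0;
	int b;

	idx->ids = ids;
	for (b = 0; b < 256; b++) {
		while (i < nr && ids[i][0] <= b)
			i++;
		idx->fanout[b] = i;
	}
}

/* Return the position of id in the index, or -1 if it is not present. */
static long toy_idx_find(const struct toy_idx *idx, const unsigned char *id)
{
	/* The fanout table narrows the search to ids sharing the first byte. */
	uint32_t lo = id[0] ? idx->fanout[id[0] - 1] : 0;
	uint32_t hi = idx->fanout[id[0]];

	while (lo < hi) {
		uint32_t mid = lo + (hi - lo) / 2;
		int cmp = memcmp(idx->ids[mid], id, ID_LEN);

		if (!cmp)
			return (long)mid;
		if (cmp < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return -1;
}
```

The cost is O(log n) comparisons of 20 bytes each, independent of how large
the objects themselves are.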

> But now you brought it up, I think we may also need to worry about a
> corrupt pre-existing loose blob object.  In general, we tend to always
> favor reading objects from packs over loose objects, but I do not know
> offhand what repacking would do when there are two places it can read the
> same object from (it should be allowed to pick whichever is easier to
> read).

... which should be the pack for pack-objects/repack, because they can do
a straight copy from pack to pack. --no-reuse-object delegates object
reading back to read_sha1_file(), which prefers packs too.
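
The "packs first, loose as fallback" ordering described above can be
sketched as follows. This is only an illustration under simplifying
assumptions, not how read_sha1_file() is implemented: toy_store, toy_has()
and toy_read_source() are hypothetical names, and object names are plain
strings held in two small arrays.

```c
#include <stddef.h>
#include <string.h>

enum toy_source { TOY_MISSING, TOY_PACKED, TOY_LOOSE };

/* Hypothetical object store: names available in packs and as loose files. */
struct toy_store {
	const char **packed;
	size_t nr_packed;
	const char **loose;
	size_t nr_loose;
};

/* Linear scan; returns 1 if name is in the list, 0 otherwise. */
static int toy_has(const char **names, size_t nr, const char *name)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(names[i], name))
			return 1;
	return 0;
}

/* Packs are consulted first; a loose copy is used only as a fallback. */
static enum toy_source toy_read_source(const struct toy_store *s,
				       const char *name)
{
	if (toy_has(s->packed, s->nr_packed, name))
		return TOY_PACKED;
	if (toy_has(s->loose, s->nr_loose, name))
		return TOY_LOOSE;
	return TOY_MISSING;
}
```

With this ordering, an object that exists both packed and loose is always
served from the pack, so a corrupt loose duplicate would not be read.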
-- 
Duy