Re: [PATCH v6 6/6] convert: add "status=delayed" to filter process protocol

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Lars Schneider <larsxschneider@xxxxxxxxx> writes:

> Maybe this?
>     [...] If Git sends this command, then the
>     filter is expected to return a list of pathnames representing blobs 
>     that have been delayed earlier and are now available. [...]

OK.

>>> +by a "success" status that is also terminated with a flush packet. If
>>> +no blobs for the delayed paths are available, yet, then the filter is
>>> +expected to block the response until at least one blob becomes
>>> +available.
>> 
>> Ahh, this is better, at least you use "the delayed paths".
>> 
>> What if the result never gets available (e.g. resulted in an error)?
>
> As soon as the filter responds with an empty list, Git stops asking.
> All blobs that Git has not received at this point are considered an
> error.
>
> Should I mention that explicitly?

Otherwise I wouldn't have wondered "what if".

>> I am wondering whose responsibility it will be to deal with a path
>> "list-available" reports that are *not* asked by Git or Git got no
>> "delayed" response.  The list subtraction done by the caller is
>> probably the logical place to do so.
>
> Correct. Git (the caller) will notice that not all "delayed" paths
> are listed by the filter and throw an error at the end.

I am wondering about the other case.  Git didn't ask for a path to
be filtered at all, but the filter sneaked in a path that happens to
in the index in its response---Git should at least ignore it, and
better yet, diagnose it as an error in the filter process.

>>> +				filter->string = "";
>>> +				continue;
>>> +			}
>>> +
>>> +			/* In dco->paths we store a list of all delayed paths.
>>> +			   The filter just send us a list of available paths.
>>> +			   Remove them from the list.
>>> +			*/
>>> +			filter_string_list(&dco->paths, 0,
>>> +				&remove_available_paths, &available_paths);
>> 
>> We first remove from the outstanding request list (dco->paths) what
>> are now ready...
>> 
>>> +			for_each_string_list_item(path, &available_paths) {
>> 
>> ...and go over those paths that are now ready.
>> 
>>> +				struct cache_entry* ce = index_file_exists(
>>> +					state->istate, path->string,
>>> +					strlen(path->string), 0);
>>> +				assert(dco->state == CE_RETRY);
>>> +				errs |= (ce ? checkout_entry(ce, state, NULL) : 1);
>>> +			}
>> 
>> But we never checked if the contents of this available_paths list is
>> a subset of the set of paths we originally asked the external
>> process to filter.
>
> Correct.
>
>>  This will allow the process to overwrite any
>> random path that is not even involved in the checkout.
>
> No, not "any random path". Only paths that are part of the checkout.

Isn't it "any path that index_file_exists() returns a CE for".  Did
you filter out elements in available_paths that weren't part of
dco->paths?  I thought the filter-string-list you have are for the
other way around (which is necessary to keep track of work to be
done, but that filtering does not help rejecting rogue responses at
all).

> There are three cases:
>
> (1) available_path is a path that was delayed before (= happy case!)
> (2) available_path is a path that was not delayed before, 
>     but filtered (= no problem, as filtering is a idempotent operation)
> (3) available_path is a path that was neither delayed nor filtered
>     before (= if the filter returns the blob as-is then this would
>     be no problem. otherwise we would indeed have a screwed checkout)
>
> Case 3 might introduce a problem if the filter is buggy.  

> Would you be OK with this check to catch case 3?

I'd be very suspicious about anything you would do only with .nr
field, without filtering the other way around.  After all, we may
have asked it for 3 paths to be filtered, and it may have answered
with its own 3 different paths.

>     dco_path_count = dco->paths.nr;
>     filter_string_list(&dco->paths, 0,
>         &remove_available_paths, &available_paths);
>
>     if (dco_path_count - dco->paths.nr != available_paths.nr) {
>         /* The filter responded with entries that have not
>          * been delay earlier. Do not ask the filter
>          * for available blobs, again, as the filter is
>          * likely buggy. This will generate an error at 
>          * the end as some files are not filtered properly.
>          */
>         filter->string = "";
>         error(_("The external filter '%s' responded with "
>             "available blobs which have not been delayed "
>             "earlier."), filter->string);
>         continue;
>     }
>
>
>>> +		}
>>> +		string_list_remove_empty_items(&dco->filters, 0);
>>> +	}
>>> +	string_list_clear(&dco->filters, 0);
>>> +
>>> +	/* At this point we should not have any delayed paths anymore. */
>>> +	errs |= dco->paths.nr;
>>> +	for_each_string_list_item(path, &dco->paths) {
>>> +		warning(_("%s was not filtered properly."), path->string);
>>> +	}
>>> +	string_list_clear(&dco->paths, 0);
>> 
>> And "list-available" that says "path X is ready" when we never asked
>> for X gets away free without detected as a bug, either.
>
> With the addition above it would!
>
>
> Thanks for the review :-)
>
> - Lars



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux