Re: Re: ICP and HTCP and StoreID

Nikolai Gorchilov <niki@xxxxxxxx> · Fri, 14 Feb 2014 03:20:41 +0200

On Fri, Feb 14, 2014 at 2:04 AM, Amos Jeffries <squid3@xxxxxxxxxxxxx> wrote:
> On 2014-02-14 09:04, Alex Rousskov wrote:
>>
>> On 02/13/2014 05:11 AM, Nikolai Gorchilov wrote:
>>
>>> I'd suggest first to review all possible StoreID use cases involving
>>> cache peers before proceeding further.
>>>
>>> Let's define A as the originating proxy and B as the next-hop proxy in
>>> the request forwarding chain. UDP is an alias for either an ICP or an
>>> HTCP query, while TCP is a synonym for the HTTP request that follows.
>>>
>>> Here are all valid usage scenarios I could think of:
>>> 1. A & B use the same StoreID rewriting logic
>>> - No StoreID processing for incoming UDP on B is necessary
>>> - UDP request uses StoreID
>>> - TCP request uses URL
>>> 2. A & B use different StoreID rewriting logic
>>> - StoreID processing on incoming UDP on B
>>> - UDP request uses URL
>>> - TCP request uses URL
>>> 3. A with StoreID enabled, B - disabled
>>> - UDP request uses URL
>>> - TCP request uses URL
>>> 4. A with StoreID disabled, B - enabled
>>> - StoreID processing on incoming UDP on B
>>> - UDP request uses URL
>>> - TCP request uses URL
>>>
>>> In order to support all of the above we need the following two config
>>> options:
>>> - configuration switch to enable or disable StoreID processing on
>>> incoming UDP
>>> - cache_peer option to enable/disable querying the respective peer
>>> using StoreID instead of URL
>>
>>
>>
>>> If you see any rifts in the above logic, please say.
>>
>>
>> I question the value of supporting the implied "no StoreID processing"
>> optimization above. AFAICT, if Squid always uses URLs for anything
>> outside internal storage, everything would work correctly and all use
>> cases will be supported well, without any additional options.
>>
>> If somebody wants to extend ICP/HTCP to include StoreId in the request
>> (as an optional additional field), they may do so, but that optional
>> optimization does not change the overall design principle: StoreId for
>> the internal storage; URL for everything else.
>
>
> Exactly.
>
>
> Keeping two distinct cache_peer internal index representations in-sync with
> regards to how some helper service is producing the IDs is not as trivial a
> job as implied by the proposal.
>  Consider the process of upgrading either Squid or the helper on server A
> simply *10 seconds* earlier than server B. For that period one of the
> services may be pushing garbage cache IDs into the other. In that same time
> the latest Squid could process several thousand requests - not exactly a
> trivial amount of cache churn.

UDP requests don't push anything. They just check whether the peer has
an object. If a wrong (out-of-sync) cache ID is used, it's not a big
deal: a UDP_MISS response will be generated, and the originating peer
will decide what to do next.
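
To make that concrete, here is a toy sketch in Python. It is not Squid's
code: the Peer class, both rewriter functions and the example hostnames
are hypothetical, and the only assumption is that a UDP query is a
read-only lookup in the peer's index.

```python
# Toy model of a peer answering an ICP/HTCP-style query.
# All names are hypothetical; this is not Squid's implementation.

def store_id_v1(url):
    """Rewriter in use on proxy B: collapse CDN mirrors into one key."""
    host, _, path = url.partition("://")[2].partition("/")
    if host.endswith(".cdn.example.com"):
        return "http://cdn.example.internal/" + path
    return url

def store_id_v2(url):
    """Out-of-sync rewriter on proxy A: same idea, different key format."""
    host, _, path = url.partition("://")[2].partition("/")
    if host.endswith(".cdn.example.com"):
        return "http://cdn-v2.example.internal/" + path
    return url

class Peer:
    def __init__(self, rewriter):
        self.rewriter = rewriter
        self.index = {}  # cache key -> object body

    def store(self, url, body):
        # Objects enter the cache only via HTTP traffic, keyed by StoreID.
        self.index[self.rewriter(url)] = body

    def answer_query(self, key):
        # A UDP query is a read-only lookup; it never modifies the index.
        return "UDP_HIT" if key in self.index else "UDP_MISS"

b = Peer(store_id_v1)
b.store("http://mirror1.cdn.example.com/file.bin", b"data")

url = "http://mirror2.cdn.example.com/file.bin"
in_sync = b.answer_query(store_id_v1(url))      # matching key format
out_of_sync = b.answer_query(store_id_v2(url))  # mismatched key format
```

With the in-sync key the lookup hits; with the out-of-sync key it simply
misses, and B's index is left untouched either way.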

> Also, the connection between those peers is not necessarily a direct 1-hop
> connection. It may involve any kind of HTTP interception software
> (firewalls, deep packet inspectors, etc) overlooked by even the most well
> intended administrator.

We're talking ICP/HTCP here. The HTTP request shall always go with the URL...

I really don't understand your logic. Both you and Alex seem to be OK
with the fact that Squid is using StoreID during HTTP with cache peers
(let's call it a "known limitation"), but using StoreID for ICP/HTCP
queries is considered a bug that needs a fix.

For me it's quite the opposite: StoreID over HTTP shall be fixed
ASAP, while StoreID over ICP/HTCP shall be considered a "known limitation".
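
For reference, the four scenarios quoted at the top and the two proposed
options can be sketched roughly as below. This is only an illustration:
the function names, the two boolean parameters and the example rewriter
are made up here and do not correspond to existing Squid directives.

```python
import re

def strip_mirror(url):
    """Hypothetical StoreID rewriter: map every mirrorN host to one key."""
    return re.sub(r"^http://mirror\d+\.example\.com/",
                  "http://mirrors.example.internal/", url)

def udp_query_key(url, rewriter, peer_uses_store_id):
    """A's side: what goes into the ICP/HTCP query for a given peer.

    peer_uses_store_id stands in for the proposed per-cache_peer option.
    rewriter is None when StoreID is disabled on A.
    """
    if peer_uses_store_id and rewriter is not None:
        return rewriter(url)
    return url

def incoming_lookup_key(key, rewriter, rewrite_incoming_udp):
    """B's side: which key an incoming UDP query is looked up under.

    rewrite_incoming_udp stands in for the proposed global switch.
    rewriter is None when StoreID is disabled on B.
    """
    if rewrite_incoming_udp and rewriter is not None:
        return rewriter(key)
    return key

url = "http://mirror7.example.com/a.iso"

# Scenario 1: same logic on A and B; A sends StoreID, B does no processing.
s1 = incoming_lookup_key(udp_query_key(url, strip_mirror, True),
                         strip_mirror, False)

# Scenarios 2 and 4: A sends the URL, B rewrites the incoming query itself.
s2 = incoming_lookup_key(udp_query_key(url, strip_mirror, False),
                         strip_mirror, True)
s4 = incoming_lookup_key(udp_query_key(url, None, False),
                         strip_mirror, True)

# Scenario 3: B has StoreID disabled; the URL is used end to end.
s3 = incoming_lookup_key(udp_query_key(url, strip_mirror, False),
                         None, False)
```

In every scenario the TCP (HTTP) request would still carry the original
URL; only the UDP query key and B's lookup key differ.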

Best,
Niki