Re: Automatic StoreID ?

Nikolai Gorchilov <niki@xxxxxxxx> · Tue, 11 Mar 2014 21:18:16 +0200

On Tue, Mar 11, 2014 at 6:10 PM, Alex Rousskov
<rousskov@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
> On 03/11/2014 08:05 AM, Omid Kosari wrote:
>
>> Is it possible for Squid to automatically find every similar object based on
>> something like the md5 of the objects and serve them to clients without
>> needing a custom DB?
>
> No, because clients do not tell Squid what checksum they are looking
> for. They only give Squid a URL of the object (essentially). Thus, to
> satisfy request for URL A with an already cached response to request B,
> Squid needs to map URL A to URL B, and that is what Store ID does.
> Response B checksum is irrelevant until you do the URL mapping.
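
To make the URL mapping concrete, here is a minimal sketch of a Store ID
helper. It assumes the helper runs without concurrency (so there is no
channel ID in front of each line), and that the reply format is
"OK store-id=..." or "ERR". The dlN.example.com mirror pattern and the
.squid.internal canonical name are made up for illustration, not taken
from any real deployment.

```python
import re
import sys

# Hypothetical rule: dl1.example.com, dl2.example.com, ... serve identical files.
MIRROR_RE = re.compile(r"^http://dl\d+\.example\.com/(?P<path>.+)$")


def store_id_for(url):
    """Return a canonical Store ID for url, or None if no rule matches."""
    match = MIRROR_RE.match(url)
    if match is None:
        return None
    # All mirrors collapse onto one made-up internal key.
    return "http://dl.example.com.squid.internal/" + match.group("path")


def reply_for(line):
    """Build the helper reply for one request line (the URL is the first token)."""
    tokens = line.split()
    if not tokens:
        return "ERR"
    store_id = store_id_for(tokens[0])
    return "ERR" if store_id is None else "OK store-id=" + store_id


def main():
    """Answer one request per input line, flushing so Squid is not kept waiting."""
    for line in sys.stdin:
        sys.stdout.write(reply_for(line) + "\n")
        sys.stdout.flush()
```

To run this as a helper you would call main() at the end of the script.
Note that the mapping rules are still written by hand, which is exactly
the manual database the original question wants to avoid.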
>
>
>> I know it is a complicated task, but I think the Utopia of a cache should be
>> that we have just one instance of an object in the whole Squid farm
>> (automatically) and serve it under different URLs.
>
> It is possible to avoid caching duplicate content, but that only allows
> you to handle cache hits more efficiently. It does not help with cache
> misses (when the URL requested by the client has not been seen before).
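
A toy sketch of that distinction, assuming a cache that stores each body
once under its content digest and keeps a separate URL index. The
DedupCache class and the example URLs are hypothetical, and SHA-256
stands in for the md5 mentioned earlier; this is not how Squid stores
objects.

```python
import hashlib


class DedupCache:
    """Toy cache: each body is stored once per digest; URLs point to digests."""

    def __init__(self):
        self.url_to_digest = {}  # URL -> content digest
        self.bodies = {}         # content digest -> body bytes

    def store(self, url, body):
        """Record a fetched response; identical bodies share one stored copy."""
        digest = hashlib.sha256(body).hexdigest()
        self.bodies.setdefault(digest, body)
        self.url_to_digest[url] = digest

    def lookup(self, url):
        """Return the cached body for url, or None if the URL was never seen."""
        digest = self.url_to_digest.get(url)
        return None if digest is None else self.bodies[digest]


cache = DedupCache()
cache.store("http://a.example.com/video", b"same bytes")
cache.store("http://b.example.com/video", b"same bytes")
```

Both URLs now hit and share one stored copy, but a lookup for a third
URL carrying the same bytes still misses, because the digest is only
known after the body has been fetched.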

Actually, two commercial vendors - PeerApp and ThunderCache - claim that
their products don't use URLs to identify objects, so they don't have
to maintain a StoreID-like de-duplication database manually.

Any idea how they do it?

> If content publishers start publishing content checksums and browsers
> automatically add those checksums to requests, then you would have the
> Utopia you dream about :-). This will not happen while content
> publishers benefit from getting client requests more than they suffer
> from serving those requests.

Unfortunately....

Niki