On Tue, Mar 11, 2014 at 6:10 PM, Alex Rousskov
<rousskov@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
> On 03/11/2014 08:05 AM, Omid Kosari wrote:
>
>> Is it possible for Squid to automatically find every similar object based on
>> something like md5 of objects and serve them to clients without need custom
>> DB ?
>
> No, because clients do not tell Squid what checksum they are looking
> for. They only give Squid a URL of the object (essentially). Thus, to
> satisfy request for URL A with an already cached response to request B,
> Squid needs to map URL A to URL B, and that is what Store ID does.
> Response B checksum is irrelevant until you do the URL mapping.
>
>
>> I know it is complicated task but i think the Utopia of a cache should be
>> that we just have one instance of an object in all Squid Farm
>> (automatically) and serve it as different URLs.
>
> It is possible to avoid caching duplicate content, but that allows you
> to handle cache hits more efficiently. It does not help with cache
> misses (when the URL requested by the client has not been seen before).

Actually, two commercial vendors, PeerApp and ThunderCache, claim that
their products don't use URLs to identify objects, so they don't have
to maintain a StoreID-like de-duplication database manually.

Any idea how they do it?

> If content publishers start publishing content checksums and browsers
> automatically add those checksums to requests, then you would have the
> Utopia you dream about :-). This will not happen while content
> publishers benefit from getting client requests more than they suffer
> from serving those requests.

Unfortunately....

Niki
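
To make the hit/miss distinction in the quoted reply concrete, here is a minimal Python sketch. It is only a toy model and not how Squid, PeerApp, or ThunderCache work internally: the class name `DedupCache`, its methods, and the example.com URLs are all hypothetical. It assumes a cache that looks objects up by URL but stores each distinct body once, keyed by its MD5 digest.

```python
import hashlib
from typing import Dict, Optional


class DedupCache:
    """Toy URL cache (hypothetical) that keeps one copy per distinct body."""

    def __init__(self) -> None:
        # URL -> MD5 digest of the body cached for that URL
        self.url_to_digest: Dict[str, str] = {}
        # MD5 digest -> body; identical bodies share a single entry
        self.bodies: Dict[str, bytes] = {}

    def store(self, url: str, body: bytes) -> str:
        """Cache a fetched response; the digest is known only after the fetch."""
        digest = hashlib.md5(body).hexdigest()
        self.bodies.setdefault(digest, body)  # no second copy for duplicates
        self.url_to_digest[url] = digest
        return digest

    def lookup(self, url: str) -> Optional[bytes]:
        """Return the cached body for a URL, or None on a miss."""
        digest = self.url_to_digest.get(url)
        if digest is None:
            # The request carries only a URL, so there is no checksum to match.
            return None
        return self.bodies[digest]


cache = DedupCache()
body = b"same video bytes"

cache.store("http://cdn1.example.com/video.mp4", body)

# Same content, different URL, never requested before: still a miss.
first_miss = cache.lookup("http://cdn2.example.com/video.mp4")

# After fetching from the origin, the duplicate body is not stored again.
cache.store("http://cdn2.example.com/video.mp4", body)
copies = len(cache.bodies)
```

In this model the digest saves storage once both URLs have been fetched, but the first request for the second URL still goes to the origin. Avoiding that fetch requires knowing in advance that the two URLs are equivalent, which is the mapping Store ID provides.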