> Actually, two commercial vendors - PeerApp and ThunderCache - claim their products don't use URLs to identify the objects, thus they don't have to maintain a StoreID-like de-duplication database manually. Any ideas how they do it? <

Instead of first mapping the URL through a memory-resident table that keeps pointers (file id, bucket number) to the real location of the object on disk, a hash value derived from the URL could directly designate the storage location on disk, avoiding the translation table Squid uses. This is the principle of every hashed table in a fast database system.

The drawback is that you have to deal with "collisions" and "overflows" on disk: hashes for different URLs point to the same storage location. Different solutions to this problem are available, though (chaining, sequential storage, a secondary storage area, etc.). And you have to manage the variable-sized "buckets", the storage locations the hashing points to.

The positive consequence: no rebuild of the in-memory table is necessary, as there is none. This avoids the time-consuming rebuild of the rock storage table from disk.

I can imagine that, for historical reasons (it is much simpler to implement), Squid uses the translation table instead of direct hashing, whereas ThunderCache etc. can rely on some low-level DB system that has direct hashing "ready to be used".

--
View this message in context: http://squid-web-proxy-cache.1019090.n4.nabble.com/Automatic-StoreID-tp4665140p4665198.html
Sent from the Squid - Users mailing list archive at Nabble.com.
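P.S. A toy sketch of the direct-hashing idea with chaining for collisions. This is my own illustration, not how PeerApp, ThunderCache, or Squid actually lay out storage; the bucket count, the `bucket_for` function, and the `DirectHashStore` class are all invented for the example:

```python
import hashlib

NUM_BUCKETS = 8  # tiny, for illustration; a real cache would use far more slots


def bucket_for(url: str) -> int:
    """Map a URL straight to a bucket number via a hash - no translation table."""
    digest = hashlib.sha1(url.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS


class DirectHashStore:
    """Toy model of hashed on-disk storage, collisions resolved by chaining."""

    def __init__(self):
        # each bucket holds a chain of (url, object) pairs;
        # URLs that hash to the same bucket share a chain
        self.buckets = [[] for _ in range(NUM_BUCKETS)]

    def put(self, url: str, obj: bytes) -> None:
        chain = self.buckets[bucket_for(url)]
        for i, (u, _) in enumerate(chain):
            if u == url:              # same URL: overwrite in place
                chain[i] = (url, obj)
                return
        chain.append((url, obj))      # collision: append to the chain

    def get(self, url: str):
        for u, obj in self.buckets[bucket_for(url)]:
            if u == url:
                return obj
        return None                   # cache miss


store = DirectHashStore()
store.put("http://example.com/a", b"payload-A")
store.put("http://example.com/b", b"payload-B")
print(store.get("http://example.com/a"))
```

Note that the lookup needs no in-memory index to rebuild after a restart: recomputing the hash is enough to find the bucket again, which is exactly the advantage claimed above.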