2012/8/12 Amos Jeffries <squid3@xxxxxxxxxxxxx>:
> On 11/08/2012 10:21 p.m., Jack Bates wrote:
>>
>> On 11/08/12 12:30 AM, Amos Jeffries wrote:
>>>
>>> On 11/08/2012 7:22 p.m., Jack Bates wrote:
>>>>
>>>> I am interested in intercepting content as it is written to the cache,
>>>> and computing a digest from the content. Do you know if this can be
>>>> done in some kind of add on, or would it require a change to the core?
>>>
>>>
>>> What type of digest and to what purpose?
>>
>>
>> I was thinking of using OpenSSL
>> SHA256_Init()/SHA256_Update()/SHA256_Final(). The purpose I have in mind is
>> to detect identical content at different URLs
>>
>> Given a response with a "Location: ..." header and a "Digest: SHA-256=..."
>> header (such as from MirrorBrain), if the URL in the "Location: ..." header
>> is not already cached but the "Digest: SHA-256=..." header matches the
>> content at some other URL that is already cached, then I want to update the
>> "Location: ..." header with the cached URL. I think this should redirect
>> clients to mirrors that are already cached
>
>
> Small problem there. The digest is not calculated/known until the object is
> finished arriving. By then it is too late to attach new headers. And way too
> late to decide whether to ask that source for it.
>

Agreed. Storing multiple different sets of headers for the same content is
really hard. Splitting the headers from the content and storing them
separately may work, with the headers stored as a one-to-many mapping (one
body, many header sets).
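
To make the split-store idea concrete, here is a rough sketch in Python. It is
not Squid code and not a proposal for the actual store format: the class
DedupStore and everything in it are made up for illustration. hashlib stands
in for the OpenSSL SHA256_Init()/SHA256_Update()/SHA256_Final() calls, and the
digest is kept as hex for readability, so a value taken from a real Digest
header would need decoding before it could be compared.

```python
import hashlib


class DedupStore:
    """Toy cache (hypothetical): bodies stored once, headers stored per URL."""

    def __init__(self):
        self.bodies = {}          # hex digest -> body bytes (stored once)
        self.entries = {}         # URL -> (headers dict, hex digest)
        self.urls_by_digest = {}  # hex digest -> list of URLs (one-to-many)

    def store(self, url, headers, chunks):
        # Same init/update/final shape as the OpenSSL SHA256_* calls.
        # The digest is only known after the last chunk has arrived,
        # which is the timing problem Amos points out above.
        h = hashlib.sha256()
        buf = bytearray()
        for chunk in chunks:
            h.update(chunk)
            buf.extend(chunk)
        digest = h.hexdigest()

        # If this URL was cached before with different content, unlink it.
        # Orphaned bodies are not garbage-collected in this sketch.
        old = self.entries.get(url)
        if old is not None and old[1] != digest:
            self.urls_by_digest[old[1]].remove(url)

        self.bodies.setdefault(digest, bytes(buf))
        self.entries[url] = (dict(headers), digest)
        urls = self.urls_by_digest.setdefault(digest, [])
        if url not in urls:
            urls.append(url)
        return digest

    def cached_urls(self, digest):
        # All URLs whose cached body has this digest (possibly empty).
        return list(self.urls_by_digest.get(digest, []))
```

With something like this, two mirrors serving the same bytes end up sharing
one body entry, and cached_urls() on a digest from a later response would say
whether any mirror is already cached. It does nothing about the timing issue:
the lookup only helps for responses that arrive after the object has been
stored in full.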