On 26/11/2013 10:13 a.m., Ghassan Gharabli wrote:
> Hi,
>
> I have built a PHP script to cache HTTP 1.X 206 Partial Content like
> "WindowsUpdates" & Allow seeking through Youtube & many websites .
>

Ah. So you have written your own HTTP caching proxy in PHP. Well done.
Did you read RFC 2616 several times? Your script is expected to obey
all the MUST conditions and clauses in there discussing "proxy" or
"cache".

NOTE: the easy way to do this is to upgrade your Squid to the current
series and use ACLs on the range_offset_limit directive. That way Squid
will convert Range requests to normal fetch requests and cache the
object before sending the requested pieces of it back to the client.
http://www.squid-cache.org/Doc/config/range_offset_limit/

> I am willing to move from PHP to C++ hopefully after a while.
>
> The script is almost finished , but I have several question, I have no
> idea if I should always grab the HTTP Response Headers and send them
> back to the browsers.

The response headers you get when receiving the object are metadata
describing that object AND the transaction used to fetch it AND the
network conditions/pathway used to fetch it. The cache's job is to
store those along with the object itself and deliver only the relevant
headers when delivering a HIT.

>
> 1) Does Squid still grab the "HTTP Response Headers", even if the
> object is already in cache or Squid has already a cached copy of the
> HTTP Response header . If Squid caches HTTP Response Headers then how
> do you deal with HTTP CODE 302 if the object is already cached . I am
> asking this question because I have already seen most websites use
> same extensions such as .FLV including Location Header.

Yes. Every proxy on the path is expected to relay the end-to-end
headers, drop the hop-by-hop headers, and MUST update/generate the
feature negotiation and state information headers to match its own
capabilities in each direction.
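
As a rough sketch of the "relay end-to-end, drop hop-by-hop" part, in
Python rather than PHP: the header list is the one from RFC 2616
section 13.5.1, and anything named inside a Connection header is
treated as hop-by-hop too. The function name and the (name, value) pair
layout are made up for illustration; this is not how Squid itself is
written, and it does not cover regenerating Via, Age and friends.

```python
# Hop-by-hop headers listed in RFC 2616 section 13.5.1.
HOP_BY_HOP = frozenset({
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
})


def end_to_end_headers(headers):
    """Return the headers a proxy may store or relay onward.

    `headers` is a list of (name, value) pairs as received from the
    previous hop. The name and shape are hypothetical, for illustration.
    """
    # Headers named as tokens in Connection apply to this hop only.
    listed = set()
    for name, value in headers:
        if name.lower() == "connection":
            for token in value.split(","):
                token = token.strip().lower()
                if token:
                    listed.add(token)

    drop = HOP_BY_HOP | listed
    # Preserve the original order of whatever survives.
    return [(n, v) for n, v in headers if n.lower() not in drop]
```

The cache would store the result next to the object and, on a HIT,
send those stored headers plus whatever it has to generate itself for
the connection to the client.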
>
> 2) Do you also use mime.conf to send the Content-Type to the browser
> in case of FTP/HTTP or only FTP ?

Only FTP and Gopher *if* Squid is translating from the native
FTP/Gopher connection to HTTP. HTTP and protocols relayed using HTTP
message format are expected to supply the correct header.

>
> 3) Does squid compare the length of the local cached copy with the
> remote file if you already have the object file or you use
> refresh_pattern?.

Content-Length is a declaration of how many payload bytes follow the
response headers. It has no relation to the server's object except in
the special case where the entire object is being delivered as payload
without any encoding.

>
> 4) What happens if the user modifies a refresh_pattern to cache an
> object, for example .xml which does not have [Content-Length] header.
> Do you still save it, or would you search for the ignore-headers used
> to force caching the object and what happens if the cached copy
> expires , do you still refresh the copy even if there is no
> Content-Length header?.

refresh_pattern does not cause caching of any objects. What it does is
tell Squid how long an object is valid for before it needs to be
revalidated or replaced. In some situations this can affect the caching
decision; in most it only affects expiry.

Objects without Content-Length are handled differently by HTTP/1.0 and
HTTP/1.1 software.

When either end of the connection is advertising HTTP/1.0 the sending
software is expected to terminate the TCP connection on completion of
the payload block.

When both ends advertise HTTP/1.1 the sending software is expected to
use Transfer-Encoding:chunked in order to keep the connection alive,
unless the client sent Connection:close. Doing the HTTP/1.0 behaviour
is also acceptable if both ends are HTTP/1.1, but causes a performance
loss due to the churn and setup costs of TCP.
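
To make that decision concrete, here is a small Python sketch of how a
sender might pick the framing for one response. The function name, the
tuple form of the versions and the returned labels are all invented for
the example; it only encodes the rules described above.

```python
def choose_framing(sender_version, receiver_version,
                   content_length=None, client_sent_close=False):
    """Pick how the end of the payload is signalled on this connection.

    Versions are (major, minor) tuples as advertised by each end.
    Returns "content-length", "chunked" or "close". Hypothetical helper,
    for illustration only.
    """
    # A known length is the simple case: declare it and keep going.
    if content_length is not None:
        return "content-length"

    both_http11 = sender_version >= (1, 1) and receiver_version >= (1, 1)

    # Unknown length, both ends HTTP/1.1, nobody asked to close:
    # chunked encoding keeps the connection reusable.
    if both_http11 and not client_sent_close:
        return "chunked"

    # Otherwise fall back to the HTTP/1.0 behaviour: the end of the
    # payload is signalled by closing the TCP connection.
    return "close"
```

For example, an .xml reply with no Content-Length going to an HTTP/1.0
client gives choose_framing((1, 1), (1, 0)) == "close", while the same
reply to an HTTP/1.1 client that wants keep-alive gives "chunked".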
>
> I am really confused with this issue , because I am always getting a
> headers list from the internet and I send them back to the browser
> (using PHP and Apache) even if the object is in cache.

I am really confused about what you are describing here. You should
only get a headers list from the upstream server if you have contacted
one.

You say the script is sending to the browser. This is not true at the
HTTP transaction level. The script sends to Apache, Apache sends to
whichever software requested from it.

What is the order you chained the Browser, Apache and Squid?

  Browser -> Squid -> Apache -> Script -> Origin server

or,

  Browser -> Apache -> Script -> Squid -> Origin server

Amos