Re: Generell Squid setup

Amos Jeffries <squid3@xxxxxxxxxxxxx> · Fri, 21 Sep 2012 20:35:00 +1200

On 21/09/2012 4:26 a.m., Farkas H wrote:
Hi Amos,

thank you for your response.
I think I didn't explain the problem correctly. Let me mention that
the http-Post request contains
- one or two URLs,
- name and parameters of one mathematical function from a set of
server functions,
- mime-type of the response.

The client sends a http-Post request to the server

The server ...
(1) fetches the data (xml, up to 100 MB) via http-Get using the
reference which is embedded in the client http-Post request,
(2) computes a complete new dataset (xml, up to 100 MB) using the
fetched data and the requested function and parameters,
(3) converts the dataset to the requested mime-type,
(4) returns the resulting dataset as a response of the http-Post
request to the client.

Aha. So you are *not* transforming a POST into a GET as earlier stated.

You are generating an entirely new resource (at the web service and URLs
POST'ed to), from what happens to be remote data sources contacted via GET.

Because the front end requests are POST there is no caching of them by 
Squid. The web service is free to run its own cache to optimize 
computation responses, but that is outside of HTTP.

The back end requests, being GET, can be freely cached by a Squid right
next to the calculator web service. That removes the latency to the
original sources and de-duplicates repeat requests made by the web service.
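
As a rough illustration of the back end side, here is a minimal Python
sketch of a web service sending its GET requests through a local Squid.
The proxy address (127.0.0.1:3128) and the function names are assumptions
made for the example, not part of the setup described in this thread.

```python
import urllib.request

# Assumption: Squid listens on the same host as the web service, port 3128.
SQUID_PROXY = "http://127.0.0.1:3128"


def build_opener_via_squid(proxy_url: str = SQUID_PROXY) -> urllib.request.OpenerDirector:
    """Return an opener that routes plain http requests through the proxy."""
    proxy_handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(proxy_handler)


def fetch_source(url: str, opener: urllib.request.OpenerDirector) -> bytes:
    """Fetch one source document with GET; Squid may answer it from cache."""
    with opener.open(url, timeout=60) as response:
        return response.read()
```

Whether a given source document is actually served from the cache still
depends on the caching headers the remote servers send.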

In theory the response body from a POST with a Content-Location: header
could be cached under the Content-Location URL for later GET requests to
it, but Squid does not yet support that, and it sounds like in your system
the clients would not re-request using GET anyway. (Sponsorship welcome
if you want to go that way regardless.)

If we would implement a simple server cache the process would be like this:
(- client sends request via http-Post to the server)
- server generates a unique key of the client http-Post request,
because the URL of the http-Post request is always the same,
- server checks if the unique key / filename is in the cache; no:
normal process,
- server checks via http-Head (origin server) if the cached data is
fresh; no: normal process,
- server sends the cached data associated to the unique key / filename
as response of the http-Post request to the client; end,
- normal process: server fetches the data via http-Get; server
computes the result dataset using the fetched data and the requested
function and parameters; server stores data in cache; server sends the
result data as response of the http-Post request to the client; end.
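
The steps above could be sketched roughly as follows in Python. This is
only an illustration under stated assumptions: request_key, handle_post,
current_validator and compute are hypothetical names, the cache is a
plain dict, and the freshness check (the http-Head step) is passed in as
a callable returning some validator such as an ETag or Last-Modified
value.

```python
import hashlib
import json
from typing import Callable, Dict, List, Tuple


def request_key(urls: List[str], function: str, params: dict, mime_type: str) -> str:
    """Derive a stable key from the POST contents (the URL is always the same)."""
    canonical = json.dumps(
        {"urls": urls, "function": function, "params": params, "mime": mime_type},
        sort_keys=True,          # parameter order must not change the key
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def handle_post(
    urls: List[str],
    function: str,
    params: dict,
    mime_type: str,
    store: Dict[str, Tuple[str, bytes]],
    current_validator: Callable[[List[str]], str],
    compute: Callable[[List[str], str, dict, str], bytes],
) -> bytes:
    """Return the cached result if the sources are unchanged, else recompute."""
    key = request_key(urls, function, params, mime_type)
    # Hypothetical freshness check, e.g. a HEAD request to the origin servers.
    validator = current_validator(urls)
    cached = store.get(key)
    if cached is not None and cached[0] == validator:
        return cached[1]         # key known and source data still fresh
    # Normal process: fetch, compute, convert, then remember the result.
    body = compute(urls, function, params, mime_type)
    store[key] = (validator, body)
    return body
```

For simplicity the sketch runs the freshness check on every request; a
real service would probably take the validator from the http-Get response
when it has to fetch the data anyway.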

I wonder if Squid could help us to cache the data associated to the
client http-Post request. My idea is to generate a unique key of the
http-Post request to make it cacheable.

A POST response can be returned with Content-Location and ETag headers
giving it a unique location and identifier. That technically makes the
response body cacheable, but only when it is fetched from the
Content-Location URL with GET requests later. POST requests themselves
still cannot be answered with a HIT, only with a MISS.
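
For illustration only, a small Python sketch of the headers such a POST
response might carry. The /results/ path, the function name and the use
of a SHA-256 digest as the ETag are assumptions for the example; nothing
in this thread prescribes them.

```python
import hashlib
from typing import Dict


def post_response_headers(result_body: bytes, request_key: str, mime_type: str) -> Dict[str, str]:
    """Build headers that give a POST result its own location and identifier."""
    # Assumption: the ETag is a quoted, truncated digest of the result body.
    etag = '"' + hashlib.sha256(result_body).hexdigest()[:32] + '"'
    return {
        "Content-Type": mime_type,
        "Content-Length": str(len(result_body)),
        # Hypothetical URL under which the same result could be fetched with GET.
        "Content-Location": "/results/" + request_key,
        "ETag": etag,
    }
```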

Amos

Thanks so much for your response.
Farkas

On 8 September 2012 04:48, Amos Jeffries <squid3@xxxxxxxxxxxxx> wrote:
On 7/09/2012 9:39 a.m., Farkas H wrote:
Hi Amos,

thanks for your response.
I modified the web application. Now we have the following infrastructure.
client --> http-Post [embedded http-Get] --> Server / web application
--> http-Get --> Squid -> Servers (-> Squid -> Server / web
application -> client)
Advantage: The Server / web application doesn't have to request data
from the remote servers if it's in the Squid cache.

Additionally I want to cache the http-Post requests.
client --> http-Post --> Squid --> Server / web application /
processing the response (-> Squid -> client)

Request bodies are not cacheable in HTTP.

The body content is the state to be changed at the end-server resource
named by the URL. Caching it is meaningless, since next time the state
needs to be changed you CANNOT simply reply from a middleware cache saying
"server state now changed" without passing any of those details to the
server.

Additionally, your translation service is re-writing the POST into GET
requests so the POST ceases to exist at your gateway service. There is
nothing to cache.

The idea: We modify the header of the http-post request to make it unique.
The information whether Squid has a stored response to the modified
request (true or false) should be added to the request / should be
forwarded to the destination server. There are two possibilities.
(1) The modified request is not stored in Squid (new request).
(2) The modified request is stored in Squid. We don't know yet if the
data is still fresh.
The request should be forwarded in both(!) possibilities, (1) and (2),
to the destination server.
Is that possible with Squid?

Of course. That depends on the Cache-Control: headers the server
supplies to Squid with its responses, and applies to each response
independently of anything else.

Send "Cache-Control: must-revalidate" and Squid will query the server for
each request asking if there are updates.
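
A minimal Python sketch of what the server side of that revalidation
could look like. The function name and the single-ETag comparison are
simplifying assumptions; the sketch also adds max-age=0 so that the
stored response counts as stale immediately, which is an addition made
for the example.

```python
from typing import Dict, Optional, Tuple


def respond(
    if_none_match: Optional[str],
    current_etag: str,
    body: bytes,
) -> Tuple[int, Dict[str, str], bytes]:
    """Answer a (possibly conditional) GET forwarded by the cache."""
    headers = {
        # Assumption: max-age=0 makes the response stale at once, so
        # must-revalidate forces a check with the server before reuse.
        "Cache-Control": "max-age=0, must-revalidate",
        "ETag": current_etag,
    }
    # Simplification: exact match against one ETag, no lists or weak tags.
    if if_none_match is not None and if_none_match == current_etag:
        return 304, headers, b""      # cached copy is still good
    return 200, headers, body         # send the full, current response
```

With a 304 reply the cache can keep serving the body it already holds, so
only the headers cross the link between Squid and the server.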

Amos