Re: Squid: End-of-life for each release.

Eliezer Croitoru <eliezer@xxxxxxxxxxxx> · Fri, 06 Dec 2013 02:54:14 +0200

Hey Jakub,

About the advisory stuff you were talking about:
A proxy server will always sit on the fragile edge between being
vulnerable and working without any flaws.
It's like a car that "touches" the ground through its "wheels": this is
simply how it is, kind of basic, not too smart physics.

If you have a system which is "well-behaved", or clients that understand
what they are doing, it will be simpler to avoid any critical issues with
the proxy regardless of any advisory.

It's the same thing for the applications on both sides of the proxy.
This is what I also pointed out in bug #7.
Summary of the above:
Changing response headers on a big enough server is probably a very bad
choice as a daily task.
A change of this magnitude in the server's operation *mode* should be
planned, to make sure what the change will be and what effect it will
have on any of the related systems.
Once you can estimate the effect on the main parts of the path (while
this is not an exact science), the next step can be tested.
Squid by default does not support objects being "changed", since a cache
is supposed to replace a response when needed.
For now the average cached object is 32KB, which can be revalidated more
frequently at various points on the path of the request.
If we do intend to build a cache with support for more than one year
(365 days) of maximum storage time, I assume that squid will need more
redesign than people can even imagine.
For now the system design should consider the issues while trying to 
find a way that will allow a real world scenario of "switching headers 
structure" inside the file.

I can think of a way of utilizing the Rock cache_dir structure to "store"
objects in a way that separates the actual content of the response
from all the other parts of the request information.
For example: "GET /file.tar HTTP/1.1\nHost: large.db.cern.gov\n\n" is
never stored in this form, since it is stored in a more compact way
that does allow revalidation.
So we have the first TLV part of the UFS store file\object, which is then
dumped (in UFS) into one file in a binary form with a strict structure.
If, for example, the Rock cache_dir used (don't assume yet that it is
the right thing to do) a set of data sets connected by a reference from
each part to the next one, i.e. request Store-ID + timestamps + a couple
of others + the response object itself, then it would be enough to write
the "top" structure in another part of the DB and have it reference
either a new set of response structures (the TLVs of the request) or the
identifier of another data set which contains the object. That would be
simple enough, without the overhead of playing with any of the more
complex parts of the code inside squid.
Adding this can have a very bad side effect: it can burn through a lot
of hardware before that hardware is fully utilized, which I do not like.
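To make the "top structure referencing the object" idea a bit more concrete, here is a minimal in-memory sketch. It is not Squid's Rock format or any real Squid API: `Record`, `ChainedStore`, `put`, `replace_meta` and `get` are hypothetical names, and the metadata is treated as an opaque byte string. It only shows that new headers can be written as a new "top" record that points at the existing body record, so the body is never rewritten.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Record:
    """One slot in a hypothetical record store (not the real Rock layout)."""
    payload: bytes
    next_id: Optional[int] = None  # reference to the next part of the chain


@dataclass
class ChainedStore:
    records: Dict[int, Record] = field(default_factory=dict)
    index: Dict[str, int] = field(default_factory=dict)  # Store-ID -> "top" slot
    _next_slot: int = 0
    bytes_written: int = 0  # counts payload bytes "written to disk"

    def _write(self, payload: bytes, next_id: Optional[int] = None) -> int:
        slot = self._next_slot
        self._next_slot += 1
        self.records[slot] = Record(payload, next_id)
        self.bytes_written += len(payload)
        return slot

    def put(self, store_id: str, meta: bytes, body: bytes) -> None:
        """Store the body once, then a "top" record that references it."""
        body_slot = self._write(body)
        self.index[store_id] = self._write(meta, next_id=body_slot)

    def replace_meta(self, store_id: str, new_meta: bytes) -> None:
        """Write a new "top" record pointing at the same, untouched body."""
        old_top = self.records[self.index[store_id]]
        self.index[store_id] = self._write(new_meta, next_id=old_top.next_id)

    def get(self, store_id: str) -> Tuple[bytes, bytes]:
        top = self.records[self.index[store_id]]
        return top.payload, self.records[top.next_id].payload
```

After a `put()` with a 1000 byte body, a `replace_meta()` call adds only the length of the new metadata to `bytes_written`. The old "top" record is simply left behind in this sketch; reclaiming it is one of the parts a real design would have to solve.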

A change in headers like this should be done once and only once in a
very long time by design, and should not by default (for now) be
supported by squid due to the complexity of the issue.

It still doesn't imply that it cannot be done by the squid project, but
it leaves the option to consider avoiding any future vulnerabilities.

If I looked at the idea at the level of basic\flat FS objects, I would
try to do something like this:
One file in some very deep directory that contains the response *only*,
named after the basic URL.
Another file that contains all the data from the cache proxy's point
of view, holding the critical information on the request\response
while allowing it to be simply overwritten.
Inside the second file (cache data) I will refer to a specific file
outside of it, which can be a file with a "link" structure or just
a simple symlink to the response object.

It is not the smartest idea ever from the FS point of view, but it will
add an option to utilize the kernel FS cache at another level than it is
now.
It will consume lots (did I say lots?) of inodes and other resources at
runtime.

The main issue is that some not too deep research should be done on
whether "removing" a full object and recreating it will cost more
(money, I\O, disks, others which I have not thought about, etc.) than
caching it and changing only the basic one or two files of the object.
An example for a request storage:
file1: /some/far/dir/at/the/end/of/the/world/ID_of_cached_resource.res
file2: /some/other/dir/with/lots/of/files/URL.meta
file3: 
/some/other/dir/with/lots/of/files/URL_sym_link_to_ID_of_cached_resource.lnk

Note that the above is a very dirty sketch which has no accuracy of
structure and just demonstrates a couple of ideas.
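The three-file idea can be tried out with a few lines of Python. This is only a rough sketch under assumptions of my own: the function names, the `bodies/` and `meta/` directory names and the JSON metadata format are all hypothetical, the caller is assumed to supply a filesystem-safe key instead of a raw URL, and symlinks are assumed to be available (POSIX). Nothing here reflects how Squid stores objects today.

```python
import json
import os
import tempfile
from pathlib import Path


def write_meta(root: Path, url_key: str, meta: dict) -> None:
    """Overwrite only the small metadata file (file2 in the sketch)."""
    meta_dir = root / "meta"
    meta_dir.mkdir(parents=True, exist_ok=True)
    tmp = meta_dir / f"{url_key}.meta.tmp"
    tmp.write_text(json.dumps(meta))
    os.replace(tmp, meta_dir / f"{url_key}.meta")  # atomic rename on POSIX


def store_object(root: Path, object_id: str, url_key: str,
                 body: bytes, meta: dict) -> None:
    """Write the response body once (file1), its metadata (file2)
    and a symlink from the URL key to the body (file3)."""
    body_dir = root / "bodies"
    body_dir.mkdir(parents=True, exist_ok=True)
    body_path = (body_dir / f"{object_id}.res").resolve()
    body_path.write_bytes(body)
    write_meta(root, url_key, meta)
    link_path = root / "meta" / f"{url_key}.lnk"
    if link_path.is_symlink() or link_path.exists():
        link_path.unlink()
    os.symlink(body_path, link_path)


def load_object(root: Path, url_key: str):
    meta_dir = root / "meta"
    meta = json.loads((meta_dir / f"{url_key}.meta").read_text())
    body = (meta_dir / f"{url_key}.lnk").read_bytes()  # follows the symlink
    return meta, body


def demo():
    """Store an object, change only its metadata, then read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        store_object(root, "abc123", "url-key", b"ISO bytes", {"expires": "old"})
        write_meta(root, "url-key", {"expires": "new"})  # body file untouched
        return load_object(root, "url-key")
```

Calling `demo()` returns the updated metadata together with the original body, and the only file rewritten by the header change is the small `.meta` one. The inode cost mentioned above is visible here too: every cached object takes three directory entries instead of one.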

While for now it might seem weird, for a cache proxy of Linux
distributions with lots of 4GB ISO files (4 per distro or even more)
and a line speed of 100Mbps upstream, it might be worth wasting even
more than a couple of KB+CPU_CYCLES+other(lost it)+INODES just to allow
these files to be cached on a larger machine\system with 10Gbps X 4 ++
upstream and lots of RAM or even SSD.
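A quick back-of-the-envelope calculation for the numbers above, as a sketch only. It assumes decimal units (4GB = 4 * 10^9 bytes, 100Mbps = 10^8 bits per second), ignores protocol overhead, and uses a hypothetical 4KB metadata file; the variable names are mine.

```python
def transfer_seconds(size_bytes: int, link_bits_per_second: float) -> float:
    """Ideal time to move size_bytes over a link, ignoring any overhead."""
    return size_bytes * 8 / link_bits_per_second


ISO_BYTES = 4 * 10**9        # one 4GB ISO, decimal units assumed
UPSTREAM_BPS = 100 * 10**6   # 100Mbps upstream
META_BYTES = 4 * 1024        # assumed size of the rewritten metadata file

# Cost of dropping the object and fetching it again from upstream.
refetch_seconds = transfer_seconds(ISO_BYTES, UPSTREAM_BPS)

# How many times more bytes hit the disk when the whole object is
# recreated instead of only its metadata file being rewritten.
write_ratio = ISO_BYTES / META_BYTES
```

Under these assumptions `refetch_seconds` comes out at 320 seconds per ISO, and recreating the object writes close to a million times more bytes than rewriting the metadata alone.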

So separating the object\resource from the metadata that squid uses can
help reduce the number of "writes" to disk, which deeply affects the
disk and the whole system life cycle.

All the Bests,
Eliezer

On 06/12/13 00:17, Amos Jeffries wrote:

 I support a
distribution of squid2.7 for a few hundred universities & research labs
https://twiki.cern.ch/twiki/bin/view/Frontier/InstallSquid
I have been assuming that if a security problem were ever found with
2.7, there would still be a security advisory, and it would be announced
on this mailing list so I could distribute a patch.  Do you think that
is a reasonable assumption?  I also assume the security advisory would
say the only official thing to do is to upgrade to squid3, which is fine
as long as I could still patch 2.7.

Well I hesitate to say anything on this as it is hard to know. Squid-2.7
is benefiting from both its long history of stable use (low level of
vulnerabilities existing) and from being outdated (nasties have less
incentive to seek vulnerabilities).

I'm not even testing recent vulnerabilities against 2.x myself any more
beyond a quick manual check to see if the 2.x and 3.x code is similar in
the particular area (SQUID-2012:1, SQUID-2011:3).

Most of the recent vulnerabilities have been in code which was changed
between the versions and somebody got something wrong (SQUID-2013:1,
SQUID-2013:3) or uncovered a code path protecting against some hidden
issue elsewhere (SQUID-2013:2). A few are long standing design
vulnerabilities which we finally re-designed squid-3.x in a way that
allowed fixing it - so squid-2 will definitely never be fixed
(SQUID-2011:1, SQUID-2011:2).

Amos