
Re: Squid-2, Squid-3, roadmap


 



Ideally, you'd avoid locking as much as possible; e.g., have a pool of threads for disk access (as now with aufs), a pool for header parsing, a pool for forward requests, and so on. I don't think it's a good idea at all to re-architect squid into a thread-per-connection model or anything; just find the places that are bottlenecks and allow some parallelism, keeping the number of threads low.

(says he, the non-threads programmer. I'm not *that* crazy...)

Redirectors and other helpers are already able to run on other CPUs, so that's a non-issue.

Cheers,


On 07/03/2008, at 3:05 AM, Adrian Chadd wrote:

Well, the way I'd approach it is to first get an idea of how to throw
things into 'threads', and probably draft and craft a basic event loop
and submission queue for "stuff" to happen across threads.

Then "Squid" can run as one thread, and CPU intensive stuff can happen
via message queues to other threads.
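A minimal sketch of the "one main thread plus message queues" idea described above, written in Python purely for illustration. The names `worker`, `run_main_loop` and `parse_header` are hypothetical and are not taken from Squid. Note that CPython threads do not run CPU-bound code in parallel, so this only shows the structure (submission queue in, completion queue out, no shared state), not the speedup.

```python
import queue
import threading


def worker(jobs, results):
    # Worker thread: pull jobs off the submission queue and post
    # completions back to the main loop. No other state is shared.
    while True:
        job = jobs.get()
        if job is None:  # sentinel: shut this worker down
            break
        job_id, func, arg = job
        results.put((job_id, func(arg)))


def run_main_loop(tasks, n_workers=2):
    # The single "main" thread: submit every task, then collect
    # completions in whatever order the workers finish them.
    jobs, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(jobs, results))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for job_id, (func, arg) in enumerate(tasks):
        jobs.put((job_id, func, arg))
    done = {}
    while len(done) < len(tasks):
        job_id, value = results.get()
        done[job_id] = value
    for _ in threads:
        jobs.put(None)
    for t in threads:
        t.join()
    # Return results in submission order.
    return [done[i] for i in range(len(tasks))]


def parse_header(line):
    # Hypothetical stand-in for a CPU-heavy job such as header parsing.
    name, _, value = line.partition(":")
    return name.strip().lower(), value.strip()
```

For example, `run_main_loop([(parse_header, "Host: example.com")])` hands the parse to a worker and returns the result to the caller once the completion arrives.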

Eventually my gut feeling (reliable as it is) tells me that the most
efficient and scalable way of doing this is to create a lightweight
"squid" that handles just client and server-side interactions, with storage,
logging, ACLs and other stuff happening in other threads, and then
create multiple "squid" threads that run almost independently from one
another. This would avoid all of the crazy fine-grained locking that
traditionally is done to take a non-threaded app into the threaded
world. I really think avoiding that is a very good idea.

Oh, and no, there's nothing in Squid right now that "jumps out" save
perhaps pushing regular expression lookups into a separate thread or
threads. But really, if you're going to do that then you're better off
pushing a large part of the ACL subsystem into separate threads and
having the main code submit lookup requests there. Of course, what would
be interesting there is benchmarking how effective it'd be to batch
things like ACL lookups in "groups" to try and get some cache coherency
effects going, rather than the current tendency for Squid to process a
request as far as it can go before something blocking comes along,
blowing away as much of the CPU cache as possible in the meantime.
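To make the batching idea concrete, here is a rough sketch in Python. The ACL table, the `check_batched` function and the example patterns are all hypothetical and do not reflect Squid's ACL code. Pending lookups are grouped by ACL so that each pattern is run over all of its requests back to back. Whether this helps the CPU cache in practice is exactly the open benchmarking question raised above; the sketch only shows the control flow.

```python
import re
from collections import defaultdict

# Hypothetical ACL table: ACL name -> compiled URL regex.
ACLS = {
    "ads": re.compile(r"^http://ads\."),
    "video": re.compile(r"\.(mp4|flv)$"),
}


def check_batched(pending):
    # pending: list of (request_id, acl_name, url) tuples queued up by
    # the main code. Returns {request_id: matched}.
    by_acl = defaultdict(list)
    for req_id, acl_name, url in pending:
        by_acl[acl_name].append((req_id, url))

    verdicts = {}
    for acl_name, items in by_acl.items():
        pattern = ACLS[acl_name]
        # All lookups for one ACL run together before moving on to
        # the next ACL, instead of following each request in turn.
        for req_id, url in items:
            verdicts[req_id] = pattern.search(url) is not None
    return verdicts
```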

But really, the big problem is to spend some time looking at efficient
ways of parallelising network applications and what works well on current
hardware/OSes. I'm just playing around with a simple TCP proxy right now
which I'll use to experiment with "better" ways of doing stuff reasonably
portably. I can then set this as the "upper bounds" for how well stuff may
perform, and can then spend some time looking at how to tune things like
parallelism, IO handling, memory allocation and event notification. Then
I can spend some more time looking at batching operations such as IO, ACL
lookups, etc - see if better use of CPU caches can be made and also see
if doing all the system read/write syscalls in one hit per loop rather
than spread out throughout the program execution makes any difference.
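One way to picture "all the read/write syscalls in one hit per loop" is a loop iteration split into three phases: every read, then processing, then every write. The sketch below uses Python's standard `selectors` module; `loop_once` and the upper-casing step are hypothetical placeholders, and nothing here is measured or taken from Squid.

```python
import selectors
import socket


def loop_once(sel, outbox):
    # One event-loop iteration with I/O grouped into phases.
    ready = [key.fileobj for key, _ in sel.select(timeout=1.0)]

    # Phase 1: issue every read back to back.
    inbox = [(sock, sock.recv(4096)) for sock in ready]

    # Phase 2: process what was read (placeholder transformation).
    for sock, data in inbox:
        if data:
            outbox.append((sock, data.upper()))

    # Phase 3: issue every write back to back.
    sent = 0
    while outbox:
        sock, payload = outbox.pop(0)
        sock.sendall(payload)
        sent += 1
    return sent
```

A comparison run would keep the same work but interleave each read with its processing and write, then time both variants under load.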

It's really hard to benchmark -these- inside Squid, and thus it's very
difficult to figure out how to make better use of current hardware.
_This_ is the "First Problem" to solve.

Of course, all of this depends entirely on whether I get enough clients to start funding some of this work, and how much I can dedicate to this over my Semantics, Experimental Methods and Behavioural Neuropsychology classes this semester. :)




Adrian
(Sleep? Hah!)

On Thu, Mar 06, 2008, Chris Woodfield wrote:
I'll readily admit that I Am Not A Developer, but I'm wondering if
this could be something that could be worked incrementally - finding
easy-to-cleave-off subsystems that can be moved to separate threads
similarly to how asyncio was. The most obvious one I can think of is
the front-end client/server network socket communication code; next
would be logging. Are there any other subsystems that jump out as
"independent" enough to do this in the existing code base?

-C

On Mar 6, 2008, at 4:17 AM, Adrian Chadd wrote:

On Wed, Mar 05, 2008, Michael Puckett wrote:
Mark Nottingham wrote:

A killer app for -3 would be multi-core support (and the perf
advantages that it would bring), or something else that the
re-architecture makes possible that isn't easy in -2. AIUI, though,
that isn't the case; i.e., -3 doesn't make this significantly
easier.

Absolutely THE killer app for either -2 or -3. The fact that multi-core
processors are now the de facto standard in any box makes this more
important by the day IMHO. Being able to do sustained IO across multiple
Gb NICs will absolutely require it. This is the single biggest
performance enhancement that could be implemented. So where does
multi-core support fall on either roadmap?

12 months away on my draft Squid-2 roadmap, if there was enough commercial
interest. Thing is, the Squid internals are very horrible for SMP (both 2
and 3) and the list of stuff that I've put into the squid-2 roadmap is
what I think is the minimum amount of work required before really
starting to take advantage of multiple cores.




Adrian

--
- Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support -
- $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -



--
Mark Nottingham       mnot@xxxxxxxxxxxxx



