Re: Questions in Squid source code

On Tue, 29 Dec 2009 17:37:42 -0800, Manjusha Maddala
<mmaddala25@xxxxxxxxxx> wrote:
> Hi all, 
> 
> I'm working with Squid-2.6 and right now stuck on a bunch of questions.
> Would appreciate a word from the Squid experts.

Then you want to be mailing the squid-dev mailing list, where the experts
are. This list is a place for _users_ to help each other, with the emphasis
on configuration file problems and how to set things up.

> 
> 1. What is SwapDir?

A cache directory (one cache_dir line in squid.conf), I think.

> Is that the in-memory representation of the disk
> cache? What does the in-memory representation of the disk cache look
> like - does it follow the same format as the swap.state file?

No.

> 
> 2. What is StoreEntry? 

An HTTP object. The in-memory representation of a disk file: storage
details, metadata, HTTP headers, binary body data.

> 
> 3. In squid/src/structs.h,
> 
>     what do each of the entries in the below structure symbolize?	
> 
>     struct _cacheSwap {
>         SwapDir *swapDirs;

An array with one entry per cache_dir line.

>         int n_allocated;

Maybe the number of cache_dir lines that exist in squid.conf.

>         int n_configured;

Maybe the number that have been completely configured, set up, or whatever.

>     } cacheSwap;
> 
>     when/where do they get initialized?

By something in the configuration file parser.
Look for a function parse_X(), where X is the value on the TYPE: line that
src/cf.data.pre assigns to the cache_dir option.
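
One plausible reading of the two counters, not checked against the Squid
source, is a capacity/count pair: n_allocated is how many array slots have
been allocated and n_configured is how many are filled in. A minimal sketch
under that assumption follows. The type names, the helper
sketch_add_swapdir(), the starting capacity of 4 and the doubling policy are
all hypothetical, and the real SwapDir holds far more than a path.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for SwapDir; only a path is kept here. */
typedef struct {
    char path[256];
} SketchSwapDir;

/* Mirrors the shape of struct _cacheSwap from the question. */
typedef struct {
    SketchSwapDir *swapDirs; /* array with one entry per cache_dir line */
    int n_allocated;         /* ASSUMPTION: slots allocated (capacity) */
    int n_configured;        /* ASSUMPTION: slots in use (count) */
} SketchCacheSwap;

/*
 * Append one cache directory, growing the array when it is full.
 * Returns the index of the new entry, or -1 if allocation fails.
 */
static int sketch_add_swapdir(SketchCacheSwap *cs, const char *path)
{
    if (cs->n_configured == cs->n_allocated) {
        /* Start at 4 slots, then double (illustrative policy only). */
        int new_cap = cs->n_allocated ? cs->n_allocated * 2 : 4;
        SketchSwapDir *p =
            realloc(cs->swapDirs, (size_t)new_cap * sizeof(*p));
        if (p == NULL)
            return -1;
        cs->swapDirs = p;
        cs->n_allocated = new_cap;
    }
    SketchSwapDir *slot = &cs->swapDirs[cs->n_configured];
    strncpy(slot->path, path, sizeof(slot->path) - 1);
    slot->path[sizeof(slot->path) - 1] = '\0'; /* always terminate */
    return cs->n_configured++;
}
```

Starting from a zeroed struct, five calls to sketch_add_swapdir() in this
sketch leave n_configured at 5 and n_allocated at 8.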

> 	
> 4. Each time squid -k rotate is done, I notice a new swap.state file
> gets added along with a 0 byte swap.state.last-clean file. How is the
> new swap.state file built? Is the in-memory hashtable/map dumped into
> this file during rotate or is it built by crawling all the directories
> in the disk cache and fetching the meta data of each file? 

Both. The swap.state is a re-formatted journal dump of the in-memory cache
index generated at rotate time.
The in-memory cache index is built from 1) loading a previous swap.state
file (CLEAN load), 2) scanning the disk cache item-by-item (DIRTY load),
and 3) adding/removing entries during live operation.

> 
> 5. Once the swap.state file is built, it keeps growing until the next
> periodic squid rotate is kicked off. What are these new entries that get
> appended to swap.state? I'm guessing each time a new webpage gets
> cached, 
> 5.1) the in-memory table gets updated with the meta data for the new URI
> 5.2) one entry is made in store.log with a "SWAPOUT" tag
> 5.3) one entry is made in swap.state with the meta data for the new URI
> 
> Somewhere in between the two squid rotate jobs, the cache replacement
> thread comes in and evicts the least recently used pages. The memory
> hashtable gets updated accordingly, *but* the swap.state file doesn't.
> Hence, over time swap.state file grows and needs to be synced up with
> the memory table. 

swap.state is a _journal_. A removal record is added to it when something
gets removed, a file metadata record when something gets added, and both
when something gets changed.
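
To illustrate the journal idea, here is a minimal sketch of replaying add
and removal records in order to recover the set of live entries. It is not
the swap.state record format: the record layout, the names and the fixed
table size are hypothetical, chosen only to show why a journal can grow
between rotates while still describing the same index.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_FILES 64 /* illustrative limit, not a Squid constant */

/* Hypothetical journal record: one operation on one cache file number. */
enum sketch_op { SKETCH_ADD, SKETCH_DEL };

typedef struct {
    enum sketch_op op;
    int fileno;
} SketchRecord;

/*
 * Replay the journal from oldest to newest record.
 * On return, live[f] is 1 if file f is still cached, else 0.
 * Returns the number of live entries. Out-of-range records are skipped.
 */
static int sketch_replay(const SketchRecord *log, size_t n,
                         unsigned char live[SKETCH_MAX_FILES])
{
    int count = 0;
    memset(live, 0, SKETCH_MAX_FILES);
    for (size_t i = 0; i < n; i++) {
        int f = log[i].fileno;
        if (f < 0 || f >= SKETCH_MAX_FILES)
            continue;
        if (log[i].op == SKETCH_ADD) {
            if (!live[f]) {      /* a repeated add does not double count */
                live[f] = 1;
                count++;
            }
        } else if (live[f]) {    /* a removal cancels an earlier add */
            live[f] = 0;
            count--;
        }
    }
    return count;
}
```

In this sketch a journal of five records (add 1, add 2, remove 1, add 3,
add 3) replays to two live entries. Writing out only the live entries would
give a shorter file with the same meaning, which is the kind of clean dump
a rotate produces.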

> 
> Did I get it right?
> 
> 6. Is there any utility to read the swap.state file?

Yes. Look up the third-party squidpurge tool.

> 
> 7. swap.state file is maintained for loading the in-memory hashtable at
> squid startup. When else is this file used?

All the update times you thought of...

> 
> 8. A high-level pseudo code for the request processing algorithm as I
> understand:
> 
> 	- Squid receives a GET request for URL
> 	- Computes a hash for the URL and uses it as a key to pull the record
> from its internal memory representation of the meta-data of all files on
> the disk cache
> 	- If a matching record is found, the refresh_pattern rules are applied
> to determine if the content is fresh or stale and a TCP_HIT or
> TCP_REFRESH_HIT/TCP_REFRESH_MISS get logged respectively
> 	- If no record is found, its a TCP_MISS
> 
> Have I missed something? 	
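
The flow described in the question can be sketched as below. This only
restates the question's steps in code and does not confirm how Squid
implements them. Everything in it is a stand-in: the FNV-1a hash and the
small probing table replace Squid's real key and index, and a single
max_age value replaces the refresh_pattern rules.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BUCKETS 16 /* illustrative table size */

typedef struct {
    const char *url; /* NULL marks an empty slot */
    long cached_at;  /* hypothetical store time, in seconds */
} SketchEntry;

/* MISS ~ TCP_MISS, HIT ~ TCP_HIT, STALE ~ the TCP_REFRESH_* cases. */
typedef enum { SKETCH_MISS, SKETCH_HIT, SKETCH_STALE } SketchResult;

/* 32-bit FNV-1a; a stand-in hash, not what Squid uses. */
static uint32_t sketch_hash(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Store or refresh a URL using linear probing. Returns -1 if full. */
static int sketch_store(SketchEntry table[SKETCH_BUCKETS],
                        const char *url, long now)
{
    uint32_t start = sketch_hash(url) % SKETCH_BUCKETS;
    for (uint32_t i = 0; i < SKETCH_BUCKETS; i++) {
        SketchEntry *e = &table[(start + i) % SKETCH_BUCKETS];
        if (e->url == NULL || strcmp(e->url, url) == 0) {
            e->url = url;
            e->cached_at = now;
            return 0;
        }
    }
    return -1;
}

/* Look up a URL, then apply the simplified freshness rule. */
static SketchResult sketch_lookup(const SketchEntry table[SKETCH_BUCKETS],
                                  const char *url, long now, long max_age)
{
    uint32_t start = sketch_hash(url) % SKETCH_BUCKETS;
    for (uint32_t i = 0; i < SKETCH_BUCKETS; i++) {
        const SketchEntry *e = &table[(start + i) % SKETCH_BUCKETS];
        if (e->url == NULL)
            return SKETCH_MISS; /* no record found */
        if (strcmp(e->url, url) == 0)
            return (now - e->cached_at <= max_age) ? SKETCH_HIT
                                                   : SKETCH_STALE;
    }
    return SKETCH_MISS;
}
```

With an entry stored at time 100 and a max_age of 60, a lookup at time 150
is a hit in this sketch, a lookup at time 500 is stale, and a lookup for a
URL that was never stored is a miss.
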
> 
> 
> Thanks.
> 
> CONFIDENTIALITY NOTICE 
> =======================
> This email message and any attachments are for the exclusive use of the
> intended recipient(s)

NOTE: none of the intended recipients have yet received this email.
Instead it went to a large group of administrators, few of whom can help.

Amos

