If I understand you correctly, if the drive has failed all I need to do is remove the failed disk, comment out the corresponding cache_dir, and restart Squid?

On Mon, May 13, 2013 at 2:03 AM, Amos Jeffries <squid3@xxxxxxxxxxxxx> wrote:
> On 13/05/2013 7:27 a.m., Alex Domoradov wrote:
>>
>> On Wed, May 1, 2013 at 12:42 PM, Amos Jeffries <squid3@xxxxxxxxxxxxx>
>> wrote:
>>>
>>> On 1/05/2013 10:21 a.m., babajaga wrote:
>>>>
>>>> Amos,
>>>>
>>>> although a bit off topic:
>>>>
>>>>> It does not work the way you seem to think. 2x 200GB cache_dir
>>>>> entries have just as much space as 1x 400GB. Using two cache_dir
>>>>> allows Squid to balance the I/O loading on the disks while
>>>>> simultaneously removing all processing overheads from RAID. <
>>>>
>>>> Am I correct in the following:
>>>> The selection of one of the 2 cache_dirs is not deterministic for the
>>>> same URL at different times, both for round-robin and least-load.
>>>> Which might have the consequence of generating a MISS, although the
>>>> object is cached in the other cache_dir.
>>>> Or, in other words: there is a finite possibility that a cached
>>>> object is stored in one cache_dir, and because of the result of the
>>>> selection algorithm, when the object should be fetched, the decision
>>>> to check the wrong cache_dir generates a MISS.
>>>> If this is correct, one 400GB cache would have a higher HIT rate per
>>>> se. AND, it would avoid double caching, therefore increasing
>>>> effective cache space, increasing the HIT rate even more.
>>>>
>>>> So, having one JBOD instead of multiple cache_dirs (one cache_dir
>>>> per disk) would result in better performance, assuming even
>>>> distribution of (hashed) URLs.
>>>> Parallel access to the disks in the JBOD is handled at a lower
>>>> level, instead of with multiple aufs, so this should not create a
>>>> real handicap.
>>>
>>> You are not.
>>>
>>> Your whole chain of logic above depends on the storage areas
>>> (cache_dir) being separate entities. This is a false assumption. They
>>> are only separate to the operating system. They are merged into a
>>> collective "cache" index model in Squid memory - a single lookup to
>>> this unified store indexing system finds the object no matter where
>>> it is (disk or local memory), with the same HIT/MISS result based on
>>> whether it exists *anywhere* in at least one of the storage areas.
>>>
>>> It takes the same amount of time to search through N index entries
>>> for one giant cache_dir as it does for the same N index entries
>>> across M cache_dir. The difference comes when Squid is aware of the
>>> individual disk I/O loading and sizes: it can calculate accurate
>>> loading values to optimize read/write latency on individual disks.
>>>
>>> Amos
>>>
>>
>> And what would happen if we have 2 cache_dir:
>>
>> cache_dir aufs /var/spool/squid/ssd1 200000 16 256
>> cache_dir aufs /var/spool/squid/ssd2 200000 16 256
>>
>> /var/spool/squid/ssd1 - /dev/sda
>> /var/spool/squid/ssd2 - /dev/sdb
>>
>> User1 downloads a BIG psd file and squid saves the file on /dev/sda
>> (ssd1). Then sda fails and user2 tries to download the same file. What
>> would happen in that situation? Does squid download the file again,
>> place it on /dev/sdb, and then rebuild the "cache" index in memory?
>
> Unfortunately when a UFS cache_dir dies Squid halts. This happens
> whether or not RAID is used. The exception being RAID-1 (but not
> RAID-10), which provides a bit more protection than Squid at present.
>
> With multiple directories, though, you are in a position to quickly
> remove the dead cache_dir and restart Squid with the second cache_dir
> while you work on a fix; with RAID 0, 10, or 5 you are forced to
> rebuild the disk structure while Squid is either offline or running
> without *any* disk cache.
>
> Amos
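For what it's worth, the recovery Amos describes could look like this in squid.conf, reusing the two cache_dir lines from the thread above (the comment marker is the only change; paths and sizes are from the earlier example):

```
# ssd1 (/dev/sda) has failed - cache_dir disabled until the disk is
# replaced; Squid keeps serving from ssd2 in the meantime.
#cache_dir aufs /var/spool/squid/ssd1 200000 16 256
cache_dir aufs /var/spool/squid/ssd2 200000 16 256
```

After editing, a full stop and start of Squid (rather than a plain reconfigure) is the safe way to drop a cache_dir, since the in-memory store index has to be rebuilt without the dead directory.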