If I understand you correctly, if the drive has failed all I need to do is remove the failed disk, comment out the corresponding cache_dir, and restart Squid?

On Mon, May 13, 2013 at 2:03 AM, Amos Jeffries <squid3@xxxxxxxxxxxxx> wrote:
> On 13/05/2013 7:27 a.m., Alex Domoradov wrote:
>>
>> On Wed, May 1, 2013 at 12:42 PM, Amos Jeffries <squid3@xxxxxxxxxxxxx>
>> wrote:
>>>
>>> On 1/05/2013 10:21 a.m., babajaga wrote:
>>>>
>>>> Amos,
>>>>
>>>> although a bit off topic:
>>>>
>>>>> It does not work the way you seem to think. 2x 200GB cache_dir
>>>>> entries have just as much space as 1x 400GB. Using two cache_dir
>>>>> allows Squid to balance the I/O loading on the disks while
>>>>> simultaneously removing all processing overheads from RAID. <
>>>>
>>>> Am I correct in the following:
>>>> The selection of one of the 2 cache_dirs is not deterministic for the
>>>> same URL at different times, both for round-robin and least-load.
>>>> Which might have the consequence of generating a MISS, although the
>>>> object is cached in the other cache_dir.
>>>> Or, in other words: there is a finite possibility that a cached
>>>> object is stored in one cache_dir, and because of the result of the
>>>> selection algorithm, when the object should be fetched, the decision
>>>> to check the wrong cache_dir generates a MISS.
>>>> If this is correct, one 400GB cache would have a higher HIT rate per
>>>> se. AND, it would avoid double caching, therefore increasing
>>>> effective cache space, increasing the HIT rate even more.
>>>>
>>>> So, having one JBOD instead of multiple cache_dirs (one cache_dir
>>>> per disk) would result in better performance, assuming even
>>>> distribution of (hashed) URLs.
>>>> Parallel access to the disks in the JBOD is handled at a lower
>>>> level, instead of with multiple aufs, so this should not create a
>>>> real handicap.
>>>
>>> You are not.
>>>
>>> Your whole chain of logic above depends on the storage areas
>>> (cache_dir) being separate entities. This is a false assumption. They
>>> are only separate to the operating system. They are merged into a
>>> collective "cache" index model in Squid memory - a single lookup to
>>> this unified store indexing system finds the object no matter where
>>> it is (disk or local memory), with the same HIT/MISS result based on
>>> whether it exists *anywhere* in at least one of the storage areas.
>>>
>>> It takes the same amount of time to search through N index entries
>>> for one giant cache_dir as it does for the same N index entries
>>> across M cache_dir. The difference comes when Squid is aware of the
>>> individual disk I/O loading and sizes: it can calculate accurate
>>> loading values to optimize read/write latency on individual disks.
>>>
>>> Amos
>>>
>>
>> And what would happen if we have 2 cache_dir:
>>
>> cache_dir aufs /var/spool/squid/ssd1 200000 16 256
>> cache_dir aufs /var/spool/squid/ssd2 200000 16 256
>>
>> /var/spool/squid/ssd1 - /dev/sda
>> /var/spool/squid/ssd2 - /dev/sdb
>>
>> User1 downloads a BIG psd file and squid saves the file on /dev/sda
>> (ssd1). Then sda fails and user2 tries to download the same file. What
>> would happen in that situation? Does squid download the file again,
>> place it on /dev/sdb, and then rebuild the "cache" index in memory?
>
> Unfortunately when a UFS cache_dir dies Squid halts. This happens
> whether or not RAID is used. The exception being RAID-1 (but not
> RAID-10), which provides a bit more protection than Squid at present.
>
> With multiple directories, though, you are in a position to quickly
> remove the dead cache_dir and restart Squid with the second cache_dir
> while you work on a fix; with RAID 0, 10, or 5 you are forced to
> rebuild the disk structure while Squid is either offline or running
> without *any* disk cache.
>
> Amos
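For what it's worth, the recovery Amos describes could look like this in squid.conf, reusing the two cache_dir lines from the thread above (the comment marker is the only change; paths and sizes are from the earlier example):

```
# ssd1 (/dev/sda) has failed - cache_dir disabled until the disk is
# replaced; Squid keeps serving from ssd2 in the meantime.
#cache_dir aufs /var/spool/squid/ssd1 200000 16 256
cache_dir aufs /var/spool/squid/ssd2 200000 16 256
```

After editing, a full stop and start of Squid (rather than a plain reconfigure) is the safe way to drop a cache_dir, since the in-memory store index has to be rebuilt without the dead directory.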