Re: StoreID Question

I've not got time to read your whole email, but you are asking about regular expressions.

^http:\/\/[^\.]+\.dl\.sourceforge\.net\/(.*) http://dl.sourceforge.net.squid.internal/$1

This means: match the URL against the pattern on the left and "capture" the part at the end, the bit inside the parentheses. The URL is then rewritten to the one on the right, with the captured text substituted for $1. If you capture two things in parentheses, the first is $1 and the second is $2.
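To make that concrete, here is a minimal sketch in Python rather than the helper's own Perl. The function name `store_id` and the two mirror URLs are made up for illustration; only the pattern and the replacement come from the rule above.

```python
import re

# The rule from above: a pattern (left column) and a replacement template (right column).
PATTERN = re.compile(r"^http:\/\/[^\.]+\.dl\.sourceforge\.net\/(.*)")
TEMPLATE = "http://dl.sourceforge.net.squid.internal/$1"

def store_id(url):
    """Return the rewritten Store-ID for url, or None when the rule does not match."""
    m = PATTERN.match(url)
    if m is None:
        return None
    result = TEMPLATE
    # Replace $1, $2, ... with the text captured by group 1, 2, ...
    for i, captured in enumerate(m.groups(), start=1):
        result = result.replace(f"${i}", captured or "")
    return result

# Two hypothetical mirror URLs for the same file.
a = store_id("http://mirror1.dl.sourceforge.net/project/demo/file.zip")
b = store_id("http://mirror2.dl.sourceforge.net/project/demo/file.zip")
print(a)
print(b)
```

Both mirrors end up with the same Store-ID, which is the point of the rule: the part of the URL that differs (the mirror name) is thrown away, and the part that identifies the file is kept through $1.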

Do some reading on regular expressions (regex). The basics are relatively easy to understand; beyond that, it can get very complicated very quickly.

Robin

On Tue, 31 Dec 2024, 23:05 Jonathan Lee, <jonathanlee571@xxxxxxxxx> wrote:
Hello Fellow Squid Users,

Can you please help? I have been researching this for a long time and cannot find any information on what the "$" means within StoreID.

Below is my failed attempt to make StoreID work correctly. Sorry it's a mess. I have since disabled my customized StoreID patterns because they caused issues. My question is about the $number part of the rules. I disabled all the Facebook rules and all my tests because my photos were showing up wrong and one image would duplicate itself over everything. I would have to clear the cache, change the rules, and try again.

It did work sometimes, but I would get more issues the longer it ran. I decided to stop the trial because it was driving me crazy. It is a great puzzle to solve. Does anyone have any tips? I have some Squid textbooks, like Squid: The Definitive Guide and the Squid Proxy Server 3.1 guide, but still nothing really explains StoreID outside of the Squid website. Yes, the website comes with a great database that does work: I tested some database entries with Ubuntu updates inside of VMs, and it worked and re-served them to other machines asking for the same update. So in my quest I thought, can I also do this with Facebook (I do not recommend you try it) or something else, like YouTube?

This is the text file I have been testing. It was a failed test outside of Ubuntu updates; I do not use that OS anymore, so those rules are removed. I think I had the $ wrong. I have no information on what it does: some are $1, some are $5, and some have a double $:

^https?:\/\/(fbcdn|scontent).*(akamaihd|fbcdn)\.net\/.*\/v\/.*\/(.*\.mp4) http://facebook.squid.internal/$3
^https?:\/\/fbcdn\-(static|profile)\-a\.akamaihd\.net\/static\-ak\/rsrc\.php\/((?!.*\.(?:js|css|swf)).*) http://facebook.squid.internal/static/$2
^https?:\/\/(fbcdn|scontent).*(akamaihd|fbcdn)\.net\/(h|s)(profile|photos).*\/(.*\.(png|gif|jpg))(\?.+)? http://facebook.squid.internal/$5
^https?:\/\/fbstatic\-a\.akamaihd\.net\/rsrc\.php\/((?!.*\.(?:js|css|swf)).*) http://facebook.squid.internal/static/$1
^http:\/\/.*[steampowered|steamcontent]\.com\/([^?]*) http://steamupdates.squid.internal/$1
^https?\:\/\/download\.oracle\.com\/((otn\-pub|otn)\/[\d\w]+\/[\d\w]+\/[\w\d\-]+\/[\w\d\-]+\.(exe|dmg|rpm|msi|tar\.(gz|Z)))\? http://java.oracle.otn.ngtech.squid.internal/$1
^https?\:\/\/([\d\w\-]+)\.oracle\.com\/(([\d\w]+)\/[\d\w]+\/[\d\w]+\/([\d\w\-]+)\/([\d\w]+\/)?[\d\w\-\.\_]+\.(dmg|msi|exe|tar\.gz|tar\.Z))\? http://java.oracle.download.ngtech.squid.internal/$2
^http:\/\/[^\.]+\.phobos\.apple\.com\/(.*) http://appupdates.apple.squid.internal/$1
^http:\/\/[^\.]+\.c\.android\.clients\.google\.com\/(.*) http://androidupdates.google.squid.internal/$1
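One detail in the Steam rule above is worth checking: `[steampowered|steamcontent]` is a character class (any one of those letters or `|`), not a choice between two words. The sketch below, in Python rather than Perl, compares the rule as posted with a guess at the intent using a non-capturing group `(?:...)`, so that $1 still refers to the path. The two URLs are hypothetical, and whether such URLs ever reach the helper depends on the store_id_access rules.

```python
import re

# Rule as posted: the square brackets form a character class, not an alternation.
AS_POSTED = re.compile(r"^http:\/\/.*[steampowered|steamcontent]\.com\/([^?]*)")
# Possible intent (an assumption): a non-capturing group, so $1 is still the path.
WITH_GROUP = re.compile(r"^http:\/\/.*(?:steampowered|steamcontent)\.com\/([^?]*)")

unrelated = "http://www.example.com/file.bin"        # hypothetical URL
steam = "http://cdn.steamcontent.com/depot/1/chunk"  # hypothetical URL

posted_hit = AS_POSTED.match(unrelated)   # matches: "e" before ".com" is in the class
group_hit = WITH_GROUP.match(unrelated)   # no match: neither word appears
steam_hit = WITH_GROUP.match(steam)       # matches a Steam-style host

print("as posted, unrelated URL matches:", posted_hit is not None)
print("with group, unrelated URL matches:", group_hit is not None)
print("with group, $1 for Steam URL:", steam_hit.group(1))
```

With the character class, any host whose name ends in one of those letters before `.com` would be filed under the Steam Store-ID.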

My question here is:
What does this $3 mean within the StoreID program?
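As a rough illustration of how the numbers are assigned: groups are counted by their opening parenthesis, left to right, including nested ones. In the first rule above, $3 is therefore the third group, `(.*\.mp4)`. The Python sketch below (not the Perl helper) prints the numbering for the photo rule from the list; both URLs and the name `numbered_groups` are invented for the example.

```python
import re

# The Facebook photo rule from the list above, unchanged.
PHOTO_RULE = re.compile(
    r"^https?:\/\/(fbcdn|scontent).*(akamaihd|fbcdn)\.net\/(h|s)(profile|photos).*\/(.*\.(png|gif|jpg))(\?.+)?"
)

def numbered_groups(url):
    """Map "$1", "$2", ... to the text each group captured for this URL."""
    m = PHOTO_RULE.match(url)
    return {f"${i}": g for i, g in enumerate(m.groups(), start=1)}

# Two hypothetical URLs for two different pictures that share a file name.
first = numbered_groups("https://scontent.xx.fbcdn.net/hphotos-xyz/t1/photo.jpg?oh=abc")
second = numbered_groups("https://scontent.yy.fbcdn.net/sprofile-abc/other/photo.jpg")

for name, text in first.items():
    print(name, "=", text)
print("same $5 for both URLs:", first["$5"] == second["$5"])
```

Because the rule keeps only $5 (the bare file name), these two different URLs would receive the same Store-ID. That may be one reason cached photos showed up in the wrong place, although other causes are possible.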

This is the config and the refresh patterns that I was learning with for Squid StoreID. Much of it is commented out with "#", but this is what I was using:

#store_id_program /usr/local/libexec/squid/storeid_file_rewrite /var/squid/storeid/storeid_rewrite.txt
#store_id_children 10 startup=5 idle=1 concurrency=0
#always_direct allow all
#store_id_access deny connect
#store_id_access deny !getmethod
#store_id_access allow rewritedoms
#store_id_access deny all

refresh_all_ims on
reload_into_ims on
max_stale 20 years
minimum_expiry_time 0

#refresh_pattern -i ^http.*squid\.internal.* 43200 100% 79900 override-expire override-lastmod ignore-reload ignore-no-store ignore-must-revalidate ignore-private ignore-auth

#FACEBOOK
#refresh_pattern ^https.*.facebook.com/* 10080 80% 43200

#FACEBOOK IMAGES  
#refresh_pattern -i pixel.facebook.com..(jpg|png|gif|ico|css|js|jpg?) 10080 80% 43200
#refresh_pattern -i .akamaihd.net..(jpg|png|gif|ico|css|js|jpg?) 10080 80% 43200 
#refresh_pattern -i facebook.com.(jpg|png|gif|jpg?) 10080 80% 43200 store-stale
#refresh_pattern static.(xx|ak).fbcdn.net.(jpg|gif|png|jpg?) 10080 80% 43200
#refresh_pattern ^https.*profile.ak.fbcdn.net.*(jpg|gif|png|jpg?) 10080 80% 43200
#refresh_pattern ^https.*fbcdn.net.*(jpg|gif|png|jpg?) 10080 80% 43200

#FACEBOOK VIDEO
#refresh_pattern -i .video.ak.fbcdn.net.*.(mp4|flv|mp3|amf) 10080 80% 43200
#refresh_pattern (audio|video)/(webm|mp4) 10080 80% 43200

#APPLE STUFF
#refresh_pattern -i apple.com/..(cab|exe|msi|msu|msf|asf|wmv|wma|dat|zip|dist)$ 0 80% 43200  refresh-ims

#apple update
#refresh_pattern -i (download|adcdownload).apple.com/.*\.(pkg|dmg) 4320 100% 43200
#refresh_pattern -i appldnld\.apple\.com 129600 100% 129600
#refresh_pattern -i phobos\.apple\.com 129600 100% 129600
#refresh_pattern -i iosapps\.itunes\.apple\.com 129600 100% 129600

refresh_pattern -i windowsupdate.com/.*\.(cab|exe|ms[i|u|f|p]|[ap]sf|wm[v|a]|dat|zip|psf) 43200 80% 129600 reload-into-ims
refresh_pattern -i microsoft.com/.*\.(cab|exe|ms[i|u|f|p]|[ap]sf|wm[v|a]|dat|zip|psf) 43200 80% 129600 reload-into-ims
refresh_pattern -i windows.com/.*\.(cab|exe|ms[i|u|f|p]|[ap]sf|wm[v|a]|dat|zip|psf) 43200 80% 129600 reload-into-ims
refresh_pattern -i microsoft.com.akadns.net/.*\.(cab|exe|ms[i|u|f|p]|[ap]sf|wm[v|a]|dat|zip|psf) 43200 80% 129600 reload-into-ims

# Updates: Windows
#refresh_pattern -i microsoft.com/..(cab|exe|msi|msu|msf|asf|wma|dat|zip)$ 4320 80% 43200  refresh-ims
refresh_pattern -i windowsupdate.com/..(cab|exe|msi|msu|msf|asf|wma|wmv|dat|zip)$ 4320 80% 43200  refresh-ims
#refresh_pattern -i windows.com/..(cab|exe|msi|msu|msf|asf|wmv|wma|dat|zip)$ 4320 80% 43200  refresh-ims
#refresh_pattern -i .*windowsupdate.com/.*\.(cab|exe) 259200 100% 259200   
#refresh_pattern -i .*update.microsoft.com/.*\.(cab|exe|dll|msi|psf) 259200 100% 259200   
#refresh_pattern windowsupdate.com/.*\.(cab|exe|dll|msi|psf) 10080 100% 43200 
#refresh_pattern download.microsoft.com/.*\.(cab|exe|dll|msi|psf) 10080 100% 43200 
#refresh_pattern www.microsoft.com/.*\.(cab|exe|dll|msi|psf) 10080 100% 43200 
#windows update NEW UPDATE 0.04
#refresh_pattern update.microsoft.com/.*\.(cab|exe) 43200 100% 129600    
#refresh_pattern ([^.]+\.)?(download|(windows)?update)\.(microsoft\.)?com/.*\.(cab|exe|msi|msp|psf) 4320 100% 43200  
#refresh_pattern update.microsoft.com/.*\.(cab|exe|dll|msi|psf) 10080 100% 43200 
#refresh_pattern -i \.update.microsoft.com/.*\.(cab|exe|ms[i|u|f]|[ap]sf|wm[v|a]|dat|zip) 525600 100% 525600       
#refresh_pattern -i \.windowsupdate.com/.*\.(cab|exe|ms[i|u|f]|[ap]sf|wm[v|a]|dat|zip) 525600 100% 525600       
#refresh_pattern -i \.download.microsoft.com/.*\.(cab|exe|ms[i|u|f]|[ap]sf|wm[v|a]|dat|zip) 525600 100% 525600       
#refresh_pattern -i \.ws.microsoft.com/.*\.(cab|exe|ms[i|u|f]|[ap]sf|wm[v|a]|dat|zip) 525600 100% 525600       
    
#refresh_pattern ([^.]+\.)?(cs|content[1-9]|hsar|content-origin|client-download).[steampowered|steamcontent].com/.*\.* 43200 100% 43200     
#refresh_pattern ([^.]+\.)?.akamai.steamstatic.com/.*\.* 43200 100% 43200

#refresh_pattern -i ([^.]+\.)?.adobe.com/.*\.(zip|exe) 43200 100% 43200
#refresh_pattern -i ([^.]+\.)?.java.com/.*\.(zip|exe) 43200 100% 43200
#refresh_pattern -i ([^.]+\.)?.sun.com/.*\.(zip|exe) 43200 100% 43200
#refresh_pattern -i ([^.]+\.)?.oracle.com/.*\.(zip|exe|tar.gz) 43200 100% 43200

#refresh_pattern -i appldnld\.apple\.com 43200 100% 43200
#refresh_pattern -i ([^.]+\.)?apple.com/.*\.(ipa) 43200 100% 43200
 
#refresh_pattern -i ([^.]+\.)?.google.com/.*\.(exe|crx) 10080 80% 43200
#refresh_pattern -i ([^.]+\.)?g.static.com/.*\.(exe|crx) 10080 80% 43200

acl https_login url_regex -i ^https.*(login|Login).*
cache deny https_login

#range_offset_limit 512 MB windowsupdate
range_offset_limit 0 !windowsupdate
quick_abort_min -1 KB


Store ID program:

I am using the built-in program, attached here:
/usr/local/libexec/squid/storeid_file_rewrite
#!/usr/local/bin/perl

use strict;
use warnings;
use Pod::Usage;

=pod

=head1 NAME

 storeid_file_rewrite - File based Store-ID helper for Squid

=head1 SYNOPSIS

 storeid_file_rewrite filepath

=head1 DESCRIPTION

This program acts as a store_id helper program, rewriting URLs passed
by Squid into storage-ids that can be used to achieve better caching
for websites that use different URLs for the same content.

It takes a text file with two tab separated columns.
Column 1: Regular expression to match against the URL
Column 2: Rewrite rule to generate a Store-ID
Eg:
^http:\/\/[^\.]+\.dl\.sourceforge\.net\/(.*)    http://dl.sourceforge.net.squid.internal/$1

Rewrite rules are matched in the same order as they appear in the rules file.
So for best performance, sort it in order of frequency of occurrence.

This program will automatically detect the existence of a concurrency channel-ID and adjust appropriately.
It may be used with any value 0 or above for the store_id_children concurrency= parameter.

=head1 OPTIONS

The only command line parameter this helper takes is the regex rules file name.

=head1 AUTHOR

This program and documentation was written by I<Alan Mizrahi <alan@xxxxxxxxxxxxxx>>

Based on prior work by I<Eliezer Croitoru <eliezer@xxxxxxxxxxxx>>

=head1 COPYRIGHT

 * Copyright (C) 1996-2023 The Squid Software Foundation and contributors
 *
 * Squid software is distributed under GPLv2+ license and includes
 * contributions from numerous individuals and organizations.
 * Please see the COPYING and CONTRIBUTORS files for details.

 Copyright (C) 2013 Alan Mizrahi <alan@xxxxxxxxxxxxxx>
 Based on code from Eliezer Croitoru <eliezer@xxxxxxxxxxxx>

 This program is free software; you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
 the Free Software Foundation; either version 2 of the License, or
 (at your option) any later version.

 This program is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 GNU General Public License for more details.

 You should have received a copy of the GNU General Public License
 along with this program; if not, write to the Free Software

=head1 QUESTIONS

Questions on the usage of this program can be sent to the I<Squid Users mailing list <squid-users@xxxxxxxxxxxxxxxxxxxxx>>

=head1 REPORTING BUGS

Bug reports need to be made in English.
See http://wiki.squid-cache.org/SquidFaq/BugReporting for details of what you need to include with your bug report.

Report bugs or bug fixes using http://bugs.squid-cache.org/

Report serious security bugs to I<Squid Bugs <squid-bugs@xxxxxxxxxxxxxxxxxxxxx>>

Report ideas for new improvements to the I<Squid Developers mailing list <squid-dev@xxxxxxxxxxxxxxxxxxxxx>>

=head1 SEE ALSO

squid (8), GPL (7),


The Squid Configuration Manual http://www.squid-cache.org/Doc/config/

=cut

my @rules; # array of [regex, replacement string]

die "Usage: $0 <rewrite-file>\n" unless $#ARGV == 0;

# read config file
open RULES, $ARGV[0] or die "Error opening $ARGV[0]: $!";
while (<RULES>) {
    chomp;
    next if /^\s*#?$/;
    if (/^\s*([^\t]+?)\s*\t+\s*([^\t]+?)\s*$/) {
        push(@rules, [qr/$1/, $2]);
    } else {
        print STDERR "$0: Parse error in $ARGV[0] (line $.)\n";
    }
}
close RULES;

$|=1;
# read urls from squid and do the replacement
URL: while (<STDIN>) {
    chomp;
    last if $_ eq 'quit';

    my $channel = "";
    if (s/^(\d+\s+)//o) {
        $channel = $1;
    }

    foreach my $rule (@rules) {
        if (my @match = /$rule->[0]/) {
            $_ = $rule->[1];

            for (my $i=1; $i<=scalar(@match); $i++) {
                s/\$$i/$match[$i-1]/g;
            }
            print $channel, "OK store-id=$_\n";
            next URL;
        }
    }
    print $channel, "ERR\n";
}
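For readers who do not know Perl, the main loop above can be approximated in Python as follows. This is only a sketch of the request/response shape visible in the script (optional channel-ID prefix, then `OK store-id=...` or `ERR`); the function name `respond` and the single example rule are assumptions, and anything else Squid may append to the request line is ignored here.

```python
import re

# One example rule, taken from the helper's own documentation.
RULES = [
    (re.compile(r"^http:\/\/[^\.]+\.dl\.sourceforge\.net\/(.*)"),
     "http://dl.sourceforge.net.squid.internal/$1"),
]

def respond(line, rules=RULES):
    """Return the reply the helper would print for one request line."""
    channel = ""
    # With concurrency enabled, Squid prefixes each request with a channel-ID.
    prefix = re.match(r"\d+\s+", line)
    if prefix:
        channel = prefix.group(0)
        line = line[prefix.end():]
    # Rules are tried in file order; the first match wins.
    for pattern, template in rules:
        hit = pattern.search(line)
        if hit:
            store_id = template
            for i, text in enumerate(hit.groups(), start=1):
                store_id = store_id.replace(f"${i}", text or "")
            return f"{channel}OK store-id={store_id}"
    return f"{channel}ERR"

print(respond("http://a.dl.sourceforge.net/x.zip"))
print(respond("7 http://a.dl.sourceforge.net/x.zip"))
print(respond("7 http://example.com/"))
```

Feeding a few sample lines through a function like this is a cheap way to check a rules file before pointing Squid at it.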






_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
https://lists.squid-cache.org/listinfo/squid-users