Re: Need a tool to minimize HTML before storing in memecache

If you really have that much traffic, then memcache isn't your answer to
caching. Every lookup in the pool goes over the network, so it is about
as slow as a fast database. You should use APC's local cache instead.
APC also caches the compiled bytecode of your PHP scripts.

If you want to go with tidy and work around the PHP issues, you could
optimize the individual HTML parts before gluing everything together.
Maybe Google PageSpeed is worth a look for you too?
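
A rough sketch of what I mean by optimizing the parts separately, using a
plain regex pass instead of tidy. The helper name minify_static_html() and
the sample fragments are made up for illustration, and it assumes the
static parts contain no <pre>, <textarea> or inline scripts where
whitespace matters. Because the PHP tags are never passed through the
minifier, nothing can escape them:

```php
<?php
// Hypothetical helper: minify a fragment that contains only static HTML.
function minify_static_html(string $html): string
{
    // Drop HTML comments, but keep IE conditional comments (<!--[if ...]).
    $html = preg_replace('/<!--(?!\[if\b).*?-->/s', '', $html);
    // Collapse every run of whitespace into a single space.
    return preg_replace('/\s+/', ' ', $html);
}

// Static pieces are minified one by one; the PHP tag in the middle is
// glued in untouched.
$parts = [
    minify_static_html("<link rel=\"stylesheet\"   type=\"text/css\"\n      href=\"/templates/"),
    '<?= $layout_id ?>',
    minify_static_html("/css/styles.css\" />\n  <!-- main stylesheet -->\n"),
];

$template = implode('', $parts);
echo $template, "\n";
```

The glued result is what you would then store in the cache.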

With the logged-in flag, you can save two versions of your rendered page,
one for logged-in users and one for users who are not logged in. That
saves you PHP code in your template, and you can use tidy. For any other
variables, you can load the dynamic data after the page load.
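
Roughly like this, as a sketch only. ArrayCache is a made-up stand-in for
APC or memcache so the example runs anywhere, and render_page() and the
key format are assumptions of mine, not anything from your code:

```php
<?php
// Hypothetical in-memory stand-in for the real cache backend.
final class ArrayCache
{
    private array $items = [];

    public function get(string $key): ?string
    {
        return $this->items[$key] ?? null;
    }

    public function set(string $key, string $value): void
    {
        $this->items[$key] = $value;
    }
}

// Hypothetical renderer: the expensive template step, run once per variant.
function render_page(string $page, bool $loggedIn): string
{
    $nav = $loggedIn
        ? '<a href="/logout">Log out</a>'
        : '<a href="/login">Log in</a>';

    return "<html><body><h1>$page</h1>$nav</body></html>";
}

// Fetch the fully rendered variant, rendering it only on a cache miss.
function cached_page(ArrayCache $cache, string $page, bool $loggedIn): string
{
    // The logged-in flag is part of the key, so there are exactly two
    // cached copies per page instead of one per user.
    $key = 'page:' . $page . ':' . ($loggedIn ? 'in' : 'out');

    $html = $cache->get($key);
    if ($html === null) {
        $html = render_page($page, $loggedIn);
        $cache->set($key, $html);
    }

    return $html;
}

$cache  = new ArrayCache();
$guest  = cached_page($cache, 'home', false);
$member = cached_page($cache, 'home', true);
```

Both variants are finished HTML with no PHP left in them, so they can go
through tidy or any other minifier before being stored.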

With tidy, have you tried these options?
http://tidy.sourceforge.net/docs/quickref.html#preserve-entities
http://tidy.sourceforge.net/docs/quickref.html#fix-uri

Regards,
Marco

On 03.05.13 19:40, Daevid Vincent wrote:
> Well we get about 30,000 page hits PER SECOND.
>
> So we have a template engine that generates a page using PHP/MySQL and populates it as everyone else does with the generic content. 
> Then we store THAT rendered page in a cache (memcache pool as well as a local copy on each server). 
> HOWEVER, there are of course dynamic parts of the page that can't be cached or we'd be making a cached page for every unique user. So things like their <?= $username ?>, or maybe parts of the page change based upon their membership <?php if ($loggedin == true) { ?>, or maybe parts of the page rotate different content (modules if you like).
>
> Therefore we are trying to minimize/compress the cached pages that need to be served by removing all <!-- --> and /* */ and // and whitespace and other stuff. When you have this much data to serve that fast, those few characters here and there add up quickly in bandwidth and space, as well as in render time for both Apache and the client browser's parser.
>
> Dig?
>
>> -----Original Message-----
>> From: marco@xxxxxxxxxx [mailto:marco@xxxxxxxxxx]
>> Sent: Friday, May 03, 2013 4:28 AM
>> To: Daevid Vincent; 'php-general General'
>> Subject: RE:  Need a tool to minimize HTML before storing in memecache
>>
>> But why are you caching uncompiled php code?
>>
>>> Daevid Vincent <daevid@xxxxxxxxxx> wrote on May 2, 2013 at 23:21:
>>>
>>> While that may be true for most users, I see no reason that it should limit or
>>> force me to a certain use case given that dynamic pages make up the vast
>>> majority of web pages served.
>>>
>>> Secondly, there are 8 billion options in Tidy to configure it, I would be
>>> astonished if they were so short-sighted to not have one to disable converting
>>> < and > to &lt; and &gt; as they do for all sorts of other things like quotes,
>>> ampersands, etc. I just don't know which flag this falls under or what
>>> combination of flags I'm setting that is causing this to happen.
>>>
>>> Barring that little snag, it works like a champ.
>>>
>>>> -----Original Message-----
>>>> From: marco@xxxxxxxxxx [mailto:marco@xxxxxxxxxx]
>>>> Sent: Thursday, May 02, 2013 4:55 AM
>>>> To: Daevid Vincent; 'php-general General'
>>>> Subject: RE:  Need a tool to minimize HTML before storing in memecache
>>>>
>>>> This is because tidy is for optimizing HTML, not for optimizing PHP.
>>>>
>>>>> Daevid Vincent <daevid@xxxxxxxxxx> wrote on May 2, 2013 at 02:20:
>>>>>
>>>>> So I took the time to install Tidy extension and wedge it into my code. Now
>>>>> there is one thing that is killing me and breaking all my pages.
>>>>>
>>>>> This is what I WANT the result to be:
>>>>>
>>>>>                 <link rel="stylesheet" type="text/css" href="/templates/<?= $layout_id ?>/css/styles.css" />
>>>>>                 <link rel="stylesheet" type="text/css" href="/templates/<?= $layout_id ?>/css/retina.css" media="only screen and (-webkit-min-device-pixel-ratio: 2)" />
>>>>>
>>>>> Which then 'renders' out to this normally without Tidy:
>>>>>
>>>>>                 <link rel="stylesheet" type="text/css" href="/templates/2/css/styles.css" />
>>>>>                 <link rel="stylesheet" type="text/css" href="/templates/2/css/retina.css" media="only screen and (-webkit-min-device-pixel-ratio: 2)" />
>>>>>
>>>>> This is what Tidy does:
>>>>>
>>>>>                 <link rel="stylesheet" type="text/css" href="/templates/%3C?=%20$layout_id%20?%3E/css/styles.css">
>>>>>                 <link rel="stylesheet" type="text/css" href="/templates/%3C?=%20$layout_id%20?%3E/css/retina.css" media="only screen and (-webkit-min-device-pixel-ratio: 2)">
>>>>>
>>>>> I found ['fix-uri' => false] which gets closer:
>>>>>
>>>>>                 <link rel="stylesheet" type="text/css" href="/templates/&lt;?= $layout_id ?&gt;/css/styles.css">
>>>>>                 <link rel="stylesheet" type="text/css" href="/templates/&lt;?= $layout_id ?&gt;/css/retina.css" media="only screen and (-webkit-min-device-pixel-ratio: 2)">
>>>>>
>>>>> I've tried about every option I can think of. What is the solution to make
>>>>> it stop trying to be smarter than me and converting my < and > tags??
>>>>>
>>>>> // See all parameters available here: http://tidy.sourceforge.net/docs/quickref.html
>>>>> $tconfig = array(
>>>>>        //'clean' => true,
>>>>>        'hide-comments' => true,
>>>>>        'hide-endtags' => true,
>>>>>        'drop-proprietary-attributes' => true,
>>>>>        //'join-classes' => true,
>>>>>        //'join-styles' => true,
>>>>>        //'quote-marks' => true,
>>>>>        'fix-uri' => false,
>>>>>        'numeric-entities' => true,
>>>>>        'preserve-entities' => true,
>>>>>        'doctype' => 'omit',
>>>>>        'tab-size' => 1,
>>>>>        'wrap' => 0,
>>>>>        'wrap-php' => false,
>>>>>        'char-encoding' => 'raw',
>>>>>        'input-encoding' => 'raw',
>>>>>        'output-encoding' => 'raw',
>>>>>        'newline' => 'LF',
>>>>>        'tidy-mark' => false,
>>>>>        'quiet' => true,
>>>>>        'show-errors' => ($this->_debug ? 6 : 0),
>>>>>        'show-warnings' => $this->_debug,
>>>>> );
>>>>>
>>>>>
>>>>> From: Joseph Moniz [mailto:joseph.moniz@xxxxxxxxx]
>>>>> Sent: Wednesday, April 17, 2013 2:55 PM
>>>>> To: Daevid Vincent
>>>>> Cc: php-general General
>>>>> Subject: Re:  Need a tool to minimize HTML before storing in memecache
>>>>>
>>>>> http://php.net/manual/en/book.tidy.php
>>>>>
>>>>>
>>>>> - Joseph Moniz
>>>>> (510) 509-0775 | @josephmoniz <https://twitter.com/josephmoniz>  |
>>>>> <https://github.com/JosephMoniz> GitHub |
>>>>> <http://www.linkedin.com/pub/joseph-moniz/13/949/b54/> LinkedIn | Blog
>>>>> <http://josephmoniz.github.io/>  | CoderWall
>>>>> <https://coderwall.com/josephmoniz>
>>>>>
>>>>> "Wake up early, Stay up late, Change the world"
>>>>>
>>>>> On Wed, Apr 17, 2013 at 2:52 PM, Daevid Vincent <daevid@xxxxxxxxxx> wrote:
>>>>>
>>>>> We do a lot with caching and storing in memcached as well as local copies
>>>>> so as to not hit the cache pool over the network and we have found some
>>>>> great tools to minimize our JavaScript and our CSS, and now we'd like to
>>>>> compress our HTML in these cache slabs.
>>>>>
>>>>>
>>>>>
>>>>> Anyone know of a good tool or even regex magic that I can call from PHP to
>>>>> compress/minimize the giant string web page before I store it in the cache?
>>>>>
>>>>>
>>>>> It's not quite as simple as stripping white space b/c obviously there are
>>>>> spaces between attributes in tags that need to be preserved, but also in the
>>>>> words/text on the page. I could strip out newlines I suppose, but then do I
>>>>> run into any issues in other ways? In any event, it seems like someone would
>>>>> have solved this by now before I go re-inventing the wheel.
>>>>>
>>>>>
>>>>>
>>>>> d.
>>>>>
>>>> --
>>>> Marco Behnke
>>>> Dipl. Informatiker (FH), SAE Audio Engineer Diploma
>>>> Zend Certified Engineer PHP 5.3
>>>>
>>>> Tel.: 0174 / 9722336
>>>> e-Mail: marco@xxxxxxxxxx
>>>>
>>>> Softwaretechnik Behnke
>>>> Heinrich-Heine-Str. 7D
>>>> 21218 Seevetal
>>>>
>>>> http://www.behnke.biz
>


-- 
Marco Behnke
Dipl. Informatiker (FH), SAE Audio Engineer Diploma
Zend Certified Engineer PHP 5.3

Tel.: 0174 / 9722336
e-Mail: marco@xxxxxxxxxx

Softwaretechnik Behnke
Heinrich-Heine-Str. 7D
21218 Seevetal

http://www.behnke.biz


