RE: Need a tool to minimize HTML before storing in memcache

> -----Original Message-----
> From: Marco Behnke [mailto:marco@xxxxxxxxxx]
> Sent: Friday, May 03, 2013 12:01 PM
> To: Daevid Vincent; php >> "php-general@xxxxxxxxxxxxx"
> Subject: Re:  Need a tool to minimize HTML before storing in memecache
> 
> If you really have that much traffic, then memcache isn't your answer to
> caching. It is as slow as a fast database.

That's not entirely true. 

> You should use APC caching instead. APC will also handle a lot of
> bytecode caching.

We have both.
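
For readers following the thread, the layered lookup described further down (a local copy on each server in front of the shared memcache pool) can be sketched roughly as below. This is only an illustration under stated assumptions: plain PHP arrays stand in for the local store and the pool, and fetch_page() and the $render callback are hypothetical names, not anything from the real setup.

```php
<?php
// Sketch of a two-tier cache lookup. $local and $pool are plain arrays
// standing in for a per-server cache and a shared memcache pool (assumption).
function fetch_page(string $key, array &$local, array &$pool, callable $render): string
{
    // 1. Cheapest: the copy held on this server.
    if (isset($local[$key])) {
        return $local[$key];
    }
    // 2. Next: the shared pool; keep a local copy for the next request.
    if (isset($pool[$key])) {
        $local[$key] = $pool[$key];
        return $local[$key];
    }
    // 3. Miss everywhere: render once and fill both tiers.
    $page = $render($key);
    $pool[$key] = $page;
    $local[$key] = $page;
    return $page;
}
```

In a real deployment the two arrays would be replaced by calls into whatever cache clients are in use, with expiry handled there.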

> If you want to go with tidy and surf around the php issues you could
> optimize the single html parts, before glueing everything together.

That would require much more work than simply getting < and > to work. And honestly, I've been hacking around Tidy with regexes so much at this point to minify the output that I'm wondering whether Tidy is even worth the bother anymore. I'm not sure what else it would give me.
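
To give an idea of what a regex-only pass looks like, here is a deliberately naive sketch. The function name minify_html_naive() is made up for illustration, and the code assumes the page has no <pre>, <textarea> or inline <script> blocks whose whitespace matters; it is not a replacement for a real HTML parser.

```php
<?php
// Naive regex minifier (illustrative sketch, hypothetical function name).
function minify_html_naive(string $html): string
{
    // Drop HTML comments, but leave IE conditional comments (<!--[if ...]>) alone.
    $html = preg_replace('/<!--(?!\[if\b).*?-->/s', '', $html);
    // Collapse whitespace between two tags to a single space.
    // Whitespace inside text nodes is left untouched.
    $html = preg_replace('/>\s+</', '> <', $html);
    return trim($html);
}

$page = "<div>\n  <!-- note -->\n  <p>Hello   world</p>\n</div>\n";
$min  = minify_html_naive($page);
// $min is now: <div> <p>Hello   world</p> </div>
```

Collapsing to a single space rather than to nothing is the cautious choice here, since removing the gap between inline elements can change how the page renders.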

> Maybe google page speed is worth a look for you too?

We have over 1,000 servers in house and also distributed across nodes in various cities and countries. 

> With the loggedin flag, you can save two versions of your rendered, one
> for loggedin users and for not logged in users. That saves you php code
> in your template and you can use tidy. And for any other variables you
> can load the dynamic data after the page load.

I gave simplistic examples for the sake of illustration.

> With tidy, have you tried
> http://tidy.sourceforge.net/docs/quickref.html#preserve-entities
> http://tidy.sourceforge.net/docs/quickref.html#fix-uri

Yes. See below. I posted all the flags I have tried and I too thought those were the key, but sadly not.
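
One workaround that does not depend on any Tidy flag is to swap the PHP tags for harmless placeholder tokens before cleaning and swap them back afterwards. The sketch below shows the idea; protect_php_tags(), restore_php_tags() and the __PHPTAGn__ token format are hypothetical names chosen for this example, and the assumption is that the cleaner leaves plain alphanumeric/underscore tokens untouched. The pattern is simplistic: it would also catch <?xml ... ?> declarations and would be fooled by a literal ?> inside a PHP string.

```php
<?php
// Replace each PHP tag with a token and remember the original (sketch only).
function protect_php_tags(string $html, array &$map): string
{
    $map = [];
    return preg_replace_callback(
        '/<\?(?:php\b|=)?.*?\?>/s',
        function (array $m) use (&$map): string {
            $token = '__PHPTAG' . count($map) . '__';
            $map[$token] = $m[0];
            return $token;
        },
        $html
    );
}

// Put the original PHP tags back in place of the tokens.
function restore_php_tags(string $html, array $map): string
{
    return strtr($html, $map);
}

$template = '<link rel="stylesheet" type="text/css" href="/templates/<?= $layout_id ?>/css/styles.css" />';
$safe = protect_php_tags($template, $map);
// $safe: <link rel="stylesheet" type="text/css" href="/templates/__PHPTAG0__/css/styles.css" />
// With the tidy extension installed, the cleaning step would go here, e.g.:
// $safe = tidy_repair_string($safe, $tconfig);
$out = restore_php_tags($safe, $map);
```

Because the tokens contain nothing that needs escaping in a URI or an attribute value, neither URI fixing nor entity conversion should have anything to rewrite.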

> Regards,
> Marco
> 
> Am 03.05.13 19:40, schrieb Daevid Vincent:
> > Well we get about 30,000 page hits PER SECOND.
> >
> > So we have a template engine that generates a page using PHP/MySQL and
> populates it as everyone else does with the generic content.
> > Then we store THAT rendered page in a cache (memcache pool as well as a
> local copy on each server).
> > HOWEVER, there are of course dynamic parts of the page that can't be
> cached or we'd be making a cached page for every unique user. So things like
> their <?= $username ?>, or maybe parts of the page change based on their
> membership <?php if ($loggedin == true) { ?>, or maybe parts of the page
> rotate different content (modules if you like).
> >
> > Therefore we are trying to minimize/compress the cached pages that need
> to be served by removing all <!-- --> and /* */ and // and whitespace and
> other stuff. When you have this much data to serve that fast, those few
> characters here and there add up quickly in bandwidth and space. As well as
> render time for both apache and the client's browser's parser.
> >
> > Dig?
> >
> >> -----Original Message-----
> >> From: marco@xxxxxxxxxx [mailto:marco@xxxxxxxxxx]
> >> Sent: Friday, May 03, 2013 4:28 AM
> >> To: Daevid Vincent; 'php-general General'
> >> Subject: RE:  Need a tool to minimize HTML before storing in
> memecache
> >>
> >> But why are you caching uncompiled php code?
> >>
> >>> Daevid Vincent <daevid@xxxxxxxxxx> hat am 2. Mai 2013 um 23:21
> >> geschrieben:
> >>>
> >>> While that may be true for most users, I see no reason that it should
> >> limit or
> >>> force me to a certain use case given that dynamic pages make up the vast
> >>> majority of web pages served.
> >>>
> >>> Secondly, there are 8 billion options in Tidy to configure it, I would
> be
> >>> astonished if they were so short-sighted to not have one to disable
> >> converting
> >>> < and > to &lt; and &gt; as they do for all sorts of other things like
> >> quotes,
> >>> ampersands, etc. I just don't know which flag this falls under or what
> >>> combination of flags I'm setting that is causing this to happen.
> >>>
> >>> Barring that little snag, it works like a champ.
> >>>
> >>>> -----Original Message-----
> >>>> From: marco@xxxxxxxxxx [mailto:marco@xxxxxxxxxx]
> >>>> Sent: Thursday, May 02, 2013 4:55 AM
> >>>> To: Daevid Vincent; 'php-general General'
> >>>> Subject: RE:  Need a tool to minimize HTML before storing in
> >> memecache
> >>>> This is because tidy is for optimizing HTML, not for optimizing PHP.
> >>>>
> >>>>> Daevid Vincent <daevid@xxxxxxxxxx> hat am 2. Mai 2013 um 02:20
> >>>> geschrieben:
> >>>>>
> >>>>> So I took the time to install Tidy extension and wedge it into my
> >> code.
> >>>> Now
> >>>>> there is one thing that is killing me and breaking all my pages.
> >>>>>
> >>>>> This is what I WANT the result to be:
> >>>>>
> >>>>>                 <link rel="stylesheet" type="text/css"
> >>>> href="/templates/<?=
> >>>>> $layout_id ?>/css/styles.css" />
> >>>>>                 <link rel="stylesheet" type="text/css"
> >>>> href="/templates/<?=
> >>>>> $layout_id ?>/css/retina.css" media="only screen and
> >>>>> (-webkit-min-device-pixel-ratio: 2)" />
> >>>>>
> >>>>> Which then 'renders' out to this normally without Tidy:
> >>>>>
> >>>>>                 <link rel="stylesheet" type="text/css"
> >>>>> href="/templates/2/css/styles.css" />
> >>>>>                 <link rel="stylesheet" type="text/css"
> >>>>> href="/templates/2/css/retina.css" media="only screen and
> >>>>> (-webkit-min-device-pixel-ratio: 2)" />
> >>>>>
> >>>>> This is what Tidy does:
> >>>>>
> >>>>>                 <link rel="stylesheet" type="text/css"
> >>>>> href="/templates/%3C?=%20$layout_id%20?%3E/css/styles.css">
> >>>>>                 <link rel="stylesheet" type="text/css"
> >>>>> href="/templates/%3C?=%20$layout_id%20?%3E/css/retina.css" media="only
> >>>>> screen and (-webkit-min-device-pixel-ratio: 2)">
> >>>>>
> >>>>> I found ['fix-uri' => false] which gets closer:
> >>>>>
> >>>>>                 <link rel="stylesheet" type="text/css"
> >>>>> href="/templates/&lt;?= $layout_id ?&gt;/css/styles.css">
> >>>>>                 <link rel="stylesheet" type="text/css"
> >>>>> href="/templates/&lt;?= $layout_id ?&gt;/css/retina.css" media="only
> >>>> screen
> >>>>> and (-webkit-min-device-pixel-ratio: 2)">
> >>>>>
> >>>>> I've tried about every option I can think of. What is the solution to
> >> make
> >>>>> it stop trying to be smarter than me and converting my < and > tags??
> >>>>>
> >>>>> //See all parameters available here:
> >>>>> http://tidy.sourceforge.net/docs/quickref.html
> >>>>> $tconfig = array(
> >>>>>        //'clean' => true,
> >>>>>        'hide-comments' => true,
> >>>>>        'hide-endtags' => true,
> >>>>>        'drop-proprietary-attributes' => true,
> >>>>>        //'join-classes' => true,
> >>>>>        //'join-styles' => true,
> >>>>>        //'quote-marks' => true,
> >>>>>        'fix-uri' => false,
> >>>>>        'numeric-entities' => true,
> >>>>>        'preserve-entities' => true,
> >>>>>        'doctype' => 'omit',
> >>>>>        'tab-size' => 1,
> >>>>>        'wrap' => 0,
> >>>>>        'wrap-php' => false,
> >>>>>        'char-encoding' => 'raw',
> >>>>>        'input-encoding' => 'raw',
> >>>>>        'output-encoding' => 'raw',
> >>>>>        'newline' => 'LF',
> >>>>>        'tidy-mark' => false,
> >>>>>        'quiet' => true,
> >>>>>        'show-errors' => ($this->_debug ? 6 : 0),
> >>>>>        'show-warnings' => $this->_debug,
> >>>>> );
> >>>>>
> >>>>>
> >>>>> From: Joseph Moniz [mailto:joseph.moniz@xxxxxxxxx]
> >>>>> Sent: Wednesday, April 17, 2013 2:55 PM
> >>>>> To: Daevid Vincent
> >>>>> Cc: php-general General
> >>>>> Subject: Re:  Need a tool to minimize HTML before storing in
> >>>> memecache
> >>>>> http://php.net/manual/en/book.tidy.php
> >>>>>
> >>>>>
> >>>>> - Joseph Moniz
> >>>>> (510) 509-0775 | @josephmoniz <https://twitter.com/josephmoniz>  |
> >>>>> <https://github.com/JosephMoniz> GitHub |
> >>>>> <http://www.linkedin.com/pub/joseph-moniz/13/949/b54/> LinkedIn | Blog
> >>>>> <http://josephmoniz.github.io/>  | CoderWall
> >>>>> <https://coderwall.com/josephmoniz>
> >>>>>
> >>>>> "Wake up early, Stay up late, Change the world"
> >>>>>
> >>>>> On Wed, Apr 17, 2013 at 2:52 PM, Daevid Vincent <daevid@xxxxxxxxxx>
> >> wrote:
> >>>>> We do a lot with caching and storing in memcached as well as local
> >> copies
> >>>>> so as to not hit the cache pool over the network and we have found
> >> some
> >>>>> great tools to minimize our javascript and our css, and now we'd like
> >> to
> >>>>> compress our HTML in these cache slabs.
> >>>>>
> >>>>>
> >>>>>
> >>>>> Anyone know of a good tool or even regex magic that I can call from
> >> PHP to
> >>>>> compress/minimize the giant string web page before I store it in the
> >>>> cache?
> >>>>>
> >>>>>
> >>>>> It's not quite as simple as stripping white space b/c obviously there
> >> are
> >>>>> spaces between attributes in tags that need to be preserved, but also
> >> in
> >>>> the
> >>>>> words/text on the page. I could strip out newlines I suppose, but then
> >> do
> >>>> I
> >>>>> run into any issues in other ways? In any event, it seems like someone
> >>>> would
> >>>>> have solved this by now before I go re-inventing the wheel.
> >>>>>
> >>>>>
> >>>>>
> >>>>> d.
> >>>>>
> >>>> --
> >>>> Marco Behnke
> >>>> Dipl. Informatiker (FH), SAE Audio Engineer Diploma
> >>>> Zend Certified Engineer PHP 5.3
> >>>>
> >>>> Tel.: 0174 / 9722336
> >>>> e-Mail: marco@xxxxxxxxxx
> >>>>
> >>>> Softwaretechnik Behnke
> >>>> Heinrich-Heine-Str. 7D
> >>>> 21218 Seevetal
> >>>>
> >>>> http://www.behnke.biz
> >
> 
> 
> 