Re: Need a tool to minimize HTML before storing in memecache

2013/5/3 Daevid Vincent <daevid@xxxxxxxxxx>

>
> > -----Original Message-----
> > From: Marco Behnke [mailto:marco@xxxxxxxxxx]
> > Sent: Friday, May 03, 2013 12:01 PM
> > To: Daevid Vincent; php >> "php-general@xxxxxxxxxxxxx"
> > Subject: Re:  Need a tool to minimize HTML before storing in memecache
> >
> > If you really have that much traffic, then memcache isn't your answer to
> > caching. It is as slow as a fast database.
>
> That's not entirely true.
>
> > You should use APC caching instead. APC will also handle a lot of
> > bytecode caching.
>
> We have both.
>
> > If you want to go with tidy and surf around the php issues you could
> > optimize the single html parts, before glueing everything together.
>
> That would require much more work than simply getting < and > to work. And
> honestly I've been hacking around Tidy so much at this point with regex to
> minify the output, that I'm even wondering if Tidy is worth the bother
> anymore. Not sure what else it will give me.
>
> > Maybe google page speed is worth a look for you too?
>
> We have over 1,000 servers in house and also distributed across nodes in
> various cities and countries.
>

Really? Have you ever considered HTTP caching, or even load balancing with
ESI (Edge Side Includes)?
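
To make that concrete, here is a minimal sketch under assumptions: an
ESI-capable reverse proxy sits in front of PHP, and the fragment URL
/fragments/greeting and all function names are made up for illustration.
The shared page shell is cached by the proxy, and only the per-user fragment
reaches PHP on each request. Whether a given proxy honours the
Surrogate-Control header depends on the product and its configuration.

```php
<?php
// Hypothetical sketch: a cacheable page shell with an ESI placeholder for the
// per-user part. Names and URLs are illustrative, not from your setup.

// Build the shared shell. Nothing user-specific is rendered here; the proxy
// replaces the esi:include element with the response of the fragment URL.
function render_shell($layoutId)
{
    return '<html><head>'
        . '<link rel="stylesheet" type="text/css" href="/templates/'
        . (int) $layoutId . '/css/styles.css">'
        . '</head><body>'
        . '<esi:include src="/fragments/greeting" />'
        . '<p>generic content</p>'
        . '</body></html>';
}

// Headers for the shell: shared caches may keep it for $ttlSeconds.
function shell_headers($ttlSeconds)
{
    return array(
        'Cache-Control: public, s-maxage=' . (int) $ttlSeconds,
        // Announces that the body contains ESI markup to be processed.
        'Surrogate-Control: content="ESI/1.0"',
    );
}

// Headers for the per-user fragment: never stored in a shared cache.
function fragment_headers()
{
    return array('Cache-Control: private, no-store');
}
```

In the script that serves the shell you would pass each entry of
shell_headers(300) to header() and then echo render_shell(2). The fragment
script does the same with fragment_headers() and prints only the greeting.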


>
> > With the loggedin flag, you can save two versions of your rendered page,
> > one for logged-in users and one for users who are not logged in. That saves
> > you php code in your template and you can use tidy. And for any other
> > variables you can load the dynamic data after the page load.
>
> I gave simplistic examples for the sake of illustration.
>
> > With tidy, have you tried
> > http://tidy.sourceforge.net/docs/quickref.html#preserve-entities
> > http://tidy.sourceforge.net/docs/quickref.html#fix-uri
>
> Yes. See below. I posted all the flags I have tried and I too thought
> those were the key, but sadly not.
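
If no flag does it, one workaround is to hide the PHP blocks from the HTML
tool altogether. The following is only a sketch under assumptions: the
function names and the PHPTAG token format are made up, the whitespace
collapse is a stand-in for whatever cleanup step you really run (Tidy or
otherwise), and the regex will misfire if a PHP block contains a literal
closing tag inside a string.

```php
<?php
// Hypothetical sketch: swap every PHP block for an opaque alphanumeric token,
// run the HTML-only cleanup, then swap the original blocks back in.

// Replace each PHP block (long form or short echo form) with a token and
// remember the original text in $map.
function protect_php_tags($template, array &$map)
{
    return preg_replace_callback(
        '/<\?(?:php|=).*?\?>/s',
        function ($m) use (&$map) {
            // Trailing X keeps PHPTAG1X distinct from PHPTAG10X.
            $token = 'PHPTAG' . count($map) . 'X';
            $map[$token] = $m[0];
            return $token;
        },
        $template
    );
}

// Put the original PHP blocks back in place of their tokens.
function restore_php_tags($html, array $map)
{
    return strtr($html, $map);
}

$map = array();
$src = '<link rel="stylesheet" type="text/css" '
     . 'href="/templates/<?= $layout_id ?>/css/styles.css" />';

// After this call the markup contains no < or > that belongs to PHP.
$safe = protect_php_tags($src, $map);

// Stand-in for the real cleanup step: collapse runs of whitespace.
$clean = preg_replace('/\s+/', ' ', $safe);

// The template is executable PHP again.
$out = restore_php_tags($clean, $map);
```

The tokens are plain letters and digits, so neither URI fixing nor entity
conversion has anything to rewrite in them.
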
>
> > Regards,
> > Marco
> >
> > On 03.05.13 19:40, Daevid Vincent wrote:
> > > Well we get about 30,000 page hits PER SECOND.
> > >
> > > So we have a template engine that generates a page using PHP/MySQL and
> > > populates it as everyone else does with the generic content.
> > > Then we store THAT rendered page in a cache (memcache pool as well as a
> > > local copy on each server).
> > > HOWEVER, there are of course dynamic parts of the page that can't be
> > > cached or we'd be making a cached page for every unique user. So things
> > > like their <?= $username ?>, or maybe parts of the page change based upon
> > > their membership <?php if ($loggedin == true) { ?>, or maybe parts of the
> > > page rotate different content (modules if you like).
> > >
> > > Therefore we are trying to minimize/compress the cached pages that need
> > > to be served by removing all <!-- --> and /* */ and // and whitespace and
> > > other stuff. When you have this much data to serve that fast, those few
> > > characters here and there add up quickly in bandwidth and space. As well
> > > as render time for both apache and the client's browser's parser.
> > >
> > > Dig?
> > >
> > >> -----Original Message-----
> > >> From: marco@xxxxxxxxxx [mailto:marco@xxxxxxxxxx]
> > >> Sent: Friday, May 03, 2013 4:28 AM
> > >> To: Daevid Vincent; 'php-general General'
> > >> Subject: RE:  Need a tool to minimize HTML before storing in memecache
> > >>
> > >> But why are you caching uncompiled php code?
> > >>
> > >>> Daevid Vincent <daevid@xxxxxxxxxx> wrote on 2 May 2013 at 23:21:
> > >>>
> > >>> While that may be true for most users, I see no reason that it should
> > >>> limit or force me to a certain use case given that dynamic pages make up
> > >>> the vast majority of web pages served.
> > >>>
> > >>> Secondly, there are 8 billion options in Tidy to configure it, I would be
> > >>> astonished if they were so short-sighted to not have one to disable
> > >>> converting < and > to &lt; and &gt; as they do for all sorts of other
> > >>> things like quotes, ampersands, etc. I just don't know which flag this
> > >>> falls under or what combination of flags I'm setting that is causing this
> > >>> to happen.
> > >>>
> > >>> Barring that little snag, it works like a champ.
> > >>>
> > >>>> -----Original Message-----
> > >>>> From: marco@xxxxxxxxxx [mailto:marco@xxxxxxxxxx]
> > >>>> Sent: Thursday, May 02, 2013 4:55 AM
> > >>>> To: Daevid Vincent; 'php-general General'
> > >>>> Subject: RE:  Need a tool to minimize HTML before storing in memecache
> > >>>>
> > >>>> This is because tidy is for optimizing HTML, not for optimizing PHP.
> > >>>>
> > >>>>> Daevid Vincent <daevid@xxxxxxxxxx> wrote on 2 May 2013 at 02:20:
> > >>>>>
> > >>>>> So I took the time to install Tidy extension and wedge it into my code.
> > >>>>> Now there is one thing that is killing me and breaking all my pages.
> > >>>>>
> > >>>>> This is what I WANT the result to be:
> > >>>>>
> > >>>>>                 <link rel="stylesheet" type="text/css"
> > >>>>> href="/templates/<?= $layout_id ?>/css/styles.css" />
> > >>>>>                 <link rel="stylesheet" type="text/css"
> > >>>>> href="/templates/<?= $layout_id ?>/css/retina.css" media="only screen and
> > >>>>> (-webkit-min-device-pixel-ratio: 2)" />
> > >>>>>
> > >>>>> Which then 'renders' out to this normally without Tidy:
> > >>>>>
> > >>>>>                 <link rel="stylesheet" type="text/css"
> > >>>>> href="/templates/2/css/styles.css" />
> > >>>>>                 <link rel="stylesheet" type="text/css"
> > >>>>> href="/templates/2/css/retina.css" media="only screen and
> > >>>>> (-webkit-min-device-pixel-ratio: 2)" />
> > >>>>>
> > >>>>> This is what Tidy does:
> > >>>>>
> > >>>>>                 <link rel="stylesheet" type="text/css"
> > >>>>> href="/templates/%3C?=%20$layout_id%20?%3E/css/styles.css">
> > >>>>>                 <link rel="stylesheet" type="text/css"
> > >>>>> href="/templates/%3C?=%20$layout_id%20?%3E/css/retina.css" media="only
> > >>>>> screen and (-webkit-min-device-pixel-ratio: 2)">
> > >>>>>
> > >>>>> I found ['fix-uri' => false] which gets closer:
> > >>>>>
> > >>>>>                 <link rel="stylesheet" type="text/css"
> > >>>>> href="/templates/&lt;?= $layout_id ?&gt;/css/styles.css">
> > >>>>>                 <link rel="stylesheet" type="text/css"
> > >>>>> href="/templates/&lt;?= $layout_id ?&gt;/css/retina.css" media="only
> > >>>>> screen and (-webkit-min-device-pixel-ratio: 2)">
> > >>>>>
> > >>>>> I've tried about every option I can think of. What is the solution to
> > >>>>> make it stop trying to be smarter than me and converting my < and >
> > >>>>> tags??
> > >>>>> //See all parameters available here:
> > >>>>> http://tidy.sourceforge.net/docs/quickref.html
> > >>>>> $tconfig = array(
> > >>>>>        //'clean' => true,
> > >>>>>        'hide-comments' => true,
> > >>>>>        'hide-endtags' => true,
> > >>>>>        'drop-proprietary-attributes' => true,
> > >>>>>        //'join-classes' => true,
> > >>>>>        //'join-styles' => true,
> > >>>>>        //'quote-marks' => true,
> > >>>>>        'fix-uri' => false,
> > >>>>>        'numeric-entities' => true,
> > >>>>>        'preserve-entities' => true,
> > >>>>>        'doctype' => 'omit',
> > >>>>>        'tab-size' => 1,
> > >>>>>        'wrap' => 0,
> > >>>>>        'wrap-php' => false,
> > >>>>>        'char-encoding' => 'raw',
> > >>>>>        'input-encoding' => 'raw',
> > >>>>>        'output-encoding' => 'raw',
> > >>>>>        'newline' => 'LF',
> > >>>>>        'tidy-mark' => false,
> > >>>>>        'quiet' => true,
> > >>>>>        'show-errors' => ($this->_debug ? 6 : 0),
> > >>>>>        'show-warnings' => $this->_debug,
> > >>>>> );
> > >>>>>
> > >>>>>
> > >>>>> From: Joseph Moniz [mailto:joseph.moniz@xxxxxxxxx]
> > >>>>> Sent: Wednesday, April 17, 2013 2:55 PM
> > >>>>> To: Daevid Vincent
> > >>>>> Cc: php-general General
> > >>>>> Subject: Re:  Need a tool to minimize HTML before storing in memecache
> > >>>>>
> > >>>>> http://php.net/manual/en/book.tidy.php
> > >>>>>
> > >>>>>
> > >>>>> - Joseph Moniz
> > >>>>> (510) 509-0775 | @josephmoniz <https://twitter.com/josephmoniz>  |
> > >>>>> <https://github.com/JosephMoniz> GitHub |
> > >>>>> <http://www.linkedin.com/pub/joseph-moniz/13/949/b54/> LinkedIn |
> Blog
> > >>>>> <http://josephmoniz.github.io/>  | CoderWall
> > >>>>> <https://coderwall.com/josephmoniz>
> > >>>>>
> > >>>>> "Wake up early, Stay up late, Change the world"
> > >>>>>
> > >>>>> On Wed, Apr 17, 2013 at 2:52 PM, Daevid Vincent <daevid@xxxxxxxxxx> wrote:
> > >>>>> We do a lot with caching and storing in memcached as well as local copies
> > >>>>> so as to not hit the cache pool over the network and we have found some
> > >>>>> great tools to minimize our javascript and our css, and now we'd like to
> > >>>>> compress our HTML in these cache slabs.
> > >>>>>
> > >>>>> Anyone know of a good tool or even regex magic that I can call from PHP to
> > >>>>> compress/minimize the giant string web page before I store it in the
> > >>>>> cache?
> > >>>>>
> > >>>>> It's not quite as simple as stripping white space b/c obviously there are
> > >>>>> spaces between attributes in tags that need to be preserved, but also in
> > >>>>> the words/text on the page. I could strip out newlines I suppose, but then
> > >>>>> do I run into any issues in other ways? In any event, it seems like
> > >>>>> someone would have solved this by now before I go re-inventing the wheel.
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> d.
> > >>>>>
> > >>>> --
> > >>>> Marco Behnke
> > >>>> Dipl. Informatiker (FH), SAE Audio Engineer Diploma
> > >>>> Zend Certified Engineer PHP 5.3
> > >>>>
> > >>>> Tel.: 0174 / 9722336
> > >>>> e-Mail: marco@xxxxxxxxxx
> > >>>>
> > >>>> Softwaretechnik Behnke
> > >>>> Heinrich-Heine-Str. 7D
> > >>>> 21218 Seevetal
> > >>>>
> > >>>> http://www.behnke.biz
> > >
> >
> >
>


-- 
github.com/KingCrunch
