Re: [RFC PATCHv2 04/10] gitweb: Use Cache::Cache compatibile (get, set) output caching

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 10 Feb 2010, Petr Baudis wrote:
> On Wed, Feb 10, 2010 at 02:12:24AM +0100, Jakub Narebski wrote:
> > On Tue, 9 Feb 2010 at 11:30 +0100, Jakub Narebski wrote:
> > 
> > > The cache_fetch subroutine captures output (from STDOUT only, as
> > > STDERR is usually logged) using either ->push_layer()/->pop_layer()
> > > from PerlIO::Util submodule (if it is available), or by setting and
> > > restoring *STDOUT.  Note that only the former could be tested reliably
> > > to be reliable in t9503 test!
> > 
> > Scratch that, I have just checked that (at least for Apache + mod_cgi,
> > but I don't think that it matters) the latter solution, with setting
> > and restoring *STDOUT doesn't work: I would get data in cache (so it
> > can be restored later), but instead of output I would get Internal Server
> > Error ("The server encountered an internal error or misconfiguration and
> > was unable to complete your request.") without even a hint what the
> > problem was.  Sprinkling "die ...: $!" didn't help to catch this error:
> > I suspect that the problem is with capturing.
> > 
> > So we either would have to live with non-core PerlIO::Util or (pure Perl)
> > Capture::Tiny, or do the 'print -> print $out' patch...
> 
> All the magic methods seem to be troublesome, but in that case I'd
> really prefer a level of indirection instead of filehandle - as is,
> 'print (...) -> output (...)' ins. of 'print (...) -> print $out (...)'
> (or whatever). That should be really flexible and completely
> futureproof, and I don't think the level of indirection would incur any
> measurable overhead, would it?

First, it is not only 'print (...) -> print $out (...)'; you need to
do all those:

   print  <sth>            ->  print  $out <sth>
   printf <sth>            ->  printf $out <sth>
   binmode STDOUT, <mode>  ->  binmode $out, <mode>

Second, using "tie" on filehandle (on *STDOUT) can be used also for 
just capturing output, not only for "tee"-ing; what's more to print
while capturing one has to do extra work.  It is quite similar to
replacing 'print (...)' with 'output (...)' etc., but using
tie/untie doesn't require large patch to gitweb.

Third, as you can see below tie-ing is about 1% slower than using
'output (...)', which in turn is less than 10% slower than explicit
filehandle solution i.e. 'print $out (...)'... and is almost twice
slower than solution using PerlIO::Util

Benchmark: timing 50000 iterations of output, perlio, print \$out, tie *STDOUT...
     output: 1.81462 wallclock secs ( 1.77 usr +  0.00 sys =  1.77 CPU) @ 28248.59/s (n=50000)
     perlio: 1.05585 wallclock secs ( 1.03 usr +  0.00 sys =  1.03 CPU) @ 48543.69/s (n=50000)
print \$out: 1.70027 wallclock secs ( 1.66 usr +  0.00 sys =  1.66 CPU) @ 30120.48/s (n=50000)
tie *STDOUT: 1.82248 wallclock secs ( 1.79 usr +  0.00 sys =  1.79 CPU) @ 27932.96/s (n=50000)
               Rate tie *STDOUT      output print \$out      perlio
tie *STDOUT 27933/s          --         -1%         -7%        -42%
output      28249/s          1%          --         -6%        -42%
print \$out 30120/s          8%          7%          --        -38%
perlio      48544/s         74%         72%         61%          --

Benchmark: running output, perlio, print \$out, tie *STDOUT for at least 10 CPU seconds...
     output: 10.7199 wallclock secs (10.53 usr +  0.00 sys = 10.53 CPU) @ 28029.63/s (n=295152)
     perlio: 11.2884 wallclock secs (10.46 usr +  0.00 sys = 10.46 CPU) @ 49967.11/s (n=522656)
print \$out: 10.5978 wallclock secs (10.43 usr +  0.00 sys = 10.43 CPU) @ 30318.79/s (n=316225)
tie *STDOUT: 11.3525 wallclock secs (10.68 usr +  0.00 sys = 10.68 CPU) @ 27635.96/s (n=295152)
               Rate tie *STDOUT      output print \$out      perlio
tie *STDOUT 27636/s          --         -1%         -9%        -45%
output      28030/s          1%          --         -8%        -44%
print \$out 30319/s         10%          8%          --        -39%
perlio      49967/s         81%         78%         65%          --
need

Attached there is script that was used to produce those results.
-- 
Jakub Narebski
Poland
#!/usr/bin/perl

use strict;
use warnings;

use Test::More;
use Time::HiRes;
use Benchmark qw(:all :hireswallclock);
use Data::Dumper;

use PerlIO::Util;

my $chunk = "test\n";

my $lorem = <<'EOF';
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad
minim veniam, quis nostrud exercitation ullamco laboris nisi ut
aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.

Sed ut perspiciatis unde omnis iste natus error sit voluptatem
accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae
ab illo inventore veritatis et quasi architecto beatae vitae dicta
sunt explicabo. Nemo enim ipsam voluptatem quia voluptas sit
aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos
qui ratione voluptatem sequi nesciunt. Neque porro quisquam est, qui
dolorem ipsum quia dolor sit amet, consectetur, adipisci velit, sed
quia non numquam eius modi tempora incidunt ut labore et dolore magnam
aliquam quaerat voluptatem. Ut enim ad minima veniam, quis nostrum
exercitationem ullam corporis suscipit laboriosam, nisi ut aliquid ex
ea commodi consequatur? Quis autem vel eum iure reprehenderit qui in
ea voluptate velit esse quam nihil molestiae consequatur, vel illum
qui dolorem eum fugiat quo voluptas nulla pariatur?
EOF

sub capture_perlio {
	my $data = '';

	*STDOUT->push_layer(scalar => \$data);

	print $chunk;

	*STDOUT->pop_layer();

	return $data;
}

my $out = \*STDOUT;

sub capture_filehandle {
	my $data = '';

	open my $data_fh, '>', \$data;
	$out = $data_fh;

	print $out $chunk;

	$out = \*STDOUT;
	close $data_fh;

	return $data;
}

sub output {
	print $out @_;
}

sub capture_indirection {
	my $data = '';

	open my $data_fh, '>', \$data;
	$out = $data_fh;

	output $chunk;

	$out = \*STDOUT;
	close $data_fh;

	return $data;
}

sub capture_tied {
	my $data = '';

	tie *STDOUT, 'CatchSTDOUT', \$data;

	print $chunk;

	untie *STDOUT;

	return $data;
}

{
	package CatchSTDOUT;

	use strict;
	use warnings;
	use Data::Dumper;

	sub TIEHANDLE {
		my ($proto, $dataref) = @_;
		my $class = ref($proto) || $proto;

		my $self = {};
		$self = bless($self, $class);
		$self->{'scalar'} = $dataref;
		return $self;
	}

	sub PRINT {
		my $self = shift;
		${$self->{'scalar'}} .= join('',@_);
	}
}

# ----------------------------------------------------------------------

print 'capture_perlio      = '.capture_perlio();
print 'capture_filehandle  = '.capture_filehandle();
print 'capture_indirection = '.capture_indirection();
print 'capture_tied        = '.capture_tied();

$chunk = "another test\n";
print 'capture_perlio      = '.capture_perlio();
print 'capture_filehandle  = '.capture_filehandle();
print 'capture_indirection = '.capture_indirection();
print 'capture_tied        = '.capture_tied();

print "\n";


$chunk = $lorem;

my $result = timethese(50_000, {
	'perlio'      => \&capture_perlio,
	'print \$out' => \&capture_filehandle,
	'output'      => \&capture_indirection,
	'tie *STDOUT' => \&capture_indirection,
});
cmpthese($result);
print "\n";

$result = timethese(-10, {
	'perlio'      => \&capture_perlio,
	'print \$out' => \&capture_filehandle,
	'output'      => \&capture_indirection,
	'tie *STDOUT' => \&capture_indirection,
});
cmpthese($result);
print "\n";

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]