On Wed, 10 Feb 2010, Petr Baudis wrote: > On Wed, Feb 10, 2010 at 02:12:24AM +0100, Jakub Narebski wrote: > > On Tue, 9 Feb 2010 at 11:30 +0100, Jakub Narebski wrote: > > > > > The cache_fetch subroutine captures output (from STDOUT only, as > > > STDERR is usually logged) using either ->push_layer()/->pop_layer() > > > from PerlIO::Util submodule (if it is available), or by setting and > > > restoring *STDOUT. Note that only the former could be tested reliably > > > to be reliable in t9503 test! > > > > Scratch that, I have just checked that (at least for Apache + mod_cgi, > > but I don't think that it matters) the latter solution, with setting > > and restoring *STDOUT doesn't work: I would get data in cache (so it > > can be restored later), but instead of output I would get Internal Server > > Error ("The server encountered an internal error or misconfiguration and > > was unable to complete your request.") without even a hint what the > > problem was. Sprinkling "die ...: $!" didn't help to catch this error: > > I suspect that the problem is with capturing. > > > > So we either would have to live with non-core PerlIO::Util or (pure Perl) > > Capture::Tiny, or do the 'print -> print $out' patch... > > All the magic methods seem to be troublesome, but in that case I'd > really prefer a level of indirection instead of filehandle - as is, > 'print (...) -> output (...)' ins. of 'print (...) -> print $out (...)' > (or whatever). That should be really flexible and completely > futureproof, and I don't think the level of indirection would incur any > measurable overhead, would it? First, it is not only 'print (...) -> print $out (...)'; you need to do all those: print <sth> -> print $out <sth> printf <sth> -> printf $out <sth> binmode STDOUT, <mode> -> binmode $out, <mode> Second, using "tie" on filehandle (on *STDOUT) can be used also for just capturing output, not only for "tee"-ing; what's more to print while capturing one has to do extra work. It is quite similar to replacing 'print (...)' with 'output (...)' etc., but using tie/untie doesn't require large patch to gitweb. Third, as you can see below tie-ing is about 1% slower than using 'output (...)', which in turn is less than 10% slower than explicit filehandle solution i.e. 'print $out (...)'... and is almost twice slower than solution using PerlIO::Util Benchmark: timing 50000 iterations of output, perlio, print \$out, tie *STDOUT... output: 1.81462 wallclock secs ( 1.77 usr + 0.00 sys = 1.77 CPU) @ 28248.59/s (n=50000) perlio: 1.05585 wallclock secs ( 1.03 usr + 0.00 sys = 1.03 CPU) @ 48543.69/s (n=50000) print \$out: 1.70027 wallclock secs ( 1.66 usr + 0.00 sys = 1.66 CPU) @ 30120.48/s (n=50000) tie *STDOUT: 1.82248 wallclock secs ( 1.79 usr + 0.00 sys = 1.79 CPU) @ 27932.96/s (n=50000) Rate tie *STDOUT output print \$out perlio tie *STDOUT 27933/s -- -1% -7% -42% output 28249/s 1% -- -6% -42% print \$out 30120/s 8% 7% -- -38% perlio 48544/s 74% 72% 61% -- Benchmark: running output, perlio, print \$out, tie *STDOUT for at least 10 CPU seconds... output: 10.7199 wallclock secs (10.53 usr + 0.00 sys = 10.53 CPU) @ 28029.63/s (n=295152) perlio: 11.2884 wallclock secs (10.46 usr + 0.00 sys = 10.46 CPU) @ 49967.11/s (n=522656) print \$out: 10.5978 wallclock secs (10.43 usr + 0.00 sys = 10.43 CPU) @ 30318.79/s (n=316225) tie *STDOUT: 11.3525 wallclock secs (10.68 usr + 0.00 sys = 10.68 CPU) @ 27635.96/s (n=295152) Rate tie *STDOUT output print \$out perlio tie *STDOUT 27636/s -- -1% -9% -45% output 28030/s 1% -- -8% -44% print \$out 30319/s 10% 8% -- -39% perlio 49967/s 81% 78% 65% -- need Attached there is script that was used to produce those results. -- Jakub Narebski Poland
#!/usr/bin/perl use strict; use warnings; use Test::More; use Time::HiRes; use Benchmark qw(:all :hireswallclock); use Data::Dumper; use PerlIO::Util; my $chunk = "test\n"; my $lorem = <<'EOF'; Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt explicabo. Nemo enim ipsam voluptatem quia voluptas sit aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos qui ratione voluptatem sequi nesciunt. Neque porro quisquam est, qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit, sed quia non numquam eius modi tempora incidunt ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim ad minima veniam, quis nostrum exercitationem ullam corporis suscipit laboriosam, nisi ut aliquid ex ea commodi consequatur? Quis autem vel eum iure reprehenderit qui in ea voluptate velit esse quam nihil molestiae consequatur, vel illum qui dolorem eum fugiat quo voluptas nulla pariatur? EOF sub capture_perlio { my $data = ''; *STDOUT->push_layer(scalar => \$data); print $chunk; *STDOUT->pop_layer(); return $data; } my $out = \*STDOUT; sub capture_filehandle { my $data = ''; open my $data_fh, '>', \$data; $out = $data_fh; print $out $chunk; $out = \*STDOUT; close $data_fh; return $data; } sub output { print $out @_; } sub capture_indirection { my $data = ''; open my $data_fh, '>', \$data; $out = $data_fh; output $chunk; $out = \*STDOUT; close $data_fh; return $data; } sub capture_tied { my $data = ''; tie *STDOUT, 'CatchSTDOUT', \$data; print $chunk; untie *STDOUT; return $data; } { package CatchSTDOUT; use strict; use warnings; use Data::Dumper; sub TIEHANDLE { my ($proto, $dataref) = @_; my $class = ref($proto) || $proto; my $self = {}; $self = bless($self, $class); $self->{'scalar'} = $dataref; return $self; } sub PRINT { my $self = shift; ${$self->{'scalar'}} .= join('',@_); } } # ---------------------------------------------------------------------- print 'capture_perlio = '.capture_perlio(); print 'capture_filehandle = '.capture_filehandle(); print 'capture_indirection = '.capture_indirection(); print 'capture_tied = '.capture_tied(); $chunk = "another test\n"; print 'capture_perlio = '.capture_perlio(); print 'capture_filehandle = '.capture_filehandle(); print 'capture_indirection = '.capture_indirection(); print 'capture_tied = '.capture_tied(); print "\n"; $chunk = $lorem; my $result = timethese(50_000, { 'perlio' => \&capture_perlio, 'print \$out' => \&capture_filehandle, 'output' => \&capture_indirection, 'tie *STDOUT' => \&capture_indirection, }); cmpthese($result); print "\n"; $result = timethese(-10, { 'perlio' => \&capture_perlio, 'print \$out' => \&capture_filehandle, 'output' => \&capture_indirection, 'tie *STDOUT' => \&capture_indirection, }); cmpthese($result); print "\n";