[PATCH 11/24] gitweb/lib - capture output directly to cache entry file

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This commit makes it possible to capture output redirecting STDOUT to
cache entry file; this way we don't need to store whole output in
memory in order to cache it.

The ->capture() method in GitwebCache::Capture::Simple is extended so
it signature is now ->capture($code[, $to]).  This means that it
accepts optional $to parameter, which specifies either filehandle open
for writing, or a file name, which would be used to capture output
instead of in-memory file.  The ->capture_start() and ->capture_stop()
methods that are used by ->capture() also got updated.

GitwebCache::SimpleFileCache acquired new ->compute_fh($key, $code_fh)
method.  Instead of passing and returning data like ->compute(), it
passes and returns filehandles and filenames.  While $code parameter
in ->compute($key, $code) call is expected to generate and return data
to be cached, the $code_fh parameter in ->compute_fh($key, $code_fh)
call is expected to print generated data to filehandle provided as its
parameter.  Instead of returning data (either from cache or generated)
like ->compute(), the new ->compute_fh() method returns filehandle and
filename to read data from.

The cache_output subroutine in GitwebCache::CacheOutput now uses
->compute_fh method if cache supports it.  The contents of file that
is returned by ->compute_fh is dumped to STDOUT using File::Copy::copy.

Note that ->compute_fh interface is non-standard, and is not supported
by any caching module known to author.  Closest to it is ->handle($key)
method in the Cache[1] interface (proxied to ->entry($key)->handle()).

[1]: http://p3rl.org/Cache


Added tests for capturing to file in t9503, and of ->compute_fh method
in t9504.

Inspired-by-idea-by: John 'Warthog9' Hawley <warthog9@xxxxxxxxxx>
Signed-off-by: Jakub Narebski <jnareb@xxxxxxxxx>
---
This patch was inspired of behavior of cache and capturing output in
"Gitweb caching v7" series (or at least my v7.4 minimal rewrite).  It
was not present in previous version of this series... this means that
it might be less well thought out.


The major difference from the way it is done in "Gitweb caching v7" by
J.H. is that it tries to keep those three areas: caching, capturing
and integration with gitweb (caching captured output) separate.  As it
was said earlier, it allows to "unit test" those modules/packages
separately.  I think it also makes this solution easier to maintain.

As to why I didn't follow Cache interface, i.e. ->entry($key)->handle():
it would require to create at least two additional classes, a class for
cache entry to avoid race condition between checking availability and
getting ->handle(), and a IO::Handle / IO::File derivative which allows to
specify code ran when closing file - for locking, and for atomic write
(unlock and rename file to final destination).  That's why I went with
->compute_fh interface.

Note that dumping generated output or output from cache to STDOUT is done
using File::Copy: this too was inspired by code in "Gitweb caching v7" by
J.H.  (File::Copy was first released with perl 5.002, and we require for
gitweb at least 5.008).

 gitweb/lib/GitwebCache/CacheOutput.pm     |   17 ++++++++
 gitweb/lib/GitwebCache/Capture/Simple.pm  |   28 ++++++++++---
 gitweb/lib/GitwebCache/SimpleFileCache.pm |   59 +++++++++++++++++++++++++++++
 t/t9503/test_cache_interface.pl           |   32 +++++++++++++++
 t/t9504/test_capture_interface.pl         |   17 ++++++++
 5 files changed, 146 insertions(+), 7 deletions(-)

diff --git a/gitweb/lib/GitwebCache/CacheOutput.pm b/gitweb/lib/GitwebCache/CacheOutput.pm
index 4a96ac9..7aeb895 100644
--- a/gitweb/lib/GitwebCache/CacheOutput.pm
+++ b/gitweb/lib/GitwebCache/CacheOutput.pm
@@ -17,6 +17,8 @@ package GitwebCache::CacheOutput;
 use strict;
 use warnings;
 
+use File::Copy;
+
 use Exporter qw(import);
 our @EXPORT      = qw(cache_output capture_stop);
 our %EXPORT_TAGS = (all => [ @EXPORT ]);
@@ -38,6 +40,21 @@ sub cache_output {
 
 	$capture = setup_capture($capture);
 
+	if ($cache->can('compute_fh')) {
+		my ($fh, $filename) = $cache->compute_fh($key, sub {
+			my $fh = shift;
+			$capture->capture($code, $fh);
+		});
+
+		if (defined $fh) {
+			binmode $fh, ':raw';
+			binmode STDOUT, ':raw';
+			copy($fh, \*STDOUT);
+		}
+
+		return;
+	}
+
 	my $data;
 	if ($cache->can('compute')) {
 		$data = cache_output_compute($cache, $capture, $key, $code);
diff --git a/gitweb/lib/GitwebCache/Capture/Simple.pm b/gitweb/lib/GitwebCache/Capture/Simple.pm
index 3585e58..74dd240 100644
--- a/gitweb/lib/GitwebCache/Capture/Simple.pm
+++ b/gitweb/lib/GitwebCache/Capture/Simple.pm
@@ -18,6 +18,7 @@ use strict;
 use warnings;
 
 use PerlIO;
+use Symbol qw(qualify_to_ref);
 
 # Constructor
 sub new {
@@ -32,7 +33,7 @@ sub new {
 sub capture {
 	my ($self, $code) = @_;
 
-	$self->capture_start();
+	$self->capture_start(@_[2..$#_]); # pass rest of params
 	$code->();
 	return $self->capture_stop();
 }
@@ -42,6 +43,7 @@ sub capture {
 # Start capturing data (STDOUT)
 sub capture_start {
 	my $self = shift;
+	my $to = shift;
 
 	# save copy of real STDOUT via duplicating it
 	my @layers = PerlIO::get_layers(\*STDOUT);
@@ -51,11 +53,23 @@ sub capture_start {
 	# close STDOUT, so that it isn't used anymode (to have it fd0)
 	close STDOUT;
 
-	# reopen STDOUT as in-memory file
-	$self->{'data'} = '';
-	unless (open STDOUT, '>', \$self->{'data'}) {
-		open STDOUT, '>&', fileno($self->{'orig_stdout'});
-		die "Couldn't reopen STDOUT as in-memory file for capture: $!";
+	if (defined $to) {
+		$self->{'to'} = $to;
+		if (defined fileno(qualify_to_ref($to))) {
+			# if $to is filehandle, redirect
+			open STDOUT, '>&', fileno($to);
+		} elsif (! ref($to)) {
+			# if $to is name of file, open it
+			open STDOUT, '>', $to;
+		}
+
+	} else {
+		# reopen STDOUT as in-memory file
+		$self->{'data'} = '';
+		unless (open STDOUT, '>', \$self->{'data'}) {
+			open STDOUT, '>&', fileno($self->{'orig_stdout'});
+			die "Couldn't reopen STDOUT as in-memory file for capture: $!";
+		}
 	}
 	_relayer(\*STDOUT, \@layers);
 
@@ -76,7 +90,7 @@ sub capture_stop {
 	open STDOUT, '>&', fileno($self->{'orig_stdout'});
 	_relayer(\*STDOUT, \@layers);
 
-	return $self->{'data'};
+	return exists $self->{'to'} ? $self->{'to'} : $self->{'data'};
 }
 
 # taken from Capture::Tiny by David Golden, Apache License 2.0
diff --git a/gitweb/lib/GitwebCache/SimpleFileCache.pm b/gitweb/lib/GitwebCache/SimpleFileCache.pm
index 581a574..12af44f 100644
--- a/gitweb/lib/GitwebCache/SimpleFileCache.pm
+++ b/gitweb/lib/GitwebCache/SimpleFileCache.pm
@@ -438,6 +438,65 @@ sub compute {
 	return $data;
 }
 
+# ......................................................................
+# nonstandard interface methods
+
+sub get_fh {
+	my ($self, $key) = @_;
+
+	my $path = $self->path_to_key($key);
+	return unless (defined $path);
+
+	return unless ($self->is_valid($key));
+
+	open my $fh, '<', $path or return;
+	return ($fh, $path);
+}
+
+sub set_coderef_fh {
+	my ($self, $key, $code) = @_;
+
+	my $path = $self->path_to_key($key, \my $dir);
+	return unless (defined $path && defined $dir);
+
+	# ensure that directory leading to cache file exists
+	if (!-d $dir) {
+		# mkpath will croak()/die() if there is an error
+		mkpath($dir, 0, 0777);
+	}
+
+	# generate a temporary file
+	my ($fh, $tempfile) = _tempfile_to_path($path, $dir);
+
+	# code writes to filehandle or file
+	$code->($fh, $tempfile);
+
+	close $fh;
+	rename($tempfile, $path)
+		or die "Couldn't rename temporary file '$tempfile' to '$path': $!";
+
+	open $fh, '<', $path or return;
+	return ($fh, $path);
+}
+
+# ($fh, $filename) = $cache->compute_fh($key, $code);
+#
+# Combines the get and set operations in a single call.  Attempts to
+# get $key; if successful, returns the filehandle it can be read from.
+# Otherwise, calls $code passing filehandle to write to as a
+# parameter; contents of this file is then used as the new value for
+# $key; returns filehandle from which one can read newly generated data.
+sub compute_fh {
+	my ($self, $key, $code) = @_;
+
+	my ($fh, $filename) = $self->get_fh($key);
+	if (!defined $fh) {
+		($fh, $filename) = $self->set_coderef_fh($key, $code);
+	}
+
+	return ($fh, $filename);
+}
+
 1;
 __END__
 # end of package GitwebCache::SimpleFileCache;
diff --git a/t/t9503/test_cache_interface.pl b/t/t9503/test_cache_interface.pl
index 9513043..0b4b09f 100755
--- a/t/t9503/test_cache_interface.pl
+++ b/t/t9503/test_cache_interface.pl
@@ -82,6 +82,38 @@ subtest 'CHI interface' => sub {
 	done_testing();
 };
 
+# Test the getting and setting of a cached value
+# (compute_fh interface)
+#
+$call_count = 0;
+sub write_value {
+	my $fh = shift;
+	$call_count++;
+	print {$fh} $value;
+}
+sub compute_fh_output {
+	my ($cache, $key, $code_fh) = @_;
+
+	my ($fh, $filename) = $cache->compute_fh($key, $code_fh);
+
+	local $/ = undef;
+	return <$fh>;
+}
+subtest 'compute_fh interface' => sub {
+	can_ok($cache, qw(compute_fh));
+
+	$cache->remove($key);
+	is(compute_fh_output($cache, $key, \&write_value), $value,
+	   "compute_fh 1st time (set) returns '$value'");
+	is(compute_fh_output($cache, $key, \&write_value), $value,
+	   "compute_fh 2nd time (get) returns '$value'");
+	is(compute_fh_output($cache, $key, \&write_value), $value,
+	   "compute_fh 3rd time (get) returns '$value'");
+	cmp_ok($call_count, '==', 1, 'write_value() is called once from compute_fh');
+
+	done_testing();
+};
+
 # Test cache expiration
 #
 subtest 'cache expiration' => sub {
diff --git a/t/t9504/test_capture_interface.pl b/t/t9504/test_capture_interface.pl
index 47ab804..26c9303 100755
--- a/t/t9504/test_capture_interface.pl
+++ b/t/t9504/test_capture_interface.pl
@@ -6,6 +6,7 @@ use strict;
 use utf8;
 
 use Test::More;
+use File::Compare;
 
 # test source version
 use lib $ENV{GITWEBLIBDIR} || "$ENV{GIT_BUILD_DIR}/gitweb/lib";
@@ -84,6 +85,22 @@ SKIP: {
 	is($captured, '', "doesn't print while capturing");
 }
 
+# Test capturing to file
+#
+my $test_data = 'Capture this';
+open my $fh, '>', 'expected' or die "Couldn't open file for writing: $!";
+print {$fh} $test_data;
+close $fh;
+
+$capture->capture(sub { print $test_data; }, 'actual');
+cmp_ok(compare('expected', 'actual'), '==', 0, 'capturing to file via filename');
+
+open my $fh, '>', 'actual' or die "Couldn't open file for writing: $!";
+$capture->capture(sub { print $test_data; }, $fh);
+close $fh;
+cmp_ok(compare('expected', 'actual'), '==', 0, 'capturing to file via filehandle');
+
+
 done_testing();
 
 # Local Variables:
-- 
1.7.3

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]