Performing GZIP incremental compression/decompression on the fly


 



I'm trying to perform GZIP compression and decompression "on the
fly", that is, without any files on disk involved; ideally something that
incrementally writes to a string buffer in memory, from which the
compressed (or decompressed) data can be read chunk by chunk for
further processing:


$gz = new GZIPWRITER();
do {
    $gz->write("...chunk of data to compress...");
    $compressed = $gz->read();
    do_something_else($compressed);
} while(still data to write);
$gz->close();
// drain the last chunk:
$compressed = $gz->read();
do_something_else($compressed);


The (fictional) class GZIPWRITER used above should provide a writer
method that compresses the data and stores it in an internal buffer,
and a reader method that retrieves the compressed data from that
buffer. All the processing can then be performed by feeding this
object small chunks of data, which can readily be extracted for
further processing.
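Just to make the goal concrete, here is a sketch of how such a class could look. It relies on the incremental deflate API (deflate_init()/deflate_add()) that the zlib extension exposes on PHP >= 7.0; the class itself, its name and its methods are invented for illustration, following the pseudo-code above:

```php
<?php
// Sketch of the fictional GZIPWRITER on top of the incremental
// deflate API available since PHP 7.0 (deflate_init/deflate_add).
class GZIPWriter {
	private $ctx;          // incremental deflate context
	private $buf = '';     // compressed bytes not yet read()

	public function __construct(int $level = 9) {
		$this->ctx = deflate_init(ZLIB_ENCODING_GZIP, ['level' => $level]);
	}

	// Compress a chunk; output accumulates in the internal buffer.
	// Note: zlib may buffer internally, so a write() does not
	// necessarily make new compressed bytes available immediately.
	public function write(string $data): void {
		$this->buf .= deflate_add($this->ctx, $data, ZLIB_NO_FLUSH);
	}

	// Drain whatever compressed bytes are available so far.
	public function read(): string {
		$out = $this->buf;
		$this->buf = '';
		return $out;
	}

	// Flush the remaining data and emit the GZIP trailer.
	public function close(): void {
		$this->buf .= deflate_add($this->ctx, '', ZLIB_FINISH);
	}
}
```

Decompression would be symmetric, using inflate_init(ZLIB_ENCODING_GZIP) and inflate_add() instead.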

Looking at the standard zlib library, I did not find anything that
fits this need: gzopen() requires a file, while gzencode()/gzdecode()
both operate on a string that must contain the whole data.

An attempt I made uses the php_user_filter class. The idea here is
that the filter() method captures each chunk of compressed data; in this
code sample the captured chunk is simply sent to stdout just to see the
result:


<?php

// Filter class:
class EchoGZIP extends php_user_filter {

	function filter($in, $out, &$consumed, $closing) {
		while( ($bucket = stream_bucket_make_writeable($in)) !== null ) {
			//stream_bucket_append($out, $bucket);
			echo "[Captured: ", urlencode($bucket->data), "]\n";
			$consumed += $bucket->datalen;
		}
		return PSFS_PASS_ON;
	}

}

// Register filter class:
if( ! stream_filter_register("EchoGZIP", "EchoGZIP") )
	die("Failed to register filter");

// Write the compressed file:
$gz = gzopen("/dev/null", "wb9");
stream_filter_append($gz, "EchoGZIP", STREAM_FILTER_WRITE, null);
gzwrite($gz, "xxx"); // write some data...
gzwrite($gz, "yyy"); // ...even more data...
// ...and so on.
gzclose($gz);
?>

Note how the destination file is set to "/dev/null", which should
reduce file access overhead to a minimum; moreover, the call to
stream_bucket_append() is commented out, since actually writing to
/dev/null would be pointless.
Unfortunately, the code above does not work as intended, as it displays:

[Captured: xxx]
[Captured: yyy]

Apparently, the filter() method only captures the uncompressed plain
data stream just *before* it enters the GZIP algorithm, not *after*.

Is there any other way to perform something like this?
Or can the code above be fixed in some way?
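One direction I have not fully explored: instead of a user filter, attach
PHP's built-in zlib.deflate compression filter to an in-memory stream, so
no file on disk is involved at all. This is untested on my side; the
'window' => 31 parameter is my assumption for requesting a GZIP wrapper
rather than raw DEFLATE output (following zlib's deflateInit2 convention):

```php
<?php
// Idea: compress into memory with the built-in zlib.deflate filter.
$mem = fopen("php://memory", "w+b");
$filter = stream_filter_append($mem, "zlib.deflate", STREAM_FILTER_WRITE,
	["level" => 9, "window" => 31]); // window 31 = GZIP wrapper (assumed)
fwrite($mem, "xxx"); // write some data...
fwrite($mem, "yyy"); // ...even more data.
// Removing the filter flushes its pending compressed bytes to the stream:
stream_filter_remove($filter);
rewind($mem);
$compressed = stream_get_contents($mem);
fclose($mem);
```

This still compresses everything before reading it back, though, so it
only covers part of the incremental use case described above.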

Thanks,

Regards,
 ___ 
/_|_\  Umberto Salsi
\/_\/  www.icosaedro.it


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php