[Compress - GZIP] Is it possible to use the --rsyncable option?


Daron Clay
I have an application using the existing tar and gzip output streams.  I
would like to be able to specify the equivalent of gzip --rsyncable when
creating the tgz.

Is this possible?  I didn't see a way to do it in the doc.

--Daron

--
------------------------------------------------------------------------
*Daron Clay*
*ZeroMachine.net*
Software Developer
E: [hidden email]
P: 970-769-4805 (USA)
http://www.zeromachine.net
------------------------------------------------------------------------
Re: [Compress - GZIP] Is it possible to use the --rsyncable option?

Stefan Bodewig
On 2017-08-08, Daron Clay wrote:

> I have an application using the existing tar and gzip output streams.
> I would like to be able to specify the equivalent of gzip --rsyncable
> when creating the tgz.

> Is this possible?  I didn't see a way to do it in the doc.

There is no way to do this, currently.

I've had a look at how gzip implements --rsyncable[1]. Right now we
don't get as much control over the stream as the C code does, because we
use java.util.zip.Deflater, which doesn't provide any option for this.

I think it should be possible to achieve something similar to what
--rsyncable does by using FULL_FLUSH as flush value to deflate() at
certain intervals (like every 8kB or whenever the user code knows it has
finished adding a tar entry's content) but I may be misunderstanding
what FULL_FLUSH does in detail (haven't checked zlib, yet). In any case,
this is nothing you could do from user code. It would require changes to
GzipCompressorOutputStream.
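
To make the FULL_FLUSH idea concrete, here is a minimal sketch that uses
java.util.zip.Deflater directly rather than Commons Compress. The class
name, the method names and the 8 kB interval are assumptions made for
illustration. The output is a zlib-wrapped stream, not a gzip file, and
whether periodic full flushes give rsync the same benefit as gzip's
--rsyncable has not been verified here.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class FullFlushSketch {
    // Assumed interval; the message only suggests "like every 8kB".
    static final int FLUSH_INTERVAL = 8 * 1024;

    /** Compresses data (zlib format), requesting a FULL_FLUSH after every chunk. */
    static byte[] deflateWithFullFlush(byte[] data) {
        Deflater deflater = new Deflater();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            for (int off = 0; off < data.length; off += FLUSH_INTERVAL) {
                int len = Math.min(FLUSH_INTERVAL, data.length - off);
                deflater.setInput(data, off, len);
                int n;
                // A full buffer means there may be more pending output,
                // so call again with the same flush mode.
                do {
                    n = deflater.deflate(buf, 0, buf.length, Deflater.FULL_FLUSH);
                    out.write(buf, 0, n);
                } while (n == buf.length);
            }
            // Terminate the stream properly.
            deflater.finish();
            while (!deflater.finished()) {
                int n = deflater.deflate(buf, 0, buf.length);
                out.write(buf, 0, n);
            }
        } finally {
            deflater.end();
        }
        return out.toByteArray();
    }

    /** Decompresses up to expectedLength bytes; used only to check the round trip. */
    static byte[] inflate(byte[] compressed, int expectedLength) {
        Inflater inflater = new Inflater();
        byte[] result = new byte[expectedLength];
        int pos = 0;
        try {
            inflater.setInput(compressed);
            while (pos < expectedLength) {
                int n = inflater.inflate(result, pos, expectedLength - pos);
                if (n == 0) {
                    break; // stream finished or no more input
                }
                pos += n;
            }
        } catch (DataFormatException ex) {
            throw new IllegalStateException("corrupt stream", ex);
        } finally {
            inflater.end();
        }
        return Arrays.copyOf(result, pos);
    }

    public static void main(String[] args) {
        byte[] data = new byte[100_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i % 97);
        }
        byte[] compressed = deflateWithFullFlush(data);
        byte[] restored = inflate(compressed, data.length);
        System.out.println("compressed bytes: " + compressed.length);
        System.out.println("round trip ok: " + Arrays.equals(data, restored));
    }
}
```

Each full flush costs some compression ratio, so a real implementation
would probably want the interval to be configurable, or tied to tar entry
boundaries as suggested above.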

What may help is using framed lz4 or snappy (snappy's implementation
should be faster than lz4) rather than gz compressor streams. They reset
their internal state after a fixed block size so changes that only apply
to a certain block won't modify subsequent blocks. Unfortunately there
is no way to force a new block from user code for either format. Ideally
you'd tell it to start a fresh block whenever you've finished an entry's
content.

Zips should be easier for rsync than tar.gzs, as each file's content
is compressed separately. Finally there is the obvious solution of not
compressing the tar at all, but you probably knew that.
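
As a rough illustration of the zip alternative, the sketch below writes
each file as its own entry with the JDK's java.util.zip.ZipOutputStream,
so every entry gets its own deflate stream. The class and method names are
made up for this example, and it uses the JDK classes rather than the
Commons Compress zip API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ZipPerEntrySketch {
    /** Writes every map entry as a separate, independently compressed zip entry. */
    static byte[] zip(Map<String, byte[]> files) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                zos.putNextEntry(new ZipEntry(file.getKey()));
                zos.write(file.getValue());
                zos.closeEntry(); // ends this entry's compressed data
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    /** Reads the archive back into a name -> content map. */
    static Map<String, byte[]> unzip(byte[] archive) {
        Map<String, byte[]> result = new LinkedHashMap<>();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            byte[] buf = new byte[4096];
            while ((entry = zis.getNextEntry()) != null) {
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                int n;
                // read() returns -1 at the end of the current entry
                while ((n = zis.read(buf)) != -1) {
                    content.write(buf, 0, n);
                }
                result.put(entry.getName(), content.toByteArray());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return result;
    }
}
```

Because no compression state is shared between entries, a change to one
file should only affect that entry's compressed data (plus the offsets of
whatever follows it), which is the property rsync can take advantage of.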

Stefan

[1] http://www.samba.org/netfilter/diary/gzip.rsync.patch
