[jira] [Commented] (COMPRESS-206) TarArchiveOutputStream sometimes writes garbage beyond the end of the archive


Lars Bruun-Hansen (Jira)

    [ https://issues.apache.org/jira/browse/COMPRESS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541349#comment-13541349 ]

Stefan Bodewig commented on COMPRESS-206:

I didn't mean to imply this was a duplicate of COMPRESS-202 or anything like that; rather, your patch provides an alternative way to fix COMPRESS-202 beyond "just" documenting the behavior.  We'll need to discuss on the dev list how we want to solve it, but to gather more opinions I'll wait a few days until people are back from their holiday breaks.

Given that tar record reading may be willing to consume non-tar input, we may need to be more careful there.  I'll need to look into the documentation of the tar dialects a bit more to see what we might need to accept as part of the archive once the first EOF record has been found.
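
To make the "first EOF record" idea concrete, here is a minimal sketch in plain Java (no Commons Compress calls). It assumes the usual tar layout of 512-byte records, with an all-zero record marking the end of the archive. The class and method names (TarEofScan, firstEofRecordOffset, endOfZeroRun) are hypothetical and not part of any library, and the scan is only valid when every record it inspects is a header position; a real reader would skip entry data using the size stored in each header.

```java
import java.util.Arrays;

// Hypothetical helper, not Commons Compress API: locates the end-of-archive
// marker in a buffer that is assumed to consist of 512-byte tar records.
class TarEofScan {
    static final int RECORD_SIZE = 512;

    // True if the 512 bytes starting at offset are all zero.
    static boolean isZeroRecord(byte[] data, int offset) {
        for (int i = offset; i < offset + RECORD_SIZE; i++) {
            if (data[i] != 0) {
                return false;
            }
        }
        return true;
    }

    // Offset of the first all-zero record, or -1 if there is none.
    // Assumption: every inspected record is a header position, so a zero
    // record cannot be file data that merely happens to be empty.
    static int firstEofRecordOffset(byte[] data) {
        for (int off = 0; off + RECORD_SIZE <= data.length; off += RECORD_SIZE) {
            if (isZeroRecord(data, off)) {
                return off;
            }
        }
        return -1;
    }

    // Offset just past the run of consecutive zero records starting at eofOffset.
    // Anything beyond this point is not zero padding (or is a partial record).
    static int endOfZeroRun(byte[] data, int eofOffset) {
        int off = eofOffset;
        while (off + RECORD_SIZE <= data.length && isZeroRecord(data, off)) {
            off += RECORD_SIZE;
        }
        return off;
    }

    public static void main(String[] args) {
        // One non-zero stand-in record, three zero records, then 4 trailing bytes.
        byte[] stream = new byte[4 * RECORD_SIZE + 4];
        Arrays.fill(stream, 0, RECORD_SIZE, (byte) 'x');
        stream[4 * RECORD_SIZE] = 'A';

        int eof = firstEofRecordOffset(stream);
        System.out.println("first EOF record at " + eof);
        System.out.println("zero run ends at " + endOfZeroRun(stream, eof));
    }
}
```

Whether a reader should stop at the first zero record, after two consecutive ones, or only after the whole padded block is exactly the dialect question raised above; the sketch only reports the offsets and leaves that policy to the caller.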

> TarArchiveOutputStream sometimes writes garbage beyond the end of the archive
> -----------------------------------------------------------------------------
>                 Key: COMPRESS-206
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-206
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Compressors
>    Affects Versions: 1.0, 1.4.1
>         Environment: Linux x86
>            Reporter: Peter De Maeyer
>             Fix For: 1.5
>         Attachments: COMPRESS-206.patch
> For some combinations of file lengths, the archive created by TarArchiveOutputStream writes garbage beyond the end of the TAR stream. TarArchiveInputStream can still read the stream without problems, but it does not read beyond the garbage. This is problematic for my use case because I write a checksum _after_ the TAR content. If I then try to read the checksum back, I read garbage instead.
> Functional impact:
> * TarArchiveInputStream is asymmetrical with respect to TarArchiveOutputStream, in the sense that TarArchiveInputStream does not read everything that was written by TarArchiveOutputStream.
> * The content is unnecessarily large: the garbage adds ~10K of overhead compared to Linux command-line tar.
> This symptom is remarkably similar to #COMPRESS-81, which is supposedly fixed since 1.1. Except for the fact that this issue still exists... I've tested this with 1.0 and 1.4.1.
