[jira] [Created] (COMPRESS-212) TarArchiveEntry getName() returns wrongly encoded name even when you set encoding to TarArchiveInputStream

Walter Laan (Jira)
Woo Ju Shin created COMPRESS-212:
------------------------------------

             Summary: TarArchiveEntry getName() returns wrongly encoded name even when you set encoding to TarArchiveInputStream
                 Key: COMPRESS-212
                 URL: https://issues.apache.org/jira/browse/COMPRESS-212
             Project: Commons Compress
          Issue Type: Bug
    Affects Versions: 1.4.1
         Environment: Red Hat Enterprise Linux, MS Windows 7
            Reporter: Woo Ju Shin
            Priority: Minor


I have two systems: one runs Red Hat Linux, the other MS Windows.
I created a *.tgz file on Red Hat Linux and tried to decompress it on MS Windows using Commons Compress.
The default system encodings differ: UTF-8 on Red Hat Linux and CP949 on MS Windows.
File names appear to be decoded with the platform default encoding even when I pass the archive's encoding explicitly, as in the following code:

// tgzFile stands for the path of the *.tgz archive; encodingOfRedHatLinux is "UTF-8"
FileInputStream fis = new FileInputStream(new File(tgzFile));
TarArchiveInputStream zis = new TarArchiveInputStream(new BufferedInputStream(fis), encodingOfRedHatLinux);

TarArchiveEntry entry;
while ((entry = (TarArchiveEntry) zis.getNextEntry()) != null)
{
    entry.getName(); // the name comes back decoded as CP949 instead of UTF-8, so it does not match the original file name
}

According to the Javadoc of this constructor:

    /**
     * Constructor for TarInputStream.
     * @param is the input stream to use
     * @param encoding name of the encoding to use for file names
     * @since Commons Compress 1.4
     */
    public TarArchiveInputStream(InputStream is, String encoding) {
        this(is, TarBuffer.DEFAULT_BLKSIZE, TarBuffer.DEFAULT_RCDSIZE, encoding);
    }

the encoding argument should be used for file names.
In practice, however, it does not seem to be applied.
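
The symptom can be illustrated without Commons Compress. The following is a minimal sketch using only the JDK, not the library's actual decoding code: the same file name bytes produce different strings depending on which charset decodes them. The class name, the helper decodeName and the sample file name are hypothetical. ISO-8859-1 stands in for the mismatched platform default, because CP949 may not be available on every JVM.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class FileNameDecodingDemo {

    // Hypothetical helper: decode raw file name bytes with the given charset.
    public static String decodeName(byte[] rawName, Charset charset) {
        return new String(rawName, charset);
    }

    public static void main(String[] args) {
        // Sample Korean file name ("한글.txt"), stored as UTF-8 bytes
        // the way a UTF-8 system would write it.
        String original = "\uD55C\uAE00.txt";
        byte[] rawName = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the encoding the archive was written in restores the name.
        String withArchiveEncoding = decodeName(rawName, StandardCharsets.UTF_8);

        // Decoding with a different charset (assumption: ISO-8859-1 as a stand-in
        // for the mismatched platform default) yields a garbled name.
        String withOtherCharset = decodeName(rawName, StandardCharsets.ISO_8859_1);

        System.out.println(original.equals(withArchiveEncoding)); // prints "true"
        System.out.println(original.equals(withOtherCharset));    // prints "false"
    }
}
```

If the encoding passed to the constructor were applied to every entry name, getName() would be expected to behave like the first case regardless of the platform default.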
