The test code uses UTF-8 to convert the input string to bytes:
inputBuffer = input.getBytes("UTF-8");
but then uses the default encoding to convert the bytes back to a string:
value = bos.toString(); // <== default used here
System.out.println("Decompressed String :-"+value);
Changing this to
value = bos.toString("UTF-8");
fixes the issue.
As the source file is encoded in ISO-8859-1, the test should really use that charset for the byte<=>String conversions. This also fixes the issue.
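For illustration, here is a minimal, self-contained sketch of the round trip with the charset named explicitly on both sides. It is not the attached Base64JunkTest.java: the class and method names (CharsetRoundTrip, compress, decompress) are made up for this example, the JDK's java.util.Base64 stands in for org.apache.commons.codec.binary.Base64 so that it runs without extra jars, and the Deflater/Inflater compression step is assumed. The special characters are written as Unicode escapes so the result does not depend on the encoding of the source file.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class, for illustration only.
class CharsetRoundTrip {

    // String -> UTF-8 bytes -> deflate -> Base64 text.
    static String compress(String input) {
        byte[] inputBuffer = input.getBytes(StandardCharsets.UTF_8);
        Deflater deflater = new Deflater();
        deflater.setInput(inputBuffer);
        deflater.finish();
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            bos.write(buf, 0, n);
        }
        deflater.end();
        return Base64.getEncoder().encodeToString(bos.toByteArray());
    }

    // Base64 text -> inflate -> bytes -> String, decoded with the given charset.
    static String decompress(String encoded, Charset charset) {
        Inflater inflater = new Inflater();
        inflater.setInput(Base64.getDecoder().decode(encoded));
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or invalid data");
                }
                bos.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException(e);
        } finally {
            inflater.end();
        }
        // The charset here must match the one used in compress().
        return new String(bos.toByteArray(), charset);
    }

    public static void main(String[] args) {
        String input = "\u00e7 \u00f5 \u00e3"; // c-cedilla, o-tilde, a-tilde
        String encoded = compress(input);
        // Same charset on both sides: the original string comes back.
        System.out.println(input.equals(decompress(encoded, StandardCharsets.UTF_8)));
        // Mismatched charset: the bytes are intact but decode to different characters.
        System.out.println(input.equals(decompress(encoded, StandardCharsets.ISO_8859_1)));
    }
}
```

Base64 and the compression step only ever see bytes, so they cannot corrupt the characters; the junk appears when the final bytes are decoded with a charset other than the one used to encode them.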
> Issue in populating junk characters
> Key: CODEC-164
> URL: https://issues.apache.org/jira/browse/CODEC-164
> Project: Commons Codec
> Issue Type: Bug
> Affects Versions: 1.7
> Environment: Windows XP, Weblogic, Oracle
> Reporter: Priyesh Jain
> Attachments: Base64JunkTest.java
> Original Estimate: 168h
> Remaining Estimate: 168h
> While decompressing a compressed String (which contains special characters such as "ç", "õ" or "ã") with the API "org.apache.commons.codec.binary.Base64", garbage values are shown.
> While using the Base64 API, we used "UTF-8" as the default encoding.
> I have tried this with both commons-codec-v1.3.jar and commons-codec-v1.7.jar.