[jira] [Created] (IO-277) ReaderInputStream enters infinite loop when it encounters an unmappable character

[jira] [Created] (IO-277) ReaderInputStream enters infinite loop when it encounters an unmappable character

ASF GitHub Bot (Jira)
ReaderInputStream enters infinite loop when it encounters an unmappable character
---------------------------------------------------------------------------------

                 Key: IO-277
                 URL: https://issues.apache.org/jira/browse/IO-277
             Project: Commons IO
          Issue Type: Bug
          Components: Streams/Writers
    Affects Versions: 2.0.1
            Reporter: Mike Thomas


ReaderInputStream.read(byte[] b, int off, int len) enters an infinite loop when its CharsetEncoder encounters an unmappable character in the input buffer.

Once the encoder hits an unmappable character, lastCoderResult.isUnmappable() is true and Reader.read() is never invoked on the underlying Reader again.
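
The stall can be illustrated with the JDK encoder alone. What follows is a minimal sketch, not the Commons IO source: the class and method names are hypothetical, and US-ASCII is chosen only because it cannot map 'é'. Under the default REPORT action the encoder stops at the unmappable character without consuming it, so a caller that simply retries makes no progress.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it is not part of Commons IO.
final class UnmappableStallDemo {

    /**
     * Calls encode() up to 'attempts' times on the same buffers and returns
     * the input position after the last call. The encoder keeps its default
     * CodingErrorAction.REPORT for unmappable characters.
     */
    static int positionAfterRetries(String text, int attempts) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder();
        CharBuffer in = CharBuffer.wrap(text);
        ByteBuffer out = ByteBuffer.allocate(64);
        for (int i = 0; i < attempts; i++) {
            CoderResult result = encoder.encode(in, out, false);
            if (!result.isUnmappable()) {
                break; // all available input was encoded
            }
            // A caller that ignores the result and retries arrives here
            // again with the same input position.
        }
        return in.position();
    }

    public static void main(String[] args) {
        System.out.println(positionAfterRetries("ab\u00e9cd", 1)); // prints 2
        System.out.println(positionAfterRetries("ab\u00e9cd", 5)); // prints 2
    }
}
```

The position stays at the unmappable character no matter how many times encode() is retried, which matches the symptom described above.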

I am attaching a source file that reproduces this behavior.

One possible fix is to call CharsetEncoder.onUnmappableCharacter(CodingErrorAction) in the ReaderInputStream constructor with an action other than the default, CodingErrorAction.REPORT. For example:

public ReaderInputStream(Reader reader, Charset charset, int bufferSize) {
    this.reader = reader;
    encoder = charset.newEncoder();
    // Substitute unmappable characters instead of reporting them.
    encoder.onUnmappableCharacter(CodingErrorAction.REPLACE);
    ...

Replacing the unmappable character with the encoder's default replacement prevents the infinite loop. I am not sure that is the ideal behavior, but it appears consistent with what org.apache.commons.io.output.WriterOutputStream does.
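
As a small illustration of what REPLACE does, here is a sketch that uses only the JDK encoder. The class and method names are hypothetical, and US-ASCII is an assumption made so that 'é' is unmappable. It shows the encoder substituting its default replacement and carrying on with the rest of the input.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it is not part of Commons IO.
final class ReplaceUnmappableDemo {

    /** Encodes text as US-ASCII with REPLACE and decodes the bytes back. */
    static String encodeWithReplace(String text) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder()
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        try {
            ByteBuffer bytes = encoder.encode(CharBuffer.wrap(text));
            return StandardCharsets.US_ASCII.decode(bytes).toString();
        } catch (CharacterCodingException e) {
            // Not expected here: unmappable input is replaced, not reported.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeWithReplace("ab\u00e9cd")); // prints ab?cd
    }
}
```

The characters after the unmappable one are still encoded, so a consumer of the stream keeps receiving bytes.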


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Updated] (IO-277) ReaderInputStream enters infinite loop when it encounters an unmappable character

ASF GitHub Bot (Jira)

     [ https://issues.apache.org/jira/browse/IO-277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Thomas updated IO-277:
---------------------------

    Attachment: TestReaderInputStreamLoop.java



[jira] [Resolved] (IO-277) ReaderInputStream enters infinite loop when it encounters an unmappable character

ASF GitHub Bot (Jira)

     [ https://issues.apache.org/jira/browse/IO-277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niall Pemberton resolved IO-277.
--------------------------------

       Resolution: Fixed
    Fix Version/s: 2.1
         Assignee: Niall Pemberton

I have implemented your suggestion to replace unmappable characters, so the behaviour is now consistent with WriterOutputStream. I have also added constructors to ReaderInputStream and WriterOutputStream that take a CharsetEncoder and a CharsetDecoder respectively, so anyone who wants different behaviour can supply their own. Thanks for reporting this and for the test case.
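
For anyone who wants a different policy, the sketch below shows two ways a caller might configure an encoder before handing it to one of the CharsetEncoder-taking constructors mentioned above. It uses only the JDK, so it does not depend on Commons IO; the class and method names are hypothetical, and US-ASCII is an assumption made for the example.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it is not part of Commons IO.
final class CustomEncoderDemo {

    /** Encoder that substitutes '_' for unmappable characters. */
    static CharsetEncoder underscoreEncoder() {
        return StandardCharsets.US_ASCII.newEncoder()
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .replaceWith(new byte[] {(byte) '_'});
    }

    /** Encoder that silently drops unmappable characters. */
    static CharsetEncoder ignoringEncoder() {
        return StandardCharsets.US_ASCII.newEncoder()
                .onUnmappableCharacter(CodingErrorAction.IGNORE);
    }

    /** Encodes text with the given encoder and decodes it as US-ASCII. */
    static String roundTrip(CharsetEncoder encoder, String text) {
        try {
            ByteBuffer bytes = encoder.encode(CharBuffer.wrap(text));
            return StandardCharsets.US_ASCII.decode(bytes).toString();
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(underscoreEncoder(), "ab\u00e9cd")); // prints ab_cd
        System.out.println(roundTrip(ignoringEncoder(), "ab\u00e9cd"));   // prints abcd
    }
}
```

An encoder configured this way is the kind of object the new ReaderInputStream constructor is described as accepting.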



[jira] [Closed] (IO-277) ReaderInputStream enters infinite loop when it encounters an unmappable character

ASF GitHub Bot (Jira)

     [ https://issues.apache.org/jira/browse/IO-277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary D. Gregory closed IO-277.
------------------------------


Closing; version 2.1 has been released.
               

