[jira] [Updated] (CODEC-166) Base64 could be faster

Gary D. Gregory (Jira)

     [ https://issues.apache.org/jira/browse/CODEC-166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julius Davies updated CODEC-166:

    Attachment: CODEC-166.draft.patch

Here's one way of doing it: retrofit MiGBase64.java so that it becomes our back end for all the byte[] and String-based methods.

All the unit tests still pass!  :-)

Of course this patch still needs some work to clean up documentation and code style, but I thought I'd put it out there for comment.

Here's the benchmark run now:

  LARGE DATA new byte[12345]

iHarder...
encode 471.0 MB/s    decode 158.0 MB/s
encode 495.0 MB/s    decode 155.0 MB/s

MiGBase64...
encode 497.0 MB/s    decode 215.0 MB/s
encode 510.0 MB/s    decode 211.0 MB/s

Apache Commons Codec...
encode 556.0 MB/s    decode 224.0 MB/s
encode 553.0 MB/s    decode 226.0 MB/s

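Compared with the earlier run quoted below, encode is now roughly 4 times as fast (about 140 MB/s up to about 555 MB/s) and decode is about 50% faster (about 148 MB/s up to about 225 MB/s).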

> Base64 could be faster
> ----------------------
>                 Key: CODEC-166
>                 URL: https://issues.apache.org/jira/browse/CODEC-166
>             Project: Commons Codec
>          Issue Type: Bug
>    Affects Versions: 1.7
>            Reporter: Julius Davies
>            Assignee: Julius Davies
>         Attachments: base64bench.zip, CODEC-166.draft.patch
> Our Base64 is consistently about 3 times slower than MiGBase64 and iHarder in the byte[] and String encode() methods.
> We are pretty good on decode(), though a little slower (approx. 33% slower) than MiGBase64.
> We always win in the Streaming methods (MiGBase64 doesn't do streaming).  Yay!  :-) :-) :-)
> I put together a benchmark.  Here's a typical run:
> {noformat}
>   LARGE DATA new byte[12345]
> iHarder...
> encode 486.0 MB/s    decode 158.0 MB/s
> encode 491.0 MB/s    decode 148.0 MB/s
> MiGBase64...
> encode 499.0 MB/s    decode 222.0 MB/s
> encode 493.0 MB/s    decode 226.0 MB/s
> Apache Commons Codec...
> encode 142.0 MB/s    decode 146.0 MB/s
> encode 138.0 MB/s    decode 150.0 MB/s
> {noformat}
> I believe the main approach we can consider to improve performance is to avoid array copies at all costs.   MiGBase64 even counts the number of valid Base64 characters ahead of time on decode() to precalculate the result's size and avoid any array copying!
> I suspect this will mean writing out separate execution paths for the String and byte[] methods, and keeping them out of the streaming logic, since the streaming logic is founded on array copy.
> Unfortunately this means we will diminish internal reuse of the streaming implementation, but I think it's the only way to improve performance, if we want to.
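
To make the "count first, then allocate once" idea from the description concrete, here is a minimal Java sketch. It is not taken from MiGBase64 or from the attached patch: the class name ExactSizeBase64Decoder and its methods are hypothetical, it handles only the standard alphabet, and it silently skips every byte outside that alphabet (padding, line breaks, garbage) without validating the input. The point is only that a first pass over the input gives the exact output size, so the result array never has to be grown or copied.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical illustration only; not the MiGBase64 or Commons Codec implementation. */
final class ExactSizeBase64Decoder {

    // Maps a byte value to its 6-bit Base64 value, or -1 if it is outside the alphabet.
    private static final int[] DECODE_TABLE = new int[256];

    static {
        Arrays.fill(DECODE_TABLE, -1);
        String alphabet =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
        for (int i = 0; i < alphabet.length(); i++) {
            DECODE_TABLE[alphabet.charAt(i)] = i;
        }
    }

    /** First pass: count only the bytes that belong to the Base64 alphabet. */
    static int countValidChars(byte[] in) {
        int count = 0;
        for (byte b : in) {
            if (DECODE_TABLE[b & 0xFF] >= 0) {
                count++;
            }
        }
        return count;
    }

    /** Each valid character carries 6 bits, so n characters yield floor(6n / 8) bytes. */
    static int decodedLength(int validChars) {
        return (int) ((long) validChars * 6 / 8);
    }

    /** Second pass: decode straight into an array of exactly the right size. */
    static byte[] decode(byte[] in) {
        byte[] out = new byte[decodedLength(countValidChars(in))];
        int buffer = 0; // pending bits that have not been written yet
        int bits = 0;   // how many pending bits the buffer holds
        int pos = 0;
        for (byte b : in) {
            int v = DECODE_TABLE[b & 0xFF];
            if (v < 0) {
                continue; // skip padding, line breaks, and anything else non-alphabet
            }
            buffer = (buffer << 6) | v;
            bits += 6;
            if (bits >= 8) {
                bits -= 8;
                out[pos++] = (byte) (buffer >> bits);
                buffer &= (1 << bits) - 1; // keep only the bits still pending
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] encoded = "SGVsbG8s\r\nIFdvcmxkIQ==".getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(decode(encoded), StandardCharsets.US_ASCII));
    }
}
```

The trade-off is two passes over the input instead of one, in exchange for a single allocation and no intermediate buffer. Whether that wins in practice is what the benchmark is for.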
