[jira] [Updated] (CODEC-63) Implement Nysiis


Gilles (Jira)

     [ https://issues.apache.org/jira/browse/CODEC-63?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Neidhart updated CODEC-63:

    Attachment: CODEC-63-reworked.tar

It is a pity that such a feature request, with a patch attached, has stayed open for so long. I have reworked the patch and added more unit tests (compared against http://www.dropby.com/NYSIIS.html).

The implementation at dropby has some bugs though, which have been addressed in this implementation:

 * No key character should be appended if it is the same as the previous one. This also applies to the first key character, so a result must not begin with the same character twice (e.g. SSNAT). Allowing that would also defeat the purpose of the code, which is to map similar names to a single key.

 * In the dropby example, certain names are transcoded incorrectly, e.g. KOEHN -> C, where the result should be CAN.
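
To illustrate the first point, here is a minimal sketch of the "do not repeat the previous key character" rule. It is not taken from the attached patch: the class name KeyDedupSketch and both method names are hypothetical, and the sketch covers only the de-duplication step, not the NYSIIS transcoding rules themselves.

```java
public final class KeyDedupSketch {

    // Hypothetical helper: appends c to the key unless it equals the
    // character appended last. An empty key always accepts the character.
    public static void appendIfDifferent(StringBuilder key, char c) {
        if (key.length() == 0 || key.charAt(key.length() - 1) != c) {
            key.append(c);
        }
    }

    // Hypothetical helper: applies the rule to a whole, already transcoded
    // character sequence, so runs of identical characters collapse to one,
    // including a run at the very beginning.
    public static String collapse(String transcoded) {
        StringBuilder key = new StringBuilder(transcoded.length());
        for (int i = 0; i < transcoded.length(); i++) {
            appendIfDifferent(key, transcoded.charAt(i));
        }
        return key.toString();
    }

    public static void main(String[] args) {
        System.out.println(collapse("SSNAT")); // prints SNAT
    }
}
```

Because the check is made against the last character already in the key, the first key character is handled by the same code path as every later one, so no special case is needed for the start of the result.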

Note: the implementation is optimized for readability and not for performance.

> Implement Nysiis
> ----------------
>                 Key: CODEC-63
>                 URL: https://issues.apache.org/jira/browse/CODEC-63
>             Project: Commons Codec
>          Issue Type: New Feature
>    Affects Versions: 1.x
>            Reporter: Henri Yandell
>             Fix For: 1.x
>         Attachments: CODEC-63-reworked.tar, Nysiis.patch
> http://en.wikipedia.org/wiki/NYSIIS
