[jira] [Commented] (CODEC-63) Implement NYSIIS


ASF GitHub Bot (Jira)

    [ https://issues.apache.org/jira/browse/CODEC-63?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224932#comment-13224932 ]

Gary D. Gregory commented on CODEC-63:
--------------------------------------

Hi Thomas,

Thank you for providing the refreshed patch. Can you do a little more digging on this one please (see below)?

Please find attached my version, which adds more tests; the ones that currently fail are commented out. I also added a boolean option for 'true' Nysiis, which always truncates the code to length 6.
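
To illustrate the boolean option, here is a minimal sketch of the truncation step on its own. It is not the code from the attached patch: the class name, the method name applyStrict, and the constant TRUE_LENGTH are hypothetical, and the input is a placeholder string rather than a real NYSIIS code.

```java
public class StrictTruncationSketch {

    // Assumption: 'true' NYSIIS caps every code at 6 characters.
    private static final int TRUE_LENGTH = 6;

    /**
     * Returns the code unchanged when strict is false, or when it is
     * already short enough; otherwise keeps only the first 6 characters.
     */
    static String applyStrict(String code, boolean strict) {
        if (code == null) {
            return null;
        }
        if (strict && code.length() > TRUE_LENGTH) {
            return code.substring(0, TRUE_LENGTH);
        }
        return code;
    }

    public static void main(String[] args) {
        // "ABCDEFGH" is a placeholder, not an actual encoder result.
        System.out.println(applyStrict("ABCDEFGH", true));
        System.out.println(applyStrict("ABCDEFGH", false));
    }
}
```

Keeping the truncation as a final, separate step means the same encoding logic serves both modes, so a test can assert the strict and non-strict results for one input side by side.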

This needs more work to figure out why the tests fail.

I used the data here: http://www.dropby.com/NYSIISTextStrings.html

Note the 'Original' and 'Modified' columns.

Our code sometimes matches one column, sometimes the other.

I imagine the dropby data is by no means canonical, but we should document clearly in the tests and in the code what is what.
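
One possible way to make "what is what" explicit in the tests is to classify each result against both columns. The sketch below is only an assumption about how that could look: the class, the Match enum, and the classify helper are hypothetical, and the strings in main are placeholders, not values from the dropby page.

```java
public class ColumnMatchSketch {

    /** Which reference column(s) an encoder result agrees with. */
    enum Match { BOTH, ORIGINAL_ONLY, MODIFIED_ONLY, NEITHER }

    /**
     * Compares an encoder result with the expected values from the
     * 'Original' and 'Modified' columns and reports which one it matches.
     */
    static Match classify(String actual, String original, String modified) {
        boolean matchesOriginal = actual.equals(original);
        boolean matchesModified = actual.equals(modified);
        if (matchesOriginal && matchesModified) {
            return Match.BOTH;
        }
        if (matchesOriginal) {
            return Match.ORIGINAL_ONLY;
        }
        if (matchesModified) {
            return Match.MODIFIED_ONLY;
        }
        return Match.NEITHER;
    }

    public static void main(String[] args) {
        // Placeholder values only; real tests would use the encoder
        // output and the two columns of the reference table.
        System.out.println(classify("X1", "X1", "X2"));
        System.out.println(classify("X3", "X1", "X2"));
    }
}
```

Recording the classification per name would show at a glance whether the implementation follows one column consistently or mixes the two.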

Right now, I do not know if our implementation is correct or buggy for the commented out tests.

Granted I've not spent much time on this.

The unit tests do not get 100% code coverage either, which should be a goal for a new codec. 100% coverage does not guarantee correct implementation of course, but it's a start to make sure we at least test what we have.

At least, the new tests brought the coverage from 98%/93% line/branch to 100%/94%, so we've got that going for us ;)


Gary
               

> Implement NYSIIS
> ----------------
>
>                 Key: CODEC-63
>                 URL: https://issues.apache.org/jira/browse/CODEC-63
>             Project: Commons Codec
>          Issue Type: New Feature
>    Affects Versions: 1.x
>            Reporter: Henri Yandell
>             Fix For: 1.x
>
>         Attachments: CODEC-63-reworked.tar, Nysiis.patch
>
>
> http://en.wikipedia.org/wiki/NYSIIS
