[TEXT] TEXT-10 A more complex Levenshtein distance

don jeba
Hello,
I am new to open source contribution.
I recently submitted a pull request to commons-text. I don't know whether I am missing any step in the procedure for contributing to commons-text. Kindly correct me so that I can do what is necessary for someone to review and comment on my code.
Jira: TEXT-10

https://github.com/apache/commons-text/pull/6

Kindly advise.
Thank you,
Regards,
Don Jeba.

Re: [TEXT] TEXT-10 A more complex Levenshtein distance

Bruno P. Kinoshita-3
Hi Don Jeba,

I will have a look at your implementation and compare it with a recent improvement in [lang]:
https://github.com/apache/commons-lang/blob/78134f6b3f1facd019e604d2cd000c4ce7cf9a0a/src/main/java/org/apache/commons/lang3/StringUtils.java#L7710

Instead of keeping a matrix (or even only two rows), the current version in StringUtils keeps just one array and a couple of temporary helper variables.

Not sure if we can re-use it while adding the new features in TEXT-10 (i.e. insert/delete/substitution counts), but if possible that'd be better.

Cheers
Bruno
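
For readers following along, the single-array idea Bruno describes can be sketched roughly as below. This is a minimal illustration written for this thread, not the StringUtils code at the link above; the class name SingleArrayLevenshtein and all variable names are assumptions.

```java
final class SingleArrayLevenshtein {

    // Returns the edit distance only; no per-operation counts are available.
    static int distance(CharSequence s, CharSequence t) {
        int n = s.length();
        int m = t.length();
        if (n == 0) {
            return m;
        }
        if (m == 0) {
            return n;
        }
        // row[i] holds the distance between s[0..i) and the prefix of t seen so far.
        int[] row = new int[n + 1];
        for (int i = 0; i <= n; i++) {
            row[i] = i;
        }
        for (int j = 1; j <= m; j++) {
            int upperLeft = row[0]; // value of the previous row at column 0
            row[0] = j;
            char tj = t.charAt(j - 1);
            for (int i = 1; i <= n; i++) {
                int upper = row[i]; // previous-row value, about to be overwritten
                int cost = s.charAt(i - 1) == tj ? 0 : 1;
                row[i] = Math.min(Math.min(row[i - 1] + 1, upper + 1), upperLeft + cost);
                upperLeft = upper;
            }
        }
        return row[n];
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // prints 3
    }
}
```

Because each row overwrites the previous one, only the final distance survives; the path that produced it is gone by the time the loop finishes.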




----- Original Message -----

> From: don jeba <[hidden email]>
> To: Commons Developers List <[hidden email]>
> Sent: Monday, 17 October 2016 1:51 AM
> Subject: [TEXT] TEXT-10 A more complex Levenshtein distance

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: [TEXT] TEXT-10 A more complex Levenshtein distance

don jeba
Hi Bruno,
Thank you for the review.
The one in [lang] returns only the total distance (inserts + deletes + substitutions). To get the individual counts, my understanding is that I need to walk back through the matrix along the diagonal, and at each step I also need the elements to the left of and above the current element to decide whether the step was an insertion, a deletion or a substitution. For this reason I have used a two-dimensional array.
Regards,
Don Jeba.
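
The backtracking described above can be sketched as follows. This is a minimal sketch that assumes the full matrix is kept; it is not the code from the pull request, and the class name MatrixBacktrace and the result layout {inserts, deletes, substitutes} are hypothetical. Operations are counted for turning the first string into the second.

```java
final class MatrixBacktrace {

    // Returns {inserts, deletes, substitutes} for one optimal edit script.
    static int[] count(CharSequence left, CharSequence right) {
        int n = left.length();
        int m = right.length();
        int[][] d = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) {
            d[i][0] = i; // i deletions turn left[0..i) into ""
        }
        for (int j = 0; j <= m; j++) {
            d[0][j] = j; // j insertions turn "" into right[0..j)
        }
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int cost = left.charAt(i - 1) == right.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + cost);
            }
        }
        // Walk back from the bottom-right corner, classifying each step.
        int inserts = 0, deletes = 0, substitutes = 0;
        int i = n, j = m;
        while (i > 0 || j > 0) {
            if (i > 0 && j > 0 && left.charAt(i - 1) == right.charAt(j - 1)
                    && d[i][j] == d[i - 1][j - 1]) {
                i--; j--;                 // characters match, no edit
            } else if (i > 0 && j > 0 && d[i][j] == d[i - 1][j - 1] + 1) {
                substitutes++; i--; j--;  // diagonal step with cost 1
            } else if (i > 0 && d[i][j] == d[i - 1][j] + 1) {
                deletes++; i--;           // step up: a character of left was removed
            } else {
                inserts++; j--;           // step left: a character of right was added
            }
        }
        return new int[] {inserts, deletes, substitutes};
    }

    public static void main(String[] args) {
        int[] r = count("kitten", "sitting");
        // kitten -> sitting: 1 insert, 0 deletes, 2 substitutes
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

When several optimal edit scripts exist, the order of the checks in the loop decides which one is counted; this sketch prefers substitutions over deletions over insertions.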

      From: Bruno P. Kinoshita <[hidden email]>
 To: Commons Developers List <[hidden email]>; don jeba <[hidden email]>
 Sent: Monday, 17 October 2016 7:37 AM
 Subject: Re: [TEXT] TEXT-10 A more complex Levenshtein distance
   

Re: [TEXT] TEXT-10 A more complex Levenshtein distance

Bruno P. Kinoshita-3
Hi Don,

Could you take a look at the tabs/spaces in the pull request, please?

In the meantime, I'll play with the code to see if

i) we can use the two-array algorithm instead, where we analyse two rows at a time instead of keeping the whole matrix. We will probably have to compute the LevenshteinResults on the fly for that, instead of in a separate method.

ii) it would be doable to use the one array + temporary variables algorithm instead, and also compute the insert/delete/substitute counts on the fly.

I just need a couple of hours to play with the code and run your tests to make sure it is working :)

Cheers
Bruno
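
As an illustration of what computing the counts "on the fly" with two rows might look like, here is one possible sketch. It is an assumption about a conceivable approach, not code from the pull request or from [lang]: every cell carries its own counters next to its distance and copies them from whichever neighbour supplied the minimum. The class name TwoRowCounts and the cell layout {distance, inserts, deletes, substitutes} are hypothetical.

```java
final class TwoRowCounts {

    // Returns {distance, inserts, deletes, substitutes} for turning left into right.
    static int[] compute(CharSequence left, CharSequence right) {
        int n = left.length();
        int m = right.length();
        int[][] prev = new int[m + 1][];
        int[][] curr = new int[m + 1][];
        // Row 0: turning "" into right[0..j) takes j inserts.
        for (int j = 0; j <= m; j++) {
            prev[j] = new int[] {j, j, 0, 0};
        }
        for (int i = 1; i <= n; i++) {
            // Column 0: turning left[0..i) into "" takes i deletes.
            curr[0] = new int[] {i, 0, i, 0};
            for (int j = 1; j <= m; j++) {
                int[] diag = prev[j - 1];
                int[] up = prev[j];
                int[] side = curr[j - 1];
                int cost = left.charAt(i - 1) == right.charAt(j - 1) ? 0 : 1;
                int viaDiag = diag[0] + cost;
                int viaUp = up[0] + 1;
                int viaSide = side[0] + 1;
                if (viaDiag <= viaUp && viaDiag <= viaSide) {
                    // match (cost 0) or substitution (cost 1)
                    curr[j] = new int[] {viaDiag, diag[1], diag[2], diag[3] + cost};
                } else if (viaUp <= viaSide) {
                    // deletion of left[i - 1]
                    curr[j] = new int[] {viaUp, up[1], up[2] + 1, up[3]};
                } else {
                    // insertion of right[j - 1]
                    curr[j] = new int[] {viaSide, side[1] + 1, side[2], side[3]};
                }
            }
            int[][] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[m];
    }

    public static void main(String[] args) {
        int[] r = compute("kitten", "sitting");
        // distance 3 = 1 insert + 0 deletes + 2 substitutes
        System.out.println(r[0] + " " + r[1] + " " + r[2] + " " + r[3]);
    }
}
```

The trade-off in this sketch is four integers per cell and a fresh small array for each of them, so whether it would actually beat the full matrix in practice is something that would need measuring.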



>________________________________
> From: don jeba <[hidden email]>
>To: Commons Developers List <[hidden email]>; Bruno P. Kinoshita <[hidden email]>
>Sent: Monday, 17 October 2016 11:44 PM
>Subject: Re: [TEXT] TEXT-10 A more complex Levenshtein distance

Re: [TEXT] TEXT-10 A more complex Levenshtein distance

don jeba
Hi Bruno,
The formatting comment is now fixed. I will be more careful about this next time.
My understanding is that we need to traverse back from the bottom-right corner of the matrix to find whether each step is an insert, a delete or a substitute. I might be wrong.
Regarding using a 1D array instead of a 2D array, I think it should be possible. I will also give it a try.
Thank you,
Regards,
Don Jeba.

      From: Bruno P. Kinoshita <[hidden email]>
 To: Commons Developers List <[hidden email]>; don jeba <[hidden email]>
 Sent: Monday, 24 October 2016 8:15 AM
 Subject: Re: [TEXT] TEXT-10 A more complex Levenshtein distance
   

Re: [TEXT] TEXT-10 A more complex Levenshtein distance

don jeba
Hi Bruno,
The existing array named "d" contains, at any given instant, the data for only one row of the matrix.
My understanding is that, to get the number of inserts, deletes and substitutes, we need to build the complete matrix and walk back through it along the diagonal to get the values.
Considering this, I can't use the existing array "d" to find the inserts, deletes and substitutes.
Correct me if I am wrong in the above.
Do you want me to use a new 1D array (which contains the entire data from the matrix) instead of the 2D array? Would 1D vs 2D improve performance?
Kindly comment.
Do let me know if I am not clear in the above.
Thank you,
Regards,
Don Jeba.
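
To make the 1D-versus-2D question concrete, here is a minimal sketch of the same full matrix stored in a single flat array in row-major order. The class name FlatMatrixLevenshtein and the method names are hypothetical, and nothing here is taken from the pull request.

```java
final class FlatMatrixLevenshtein {

    // Fills the whole (rows x cols) matrix into one array; cell (i, j) lives at i * cols + j.
    static int[] fill(CharSequence left, CharSequence right) {
        int rows = left.length() + 1;
        int cols = right.length() + 1;
        int[] d = new int[rows * cols];
        for (int i = 0; i < rows; i++) {
            d[i * cols] = i;       // first column
        }
        for (int j = 0; j < cols; j++) {
            d[j] = j;              // first row
        }
        for (int i = 1; i < rows; i++) {
            for (int j = 1; j < cols; j++) {
                int cost = left.charAt(i - 1) == right.charAt(j - 1) ? 0 : 1;
                int up = d[(i - 1) * cols + j] + 1;
                int side = d[i * cols + j - 1] + 1;
                int diag = d[(i - 1) * cols + j - 1] + cost;
                d[i * cols + j] = Math.min(Math.min(up, side), diag);
            }
        }
        return d;
    }

    static int distance(CharSequence left, CharSequence right) {
        int[] d = fill(left, right);
        return d[d.length - 1];    // bottom-right corner of the matrix
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // prints 3
    }
}
```

The flat array stores exactly as many cells as the 2D version, so the backtracking still works with the index mapping above. The difference is one contiguous allocation instead of an array of row arrays; any speed-up from that would have to be measured rather than assumed.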

      From: don jeba <[hidden email]>
 To: Commons Developers List <[hidden email]>; Bruno P. Kinoshita <[hidden email]>
 Sent: Tuesday, 25 October 2016 9:15 PM
 Subject: Re: [TEXT] TEXT-10 A more complex Levenshtein distance
   

Re: [TEXT] TEXT-10 A more complex Levenshtein distance

Bruno P. Kinoshita-3
Hi Don,

I spent some time yesterday playing with the current code in [text], trying to count deletes/inserts/substitutions [1], but it doesn't seem to be very easy, if it is possible at all, with just two arrays representing the last two rows instead of the full matrix.

I see no reason for having your pull request pending then. Let's review and merge it, and we can think about trying any further optimisations later. I'm syncing the repo, and will take a look at the pull request right now.

Thanks a lot for your contribution, and for your patience :)
Bruno


[1] https://github.com/kinow/commons-text/tree/WIP-led2

>________________________________
> From: don jeba <[hidden email]>
>To: Commons Developers List <[hidden email]>; Bruno P. Kinoshita <[hidden email]>
>Sent: Monday, 7 November 2016 5:57 PM
>Subject: Re: [TEXT] TEXT-10 A more complex Levenshtein distance

Re: [TEXT] TEXT-10 A more complex Levenshtein distance

don jeba
Thank you, Bruno, for merging the changes.
The approach I know for finding the individual counts is to traverse the matrix diagonally from the bottom-right corner. If there is some other way, do let me know a reference so that I can look into it further.
Thank you,
Regards,
Don Jeba.

      From: Bruno P. Kinoshita <[hidden email]>
 To: don jeba <[hidden email]>; Commons Developers List <[hidden email]>
 Sent: Tuesday, 8 November 2016 12:52 PM
 Subject: Re: [TEXT] TEXT-10 A more complex Levenshtein distance
   