[text] TEXT-104 clirr errors, prepare 2.0 or revert change

Bruno P. Kinoshita-2
Hi all,
Just finished merging a pull request for TEXT-104, where the JaroWinkler distance was updated. The class was actually computing a text similarity score, not an edit distance. The contributor did a great job moving the logic into a separate class, then updating the method to return a distance instead.
Later I realized this would break both behaviour and binary compatibility.
So just wondering what others think. Is it time to gather a few more issues in text, maybe even consider updating libraries/java/etc, drop @Deprecated stuff, and prepare a 2.0? Or is it too soon, and should we instead revert the change, move the code to a branch, and update TEXT-104 with a note about the branch?
Cheers,
Bruno
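
To make the similarity/distance distinction above concrete, here is a minimal sketch. It is a textbook Jaro-Winkler computation, not the Commons Text code; the class name JaroWinklerSketch, the 0.1 prefix scale, and the 4-character prefix cap are assumptions for illustration. The point is only that a similarity of 1.0 means "identical", while the corresponding distance is 1 - similarity, so identical strings have distance 0.0.

```java
/** Illustrative only: a textbook Jaro-Winkler, not the Commons Text implementation. */
final class JaroWinklerSketch {
    private static final double PREFIX_SCALE = 0.1; // assumed standard Winkler scaling factor
    private static final int MAX_PREFIX = 4;        // assumed standard prefix cap

    /** Similarity in [0, 1]; 1.0 means the inputs are identical. */
    static double similarity(CharSequence a, CharSequence b) {
        int la = a.length();
        int lb = b.length();
        if (la == 0 && lb == 0) {
            return 1.0;
        }
        if (la == 0 || lb == 0) {
            return 0.0;
        }
        // Characters match only if they are equal and close enough in position.
        int window = Math.max(0, Math.max(la, lb) / 2 - 1);
        boolean[] matchedA = new boolean[la];
        boolean[] matchedB = new boolean[lb];
        int matches = 0;
        for (int i = 0; i < la; i++) {
            int from = Math.max(0, i - window);
            int to = Math.min(lb - 1, i + window);
            for (int j = from; j <= to; j++) {
                if (!matchedB[j] && a.charAt(i) == b.charAt(j)) {
                    matchedA[i] = true;
                    matchedB[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) {
            return 0.0;
        }
        // Count matched characters that appear in a different order in the two inputs.
        int outOfOrder = 0;
        int j = 0;
        for (int i = 0; i < la; i++) {
            if (!matchedA[i]) {
                continue;
            }
            while (!matchedB[j]) {
                j++;
            }
            if (a.charAt(i) != b.charAt(j)) {
                outOfOrder++;
            }
            j++;
        }
        double m = matches;
        double jaro = (m / la + m / lb + (m - outOfOrder / 2.0) / m) / 3.0;
        // Winkler adjustment: reward a shared prefix of up to MAX_PREFIX characters.
        int prefix = 0;
        int limit = Math.min(MAX_PREFIX, Math.min(la, lb));
        while (prefix < limit && a.charAt(prefix) == b.charAt(prefix)) {
            prefix++;
        }
        return jaro + prefix * PREFIX_SCALE * (1.0 - jaro);
    }

    /** Distance in [0, 1]; 0.0 means the inputs are identical. */
    static double distance(CharSequence a, CharSequence b) {
        return 1.0 - similarity(a, b);
    }
}
```

A caller that previously treated the returned value as "higher is more alike" would silently get inverted results if the same method started returning the distance, which is the behavioural break mentioned above.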

Re: [text] TEXT-104 clirr errors, prepare 2.0 or revert change

Benedikt Ritter-4
On Wed, 20 Feb 2019 at 08:58, Bruno P. Kinoshita <[hidden email]> wrote:

> Hi all,
> Just finished merging a pull request to TEXT-104, where the JaroWinkler
> distance was updated. The class was actually computing a text similarity
> score, not an edit distance. The user that contributed did a great job
> moving the logic into a separate class, then updating the method to return
> a distance instead.
> Later I realized this would break both behaviour and binary compatibility.
> So just wondering what others think. Is it time to gather a few more
> issues in text, maybe even consider updating libraries/java/etc, drop
> @Deprecated stuff, and prepare a 2.0? Or is it too soon, and instead revert
> moving the code to a branch, and update TEXT-104 with a note about the
> branch?
>

This would be a bad signal to the contributor. Do you think it's possible
to have both solutions side by side? So we keep the old class with its name
and interface, deprecate it, and put the new solution in the same package
with a different class name?

Benedikt



Re: [text] TEXT-104 clirr errors, prepare 2.0 or revert change

Rob Tompkins


> On Feb 20, 2019, at 5:42 AM, Benedikt Ritter <[hidden email]> wrote:
>
> This would be a bad signal to the contributor. Do you think it's possible
> to have both solutions side by side? So we keep the old class with its name
> and interface, deprecate it, and put the new solution in the same package
> with a different class name?

I like this idea. What if you added an up-versioned package, running v2 and v1 side by side? Maybe too confusing. You could add v2 to the class name. Also maybe a bad idea.

Just some thoughts that ran through my head here.

-Rob

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: [text] TEXT-104 clirr errors, prepare 2.0 or revert change

Pascal Schumacher
In reply to this post by Bruno P. Kinoshita-2
I'm fine with either solution, but my preference would be to remove all
deprecated stuff and release version 2.0.


Re: [text] TEXT-104 clirr errors, prepare 2.0 or revert change

garydgregory
Are we really ready for a 2.0? How much deprecated stuff do we carry?

I plan on taking a closer look at the Jaro-Winkler distance issue tonight or
tomorrow.

Gary

On Wed, Feb 20, 2019, 13:33 Pascal Schumacher <[hidden email]> wrote:

> I'm fine with either solution, but my preference would be to remove all
> deprecated stuff and release version 2.0.

Re: [text] TEXT-104 clirr errors, prepare 2.0 or revert change

Bruno P. Kinoshita-2

We have a few things ported from Lang that are deprecated and could be removed.


But I have reverted my change in this pull request:


https://github.com/apache/commons-text/pull/102


It reintroduces the constant and the method that were removed, and also restores the old code for the edit distance. But the contributed new code is still there (i.e. I did not remove JaroWinklerSimilarity).
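
For readers who want to picture the shape of that change, here is a rough sketch of the pattern, not the contents of the pull request. All names (CompatSketch, NewSimilarity, OldDistance, LEGACY_CONSTANT) are hypothetical, and the metric is a deliberately trivial placeholder rather than Jaro-Winkler. The old class keeps its signature and its historical return value and is only marked @Deprecated, while the new class sits next to it. For brevity the old class delegates here, whereas the pull request as described kept the old code in place.

```java
/** Illustrative only: hypothetical names, placeholder metric. */
final class CompatSketch {

    /** New class: returns a similarity in [0, 1]; 1.0 means identical. */
    static final class NewSimilarity {
        double apply(CharSequence left, CharSequence right) {
            int longest = Math.max(left.length(), right.length());
            if (longest == 0) {
                return 1.0;
            }
            int shortest = Math.min(left.length(), right.length());
            int same = 0;
            // Placeholder metric (NOT Jaro-Winkler): share of positions with equal characters.
            for (int i = 0; i < shortest; i++) {
                if (left.charAt(i) == right.charAt(i)) {
                    same++;
                }
            }
            return (double) same / longest;
        }
    }

    /**
     * Old class, kept so that existing 1.x callers still link and still get the
     * value they always got (a similarity, despite the class name).
     */
    @Deprecated
    static final class OldDistance {
        /** Restored public constant (hypothetical name); removing it would break binary compatibility. */
        @Deprecated
        public static final int LEGACY_CONSTANT = -1;

        /** Same signature and same return value as before the change. */
        @Deprecated
        double apply(CharSequence left, CharSequence right) {
            return new NewSimilarity().apply(left, right);
        }
    }
}
```

With this layout a binary-compatibility check has nothing to flag, because no public member disappears or changes type, and the actual switch to a true distance can wait for a major release.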


This was suggested by another user in the pull request for TEXT-104, and I believe Benedikt and Rob also suggested something similar.


So if there are no objections I will merge it later tonight or tomorrow, and create a ticket in JIRA for 2.0 to replace the code and fix the TODO tags.


This way we can leave 2.0 for later, and possibly discuss other major changes like Java modules, changes for Java 11, etc.


How does that sound?


Bruno

Re: [text] TEXT-104 clirr errors, prepare 2.0 or revert change

Rob Tompkins
Sounds reasonable. But I suppose the question we should ask ourselves is: do we want a 1.7 or a 2.0? I’d be happy with either.

-Rob

Re: [text] TEXT-104 clirr errors, prepare 2.0 or revert change

Bruno P. Kinoshita-3
Same for me. I just provided a solution to unblock 1.7, but I'm happy to go with a 2.0 if the others agree too.
I haven't followed the Java modules work closely, but this is a good opportunity to fix anything required for the new Java versions.
Cheers,
Bruno


Re: [text] TEXT-104 clirr errors, prepare 2.0 or revert change

Bruno P. Kinoshita-2
In reply to this post by Benedikt Ritter-4


Good idea. Another user suggested something similar in the pull request, and I believe Rob's suggestion was in the same direction.

Here's a PR that fixes clirr and deprecates a few things for 2.0: https://github.com/apache/commons-text/pull/102


Thanks!
Bruno