[text] Next steps.

[text] Next steps.

Rob Tompkins
Hello,

I'm a tad curious what folks (including Gary, Benedikt, and Bruno) think
the next steps are for [text], in the hope that we are eventually
heading towards a 1.0 release. Some thoughts that come to mind are:

(1) Go over [lang] with a fine-tooth comb and see what we think should move,
(2) Go through the Lothaire "Applied Combinatorics on Words" book (
http://lipn.univ-paris13.fr/~duchamp/Books&more/Lothaire/(Encyclopedia_of_Mathematics_and_its_Applications_)M._Lothaire-Applied_Combinatorics_On_Words-Cambridge_University_Press(2005).pdf)
and minimally implement some of the standard algorithms.
(3) Implement, from the Lothaire book, some of the more complex stuff:
heavier pattern matching, and/or natural language processing,
and/or
(4) Go straight for a release.

I'm less in favor of (4) because I think there are probably some smaller bits
of code in [lang] that should come over. I like the idea of (2) before heading
out the door. Regarding (3), I would have to do considerable reading to make
real headway here, which I'm not opposed to doing; it would merely prolong
getting to a 1.0 release if we predicated the release on my getting that
done.

So, what do you guys think?

Cheers,
-Rob

Re: [text] Next steps.

Bruno P. Kinoshita-3
Hi Rob,

First of all, kudos for the great work moving things from [lang] into [text].

I got a copy of the Lothaire book last weekend, but haven't had a chance to read it yet.

There was also some discussion around the name parser, and since we couldn't reach a consensus,
I think we could either start another discussion thread, or stash it somewhere so that
it doesn't block a release.

I would also like to implement more edit distances and string similarity measures, as well as
look into the duration unit parser, probably adapting code from github.com/jchampemont/gunip

But I'd vote for (4): first moving the human name parser elsewhere, reviewing the edit distances,
and checking whether there's anything else from [lang] we could put into this initial release.

Once it has been released, we will be able to add things from the Lothaire book,
more edit distances, maybe bring back the name parser, as well as any enhancements
and bug fixes.

Bruno

>________________________________
> From: Rob Tompkins <[hidden email]>
>To: Commons Developers List <[hidden email]>
>Sent: Tuesday, 29 November 2016 11:45 AM
>Subject: [text] Next steps.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: [text] Next steps.

garydgregory
In reply to this post by Rob Tompkins
If it were just me, I would:

- Port all relevant code (with appropriate agreement from the community)
from [lang] to [text].
- Release [text] 1.0
- Deprecate code in [lang]
- Release [lang] 3.6.
- Continue [text] on to 1.1 with NEW functionality.

IOW, I do not see the point of delaying [text] 1.0 and [lang] 3.6. Once we
establish a clean [text] 1.0, we can move on to adding bells and whistles.

For me, at least initially, the "epic" is to declutter and refocus [lang].

Gary

On Mon, Nov 28, 2016 at 2:44 PM, Rob Tompkins <[hidden email]> wrote:




--
E-Mail: [hidden email] | [hidden email]
Java Persistence with Hibernate, Second Edition
<https://www.amazon.com/gp/product/1617290459/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=1617290459&linkCode=as2&tag=garygregory-20&linkId=cadb800f39946ec62ea2b1af9fe6a2b8>

<http:////ir-na.amazon-adsystem.com/e/ir?t=garygregory-20&l=am2&o=1&a=1617290459>
JUnit in Action, Second Edition
<https://www.amazon.com/gp/product/1935182021/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=1935182021&linkCode=as2&tag=garygregory-20&linkId=31ecd1f6b6d1eaf8886ac902a24de418%22>

<http:////ir-na.amazon-adsystem.com/e/ir?t=garygregory-20&l=am2&o=1&a=1935182021>
Spring Batch in Action
<https://www.amazon.com/gp/product/1935182951/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=1935182951&linkCode=%7B%7BlinkCode%7D%7D&tag=garygregory-20&linkId=%7B%7Blink_id%7D%7D%22%3ESpring+Batch+in+Action>
<http:////ir-na.amazon-adsystem.com/e/ir?t=garygregory-20&l=am2&o=1&a=1935182951>
Blog: http://garygregory.wordpress.com
Home: http://garygregory.com/
Tweet! http://twitter.com/GaryGregory

Re: [text] Next steps.

garydgregory
In reply to this post by Bruno P. Kinoshita-3
+1 to removing the human name code for 1.0. Put it in a branch IMO.

Gary

On Mon, Nov 28, 2016 at 3:12 PM, Bruno P. Kinoshita <
[hidden email]> wrote:


Re: [text] Next steps.

Benedikt Ritter-4
+1 for an early release without the name parser. We can implement the
algorithms from the book in 1.1.

Benedikt

Gary Gregory <[hidden email]> wrote on Tue, Nov 29, 2016 at 00:34:


Re: [text] Next steps.

Rob Tompkins

> On Nov 29, 2016, at 2:02 PM, Benedikt Ritter <[hidden email]> wrote:
>
> +1 for an early release without the name parser. We can implement the
> algorithms from the book in 1.1.

Sounds good, I’ll start heading in that direction.

-Rob
