[GitHub] commons-text pull request #49: TEXT-89: UTF-32 support for WordUtils.initial...

[GitHub] commons-text pull request #49: TEXT-89: UTF-32 support for WordUtils.initial...

mureinik
GitHub user arunvinudss opened a pull request:

    https://github.com/apache/commons-text/pull/49

    TEXT-89: UTF-32 support for WordUtils.initials

    @chtompki Adding support for surrogate pairs to WordUtils.initials. Characters outside the BMP can now be used, and unit tests for surrogate pairs have been added.
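
    As an illustration of what surrogate-pair support means for an initials-style method, here is a minimal sketch that walks a string by code point rather than by `char`. The class name `CodePointInitials` and its whitespace-only word splitting are assumptions made for this example; it is not the code from the pull request.

```java
// Hypothetical sketch, not the commons-text implementation.
final class CodePointInitials {

    private CodePointInitials() { }

    /** Returns the first code point of each whitespace-separated word. */
    static String initials(String text) {
        if (text == null || text.isEmpty()) {
            return text;
        }
        StringBuilder out = new StringBuilder();
        boolean atWordStart = true;
        for (int i = 0; i < text.length(); ) {
            // codePointAt combines a high/low surrogate pair into one code point.
            int cp = text.codePointAt(i);
            if (Character.isWhitespace(cp)) {
                atWordStart = true;
            } else if (atWordStart) {
                // appendCodePoint writes one or two chars as needed.
                out.appendCodePoint(cp);
                atWordStart = false;
            }
            // Advance by 1 char for BMP code points, 2 for supplementary ones.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the BMP and needs two chars in a Java String.
        String smiley = new String(Character.toChars(0x1F600));
        String result = initials(smiley + "abc def");
        System.out.println(result.equals(smiley + "d"));
        System.out.println(result.codePointCount(0, result.length()));
    }
}
```

    Indexing with `charAt` instead would keep only the high surrogate of the first word and produce a malformed string.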

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/arunvinudss/commons-text TEXT-89

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/commons-text/pull/49.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #49
   
----
commit 321e08e368ae4da92079a66dff7eeb0de89b6072
Author: Arun Vinud <[hidden email]>
Date:   2017-06-14T06:46:12Z

    TEXT-89: Added support for UTF-32

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [hidden email] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


[GitHub] commons-text issue #49: TEXT-89: UTF-32 support for WordUtils.initials

mureinik
Github user ecki commented on the issue:

    https://github.com/apache/commons-text/pull/49
 
    I think using UTF-32 to describe UTF-16 with surrogate pairs is a bit misleading. I would adjust the subject and comment, as this does not enable fixed 4-byte characters.



[GitHub] commons-text issue #49: TEXT-89: UTF-32 support for WordUtils.initials using...

mureinik
In reply to this post by mureinik
Github user arunvinudss commented on the issue:

    https://github.com/apache/commons-text/pull/49
 
    @ecki I agree, and that's the best we can do for now to support UTF-32. Thanks for the input 👍



[GitHub] commons-text issue #49: TEXT-89: UTF-32 support for WordUtils.initials using...

mureinik
In reply to this post by mureinik
Github user ameyjadiye commented on the issue:

    https://github.com/apache/commons-text/pull/49
 
    @chtompki, this PR looks good. Shall we merge it?



[GitHub] commons-text issue #49: TEXT-89: UTF-32 support for WordUtils.initials using...

mureinik
In reply to this post by mureinik
Github user ecki commented on the issue:

    https://github.com/apache/commons-text/pull/49
 
    It still uses the misleading term UTF-32.



[GitHub] commons-text issue #49: TEXT-89: UTF-32 support for WordUtils.initials using...

mureinik
In reply to this post by mureinik
Github user jbduncan commented on the issue:

    https://github.com/apache/commons-text/pull/49
 
    It's not clear to me why the ordering of the imports has changed and why all the `assert*()` method imports have been replaced with a wildcard import. Are those intentional?



[GitHub] commons-text issue #49: TEXT-89: UTF-32 support for WordUtils.initials using...

mureinik
In reply to this post by mureinik
Github user arunvinudss commented on the issue:

    https://github.com/apache/commons-text/pull/49
 
    The import order changed because similar packages were grouped together. That was not intended; I believe my IDE did it automatically. I see other test classes using wildcard imports too, so I'm not sure we have a standard for that.



[GitHub] commons-text issue #49: TEXT-89: UTF-32 support for WordUtils.initials using...

mureinik
In reply to this post by mureinik
Github user ameyjadiye commented on the issue:

    https://github.com/apache/commons-text/pull/49
 
    @arunvinudss, can you please rename the PR to say that character processing was switched to code points, and change the commit comment as well? You can do that with `rebase` or `git commit --amend`, followed by a force push. With that, we can accept and close this PR.



[GitHub] commons-text issue #49: TEXT-89: UTF-32 support for WordUtils.initials using...

mureinik
In reply to this post by mureinik
Github user arunvinudss commented on the issue:

    https://github.com/apache/commons-text/pull/49
 
    I'm not sure why I should change the PR name. I never said it provides native support for UTF-32, and it clearly states support using surrogate pairs.



[GitHub] commons-text issue #49: TEXT-89: UTF-32 support for WordUtils.initials using...

mureinik
In reply to this post by mureinik
Github user ameyjadiye commented on the issue:

    https://github.com/apache/commons-text/pull/49
 
    Probably just because the term "UTF-32" could be confusing. @ecki can explain more.



[GitHub] commons-text issue #49: TEXT-89: UTF-32 support for WordUtils.initials using...

mureinik
In reply to this post by mureinik
Github user ecki commented on the issue:

    https://github.com/apache/commons-text/pull/49
 
    It does not provide support for UTF-32 (a fixed 4-byte encoding, which the patch does not implement); it provides support for UTF-16 surrogate pairs (one or two 2-byte code units per code point). The two encodings are not compatible.
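
    To make the size difference concrete, the following sketch encodes the same text in both forms and counts the bytes. The class name `EncodingWidths` is made up for this example, and it assumes the JDK in use ships the `UTF-32BE` charset, which is common but not one of the charsets the Java specification guarantees.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class EncodingWidths {

    private EncodingWidths() { }

    /** Byte length in UTF-16 (big-endian, no byte-order mark). */
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    /** Byte length in UTF-32 (big-endian, no byte-order mark). */
    static int utf32Bytes(String s) {
        // Assumption: the running JDK provides this charset.
        return s.getBytes(Charset.forName("UTF-32BE")).length;
    }

    public static void main(String[] args) {
        String bmp = "A";                                           // U+0041
        String supplementary = new String(Character.toChars(0x1F600)); // U+1F600

        // A BMP character: one 2-byte unit in UTF-16, always 4 bytes in UTF-32.
        System.out.println(utf16Bytes(bmp) + " vs " + utf32Bytes(bmp));
        // A supplementary character: a surrogate pair (2 x 2 bytes) in UTF-16,
        // still a single 4-byte unit in UTF-32.
        System.out.println(utf16Bytes(supplementary) + " vs " + utf32Bytes(supplementary));
    }
}
```

    The byte counts match for the supplementary character, but the bytes themselves differ, which is why data in one form cannot be read as the other.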



[GitHub] commons-text issue #49: TEXT-89: UTF-32 support for WordUtils.initials using...

mureinik
In reply to this post by mureinik
Github user arunvinudss commented on the issue:

    https://github.com/apache/commons-text/pull/49
 
    Okay, I don't mind changing it, but what does UTF-16 with surrogate pairs support? Why did we do this refactoring?



[GitHub] commons-text issue #49: TEXT-89: UTF-32 support for WordUtils.initials using...

mureinik
In reply to this post by mureinik
Github user ecki commented on the issue:

    https://github.com/apache/commons-text/pull/49
 
    I'm not sure I understand the question. Surrogate pairs exist only in UTF-16. UTF-8 uses a multi-byte encoding for code points outside the BMP, and UTF-32 uses 4 bytes per code point (and skips the high/low surrogate regions).
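
    A short sketch of these three points, using only `java.lang.Character` and the standard UTF-8 charset. The class `SurrogateDemo` and its method names are invented for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class SurrogateDemo {

    private SurrogateDemo() { }

    /** True if the string is exactly one UTF-16 high/low surrogate pair. */
    static boolean isSurrogatePair(String s) {
        return s.length() == 2
                && Character.isSurrogatePair(s.charAt(0), s.charAt(1));
    }

    /** Number of bytes the string occupies in UTF-8. */
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** True for code points in the surrogate range U+D800..U+DFFF. */
    static boolean inSurrogateRange(int codePoint) {
        return codePoint >= Character.MIN_SURROGATE
                && codePoint <= Character.MAX_SURROGATE;
    }

    public static void main(String[] args) {
        String smiley = new String(Character.toChars(0x1F600));

        // In UTF-16 the code point is split into a high and a low surrogate.
        System.out.printf("%04X %04X%n", (int) smiley.charAt(0), (int) smiley.charAt(1));
        System.out.println(isSurrogatePair(smiley));

        // UTF-8 has no surrogates; it uses a longer byte sequence instead.
        System.out.println(utf8Bytes("A") + " / " + utf8Bytes(smiley));

        // The surrogate range is reserved for UTF-16 and does not name characters,
        // so a UTF-32 stream never needs to carry values from it.
        System.out.println(inSurrogateRange(0xD83D) + " / " + inSurrogateRange(0x1F600));
    }
}
```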



[GitHub] commons-text issue #49: TEXT-89: UTF-32 support for WordUtils.initials using...

mureinik
In reply to this post by mureinik
Github user ameyjadiye commented on the issue:

    https://github.com/apache/commons-text/pull/49
 
    For a detailed explanation, see:
   
    https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/



[GitHub] commons-text issue #49: TEXT-89: WordUtils.initials support for UTF-16 surro...

mureinik
In reply to this post by mureinik
Github user ameyjadiye commented on the issue:

    https://github.com/apache/commons-text/pull/49
 
    @arunvinudss, please amend the commit comment as well.



[GitHub] commons-text issue #49: TEXT-89: WordUtils.initials support for UTF-16 surro...

mureinik
In reply to this post by mureinik
Github user coveralls commented on the issue:

    https://github.com/apache/commons-text/pull/49
 
   
    [![Coverage Status](https://coveralls.io/builds/12387036/badge)](https://coveralls.io/builds/12387036)
   
    Coverage decreased (-0.2%) to 97.126% when pulling **15c2e4b28686edf6f0807304367dba82ac3d359d on arunvinudss:TEXT-89** into **aaf4aba369ed0b97d17bc9343f763b0d099dbc2f on apache:master**.




[GitHub] commons-text issue #49: TEXT-89: WordUtils.initials support for UTF-16 surro...

mureinik
In reply to this post by mureinik
Github user arunvinudss commented on the issue:

    https://github.com/apache/commons-text/pull/49
 
    @ameyjadiye Done.



[GitHub] commons-text issue #49: TEXT-89: WordUtils.initials support for UTF-16 surro...

mureinik
In reply to this post by mureinik
Github user ameyjadiye commented on the issue:

    https://github.com/apache/commons-text/pull/49
 
    @chtompki, @PascalSchumacher, this looks good to me to merge.



[GitHub] commons-text issue #49: TEXT-89: WordUtils.initials support for UTF-16 surro...

mureinik
In reply to this post by mureinik
Github user chtompki commented on the issue:

    https://github.com/apache/commons-text/pull/49
 
    Cool. Will look at this in just a bit.



[GitHub] commons-text pull request #49: TEXT-89: WordUtils.initials support for UTF-1...

mureinik
In reply to this post by mureinik
Github user asfgit closed the pull request at:

    https://github.com/apache/commons-text/pull/49

