[jira] [Created] (COLLECTIONS-417) AbstractLinkedList.retainAll() is very slow

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

[jira] [Created] (COLLECTIONS-417) AbstractLinkedList.retainAll() is very slow

AD_LB (Jira)
Adrian Nistor created COLLECTIONS-417:
-----------------------------------------

             Summary: AbstractLinkedList.retainAll() is very slow
                 Key: COLLECTIONS-417
                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-417
             Project: Commons Collections
          Issue Type: Bug
    Affects Versions: 3.2.1
         Environment: java 1.6.0_24
Ubuntu 11.10
            Reporter: Adrian Nistor
         Attachments: Test.java, patch.diff

Hi,

I am encountering a performance problem in
AbstractLinkedList.retainAll().  It appears in version 3.2.1 and also
in revision 1355448.  I attached a test that exposes this problem and
a one-line patch that fixes it.  On my machine, for this test, the
patch provides a 158X speedup.

To run the test, just do:

$ java Test

The output for the un-patched version is:
Time is 5531

The output for the patched version is:
Time is 35

As the patch shows, the problem is that
"AbstractLinkedList.retainAll(Collection<?> coll)" performs
"coll.contains(it.next())" for each element in the AbstractLinkedList,
which can be very expensive if "coll.contains()" is expensive, e.g.,
when "coll" is a list.

The one-line patch I attached puts the elements of "coll" in a HashSet
(which has very fast "contains()"), if "coll" is not already a set:

"if (!(coll instanceof java.util.Set<?>)) coll = new java.util.HashSet<Object>(coll);"

Is this a bug, or am I misunderstanding the intended behavior? If so,
can you please confirm that the patch is correct?

Thanks,

Adrian


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       
Reply | Threaded
Open this post in threaded view
|

[jira] [Updated] (COLLECTIONS-417) AbstractLinkedList.retainAll() is very slow

AD_LB (Jira)

     [ https://issues.apache.org/jira/browse/COLLECTIONS-417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrian Nistor updated COLLECTIONS-417:
--------------------------------------

    Attachment: Test.java
                patch.diff
   

> AbstractLinkedList.retainAll() is very slow
> -------------------------------------------
>
>                 Key: COLLECTIONS-417
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-417
>             Project: Commons Collections
>          Issue Type: Bug
>    Affects Versions: 3.2.1
>         Environment: java 1.6.0_24
> Ubuntu 11.10
>            Reporter: Adrian Nistor
>         Attachments: Test.java, patch.diff
>
>
> Hi,
> I am encountering a performance problem in
> AbstractLinkedList.retainAll().  It appears in version 3.2.1 and also
> in revision 1355448.  I attached a test that exposes this problem and
> a one-line patch that fixes it.  On my machine, for this test, the
> patch provides a 158X speedup.
> To run the test, just do:
> $ java Test
> The output for the un-patched version is:
> Time is 5531
> The output for the patched version is:
> Time is 35
> As the patch shows, the problem is that
> "AbstractLinkedList.retainAll(Collection<?> coll)" performs
> "coll.contains(it.next())" for each element in the AbstractLinkedList,
> which can be very expensive if "coll.contains()" is expensive, e.g.,
> when "coll" is a list.
> The one-line patch I attached puts the elements of "coll" in a HashSet
> (which has very fast "contains()"), if "coll" is not already a set:
> "if (!(coll instanceof java.util.Set<?>)) coll = new java.util.HashSet<Object>(coll);"
> Is this a bug, or am I misunderstanding the intended behavior? If so,
> can you please confirm that the patch is correct?
> Thanks,
> Adrian

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       
Reply | Threaded
Open this post in threaded view
|

[jira] [Commented] (COLLECTIONS-417) AbstractLinkedList.retainAll() is very slow

AD_LB (Jira)
In reply to this post by AD_LB (Jira)

    [ https://issues.apache.org/jira/browse/COLLECTIONS-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13445397#comment-13445397 ]

Adrian Nistor commented on COLLECTIONS-417:
-------------------------------------------

I am attaching a patch with the javadoc clarification on the runtime
complexity of the method.  This javadoc patch is almost identical to
the javadoc added for COLLECTIONS-416 and COLLECTIONS-418.  As
discussed in COLLECTIONS-416, users should use a data structure for
the elements to be retained which supports a fast implementation of
contains.
               

> AbstractLinkedList.retainAll() is very slow
> -------------------------------------------
>
>                 Key: COLLECTIONS-417
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-417
>             Project: Commons Collections
>          Issue Type: Bug
>    Affects Versions: 3.2.1
>         Environment: java 1.6.0_24
> Ubuntu 11.10
>            Reporter: Adrian Nistor
>         Attachments: docPatch.diff, patch.diff, Test.java
>
>
> Hi,
> I am encountering a performance problem in
> AbstractLinkedList.retainAll().  It appears in version 3.2.1 and also
> in revision 1355448.  I attached a test that exposes this problem and
> a one-line patch that fixes it.  On my machine, for this test, the
> patch provides a 158X speedup.
> To run the test, just do:
> $ java Test
> The output for the un-patched version is:
> Time is 5531
> The output for the patched version is:
> Time is 35
> As the patch shows, the problem is that
> "AbstractLinkedList.retainAll(Collection<?> coll)" performs
> "coll.contains(it.next())" for each element in the AbstractLinkedList,
> which can be very expensive if "coll.contains()" is expensive, e.g.,
> when "coll" is a list.
> The one-line patch I attached puts the elements of "coll" in a HashSet
> (which has very fast "contains()"), if "coll" is not already a set:
> "if (!(coll instanceof java.util.Set<?>)) coll = new java.util.HashSet<Object>(coll);"
> Is this a bug, or am I misunderstanding the intended behavior? If so,
> can you please confirm that the patch is correct?
> Thanks,
> Adrian

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
Reply | Threaded
Open this post in threaded view
|

[jira] [Updated] (COLLECTIONS-417) AbstractLinkedList.retainAll() is very slow

AD_LB (Jira)
In reply to this post by AD_LB (Jira)

     [ https://issues.apache.org/jira/browse/COLLECTIONS-417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrian Nistor updated COLLECTIONS-417:
--------------------------------------

    Attachment: docPatch.diff
   

> AbstractLinkedList.retainAll() is very slow
> -------------------------------------------
>
>                 Key: COLLECTIONS-417
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-417
>             Project: Commons Collections
>          Issue Type: Bug
>    Affects Versions: 3.2.1
>         Environment: java 1.6.0_24
> Ubuntu 11.10
>            Reporter: Adrian Nistor
>         Attachments: docPatch.diff, patch.diff, Test.java
>
>
> Hi,
> I am encountering a performance problem in
> AbstractLinkedList.retainAll().  It appears in version 3.2.1 and also
> in revision 1355448.  I attached a test that exposes this problem and
> a one-line patch that fixes it.  On my machine, for this test, the
> patch provides a 158X speedup.
> To run the test, just do:
> $ java Test
> The output for the un-patched version is:
> Time is 5531
> The output for the patched version is:
> Time is 35
> As the patch shows, the problem is that
> "AbstractLinkedList.retainAll(Collection<?> coll)" performs
> "coll.contains(it.next())" for each element in the AbstractLinkedList,
> which can be very expensive if "coll.contains()" is expensive, e.g.,
> when "coll" is a list.
> The one-line patch I attached puts the elements of "coll" in a HashSet
> (which has very fast "contains()"), if "coll" is not already a set:
> "if (!(coll instanceof java.util.Set<?>)) coll = new java.util.HashSet<Object>(coll);"
> Is this a bug, or am I misunderstanding the intended behavior? If so,
> can you please confirm that the patch is correct?
> Thanks,
> Adrian

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
Reply | Threaded
Open this post in threaded view
|

[jira] [Resolved] (COLLECTIONS-417) AbstractLinkedList.retainAll() is very slow

AD_LB (Jira)
In reply to this post by AD_LB (Jira)

     [ https://issues.apache.org/jira/browse/COLLECTIONS-417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Neidhart resolved COLLECTIONS-417.
-----------------------------------------

       Resolution: Fixed
    Fix Version/s: 4.0

Applied patch in r1388146. Thanks for the patch!
               

> AbstractLinkedList.retainAll() is very slow
> -------------------------------------------
>
>                 Key: COLLECTIONS-417
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-417
>             Project: Commons Collections
>          Issue Type: Bug
>    Affects Versions: 3.2.1
>         Environment: java 1.6.0_24
> Ubuntu 11.10
>            Reporter: Adrian Nistor
>             Fix For: 4.0
>
>         Attachments: docPatch.diff, patch.diff, Test.java
>
>
> Hi,
> I am encountering a performance problem in
> AbstractLinkedList.retainAll().  It appears in version 3.2.1 and also
> in revision 1355448.  I attached a test that exposes this problem and
> a one-line patch that fixes it.  On my machine, for this test, the
> patch provides a 158X speedup.
> To run the test, just do:
> $ java Test
> The output for the un-patched version is:
> Time is 5531
> The output for the patched version is:
> Time is 35
> As the patch shows, the problem is that
> "AbstractLinkedList.retainAll(Collection<?> coll)" performs
> "coll.contains(it.next())" for each element in the AbstractLinkedList,
> which can be very expensive if "coll.contains()" is expensive, e.g.,
> when "coll" is a list.
> The one-line patch I attached puts the elements of "coll" in a HashSet
> (which has very fast "contains()"), if "coll" is not already a set:
> "if (!(coll instanceof java.util.Set<?>)) coll = new java.util.HashSet<Object>(coll);"
> Is this a bug, or am I misunderstanding the intended behavior? If so,
> can you please confirm that the patch is correct?
> Thanks,
> Adrian

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira