[jira] [Created] (COLLECTIONS-420) CompositeCollection.containsAll() is very slow

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

[jira] [Created] (COLLECTIONS-420) CompositeCollection.containsAll() is very slow

Gilles Sadowski (Jira)
Adrian Nistor created COLLECTIONS-420:
-----------------------------------------

             Summary: CompositeCollection.containsAll() is very slow
                 Key: COLLECTIONS-420
                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-420
             Project: Commons Collections
          Issue Type: Bug
    Affects Versions: 3.2.1
         Environment: java 1.6.0_24
Ubuntu 11.10
            Reporter: Adrian Nistor


Hi,

I am encountering a performance problem in
CompositeCollection.containsAll().  It appears in version 3.2.1 and
also in revision 1355448.  I attached a test that exposes this problem
and a patch that fixes it.  On my machine, for this test, the patch
provides a 167X speedup.

To run the test, just do:

$ java Test

The output for the un-patched version is:
Time is 9192

The output for the patched version is:
Time is 54

The problem is that
"CompositeCollection.containsAll(Collection<?> coll)" performs
"contains(item)" for each item in "coll", and "contains(item)"
searches "item" in all collections in CompositeCollection.  This can
be very slow if the collections in CompositeCollection have slow
"contains()", e.g., when these collections are lists.

The patch I attached puts the elements in each collection of
CompositeCollection in a HashSet (which has very fast "contains()") if
that collection is not already a set.  For efficiency, putting
elements in several HashSet objects is performed lazily in the patch.

Is this a bug, or am I misunderstanding the intended behavior?  If so,
can you please confirm that the patch is correct?

Thanks,

Adrian


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       
Reply | Threaded
Open this post in threaded view
|

[jira] [Updated] (COLLECTIONS-420) CompositeCollection.containsAll() is very slow

Gilles Sadowski (Jira)

     [ https://issues.apache.org/jira/browse/COLLECTIONS-420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrian Nistor updated COLLECTIONS-420:
--------------------------------------

    Attachment: Test.java
                patch.diff
   

> CompositeCollection.containsAll() is very slow
> ----------------------------------------------
>
>                 Key: COLLECTIONS-420
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-420
>             Project: Commons Collections
>          Issue Type: Bug
>    Affects Versions: 3.2.1
>         Environment: java 1.6.0_24
> Ubuntu 11.10
>            Reporter: Adrian Nistor
>         Attachments: Test.java, patch.diff
>
>
> Hi,
> I am encountering a performance problem in
> CompositeCollection.containsAll().  It appears in version 3.2.1 and
> also in revision 1355448.  I attached a test that exposes this problem
> and a patch that fixes it.  On my machine, for this test, the patch
> provides a 167X speedup.
> To run the test, just do:
> $ java Test
> The output for the un-patched version is:
> Time is 9192
> The output for the patched version is:
> Time is 54
> The problem is that
> "CompositeCollection.containsAll(Collection<?> coll)" performs
> "contains(item)" for each item in "coll", and "contains(item)"
> searches "item" in all collections in CompositeCollection.  This can
> be very slow if the collections in CompositeCollection have slow
> "contains()", e.g., when these collections are lists.
> The patch I attached puts the elements in each collection of
> CompositeCollection in a HashSet (which has very fast "contains()") if
> that collection is not already a set.  For efficiency, putting
> elements in several HashSet objects is performed lazily in the patch.
> Is this a bug, or am I misunderstanding the intended behavior?  If so,
> can you please confirm that the patch is correct?
> Thanks,
> Adrian

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       
Reply | Threaded
Open this post in threaded view
|

[jira] [Resolved] (COLLECTIONS-420) CompositeCollection.containsAll() is very slow

Gilles Sadowski (Jira)
In reply to this post by Gilles Sadowski (Jira)

     [ https://issues.apache.org/jira/browse/COLLECTIONS-420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Neidhart resolved COLLECTIONS-420.
-----------------------------------------

    Resolution: Won't Fix

CompositeCollection is a decorator for existing collections. Operations shall not alter the backing collections. To support a unified view on collections with a fast contains, users should better use a CompositeSet and convert the collections to a set themselves.
               

> CompositeCollection.containsAll() is very slow
> ----------------------------------------------
>
>                 Key: COLLECTIONS-420
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-420
>             Project: Commons Collections
>          Issue Type: Bug
>    Affects Versions: 3.2.1
>         Environment: java 1.6.0_24
> Ubuntu 11.10
>            Reporter: Adrian Nistor
>         Attachments: patch.diff, Test.java
>
>
> Hi,
> I am encountering a performance problem in
> CompositeCollection.containsAll().  It appears in version 3.2.1 and
> also in revision 1355448.  I attached a test that exposes this problem
> and a patch that fixes it.  On my machine, for this test, the patch
> provides a 167X speedup.
> To run the test, just do:
> $ java Test
> The output for the un-patched version is:
> Time is 9192
> The output for the patched version is:
> Time is 54
> The problem is that
> "CompositeCollection.containsAll(Collection<?> coll)" performs
> "contains(item)" for each item in "coll", and "contains(item)"
> searches "item" in all collections in CompositeCollection.  This can
> be very slow if the collections in CompositeCollection have slow
> "contains()", e.g., when these collections are lists.
> The patch I attached puts the elements in each collection of
> CompositeCollection in a HashSet (which has very fast "contains()") if
> that collection is not already a set.  For efficiency, putting
> elements in several HashSet objects is performed lazily in the patch.
> Is this a bug, or am I misunderstanding the intended behavior?  If so,
> can you please confirm that the patch is correct?
> Thanks,
> Adrian

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira