[LANG] - suggestions for new Identity Hash code classes

classic Classic list List threaded Threaded
14 messages Options
Reply | Threaded
Open this post in threaded view
|

[LANG] - suggestions for new Identity Hash code classes

sebb-2-2
I think I've found a fix to LANG-459 - which is to borrow some code
from Apache Axis.

The required code is a single short class - IDKey (e.g.
http://www.jdocs.com/axis/1.4/org/apache/axis/utils/IDKey.html)

It could be added as a private class of HashCodeBuilder, but it has
wider use, so should probably be added as a public class.

Seems to me it would also be useful to add IdentityHashMap and
IdentityHashSet classes as well, given that LANG still targets Java
1.3.

WDYT?

And do these classes need a new package, or would the top-level package be OK?

S///

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

RE: [LANG] - suggestions for new Identity Hash code classes

Jörg Schaible-3
sebb wrote:

> I think I've found a fix to LANG-459 - which is to borrow some code
> from Apache Axis.
>
> The required code is a single short class - IDKey (e.g.
> http://www.jdocs.com/axis/1.4/org/apache/axis/utils/IDKey.html)
>
> It could be added as a private class of HashCodeBuilder, but it has
> wider use, so should probably be added as a public class.
>
> Seems to me it would also be useful to add IdentityHashMap and
> IdentityHashSet classes as well, given that LANG still targets Java
> 1.3.
>
> WDYT?
>
> And do these classes need a new package, or would the
> top-level package be OK?

Top leve is fine with me.

- Jörg

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: [LANG] - suggestions for new Identity Hash code classes

Simon Kitching
In reply to this post by sebb-2-2
sebb schrieb:
> I think I've found a fix to LANG-459 - which is to borrow some code
> from Apache Axis.
>
> The required code is a single short class - IDKey (e.g.
> http://www.jdocs.com/axis/1.4/org/apache/axis/utils/IDKey.html)
>
> It could be added as a private class of HashCodeBuilder, but it has
> wider use, so should probably be added as a public class.
>  
Seems reasonable to me.  It does look like it will solve LANG-459. And
as you say, this problem may be reasonably common.

What about making the IDKey class final? One thing that does concern me
is that HashSet lookup using an Integer object as the key should be
pretty fast. Using an IDKey object as the key may not perform as well.
Making IDKey final would at least help. And I cannot see any reason why
someone would want to subclass IDKey.

One other concern is regarding garbage collection. From a brief look at
the code, this thread-local registry object contains a set of Integer
hashcodes. With this change, the set will now contain real hard
references to objects, preventing them from being garbage-collected
while they are in the set. Does this matter?
> Seems to me it would also be useful to add IdentityHashMap and
> IdentityHashSet classes as well, given that LANG still targets Java
> 1.3.
>
> WDYT?
>
> And do these classes need a new package, or would the top-level package be OK?
>  

No opinion.

Cheers,
Simon


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: [LANG] - suggestions for new Identity Hash code classes

sebb-2-2
On 16/09/2008, Simon Kitching <[hidden email]> wrote:

> sebb schrieb:
>
> > I think I've found a fix to LANG-459 - which is to borrow some code
> > from Apache Axis.
> >
> > The required code is a single short class - IDKey (e.g.
> >
> http://www.jdocs.com/axis/1.4/org/apache/axis/utils/IDKey.html)
> >
> > It could be added as a private class of HashCodeBuilder, but it has
> > wider use, so should probably be added as a public class.
> >
> >
>  Seems reasonable to me.  It does look like it will solve LANG-459. And as
> you say, this problem may be reasonably common.
>
>  What about making the IDKey class final? One thing that does concern me is
> that HashSet lookup using an Integer object as the key should be pretty
> fast. Using an IDKey object as the key may not perform as well. Making IDKey
> final would at least help. And I cannot see any reason why someone would
> want to subclass IDKey.
>

Good idea. I already made the fields final.

>  One other concern is regarding garbage collection. From a brief look at the
> code, this thread-local registry object contains a set of Integer hashcodes.
> With this change, the set will now contain real hard references to objects,
> preventing them from being garbage-collected while they are in the set. Does
> this matter?

The entries are only retained in the registry during the hashcode
calculation - if any were left behind they could mess up subsequent
calls in the same thread.

As far as I can see, the code always removes any items that are added,
but it would be worth double-checking that.

> > Seems to me it would also be useful to add IdentityHashMap and
> > IdentityHashSet classes as well, given that LANG still targets Java
> > 1.3.
> >
> > WDYT?
> >
> > And do these classes need a new package, or would the top-level package be
> OK?
> >
> >
>
>  No opinion.
>
>  Cheers,
>  Simon
>
>
>
> ---------------------------------------------------------------------
>  To unsubscribe, e-mail: [hidden email]
>  For additional commands, e-mail: [hidden email]
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

RE: [LANG] - suggestions for new Identity Hash code classes

Jörg Schaible-3
In reply to this post by Simon Kitching
Hi Simon,

Simon Kitching wrote:
[snip]
> One other concern is regarding garbage collection. From a
> brief look at
> the code, this thread-local registry object contains a set of Integer
> hashcodes. With this change, the set will now contain real hard
> references to objects, preventing them from being garbage-collected
> while they are in the set. Does this matter?

Good catch. I am quite sure this will matter, unfortunately. HashCodeBuilder.reflectionHashCode(Object) is widely used and if a new lang version suddenly keeps those values, we might prevent redeployments that worked so far and causing memory leaks. Using a WeakReference instead? We're getting slow :-/

- Jörg

BTW: The current implementation of HashCodeBuilder.reflectionHashCode(Object) is not wrong, its the class implementation that uses EqualsBuilder.reflectionEquals(o, this) together with it. We might therefore also create a new method HashCodeBuilder.reflectionHashCodeRespectingEquals(Object) ...

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

RE: [LANG] - suggestions for new Identity Hash code classes

Jörg Schaible-3
In reply to this post by sebb-2-2
sebb wrote:
> On 16/09/2008, Simon Kitching <[hidden email]> wrote:
[snip]

>>  One other concern is regarding garbage collection. From a brief
>> look at the code, this thread-local registry object contains a set
>> of Integer hashcodes. With this change, the set will now contain
>> real hard references to objects, preventing them from being
>> garbage-collected while they are in the set. Does this matter?
>
> The entries are only retained in the registry during the hashcode
> calculation - if any were left behind they could mess up subsequent
> calls in the same thread.
>
> As far as I can see, the code always removes any items that are added,
> but it would be worth double-checking that.

In that case forget my other comment ;-)

- Jörg

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: [LANG] - suggestions for new Identity Hash code classes

Simon Kitching
In reply to this post by sebb-2-2
sebb schrieb:

> On 16/09/2008, Simon Kitching <[hidden email]> wrote:
>  
>
>>  One other concern is regarding garbage collection. From a brief look at the
>> code, this thread-local registry object contains a set of Integer hashcodes.
>> With this change, the set will now contain real hard references to objects,
>> preventing them from being garbage-collected while they are in the set. Does
>> this matter?
>>    
>
> The entries are only retained in the registry during the hashcode
> calculation - if any were left behind they could mess up subsequent
> calls in the same thread.
>
> As far as I can see, the code always removes any items that are added,
> but it would be worth double-checking that.
>  

I did suspect that the objects are removed from the registry before
normal return. So then the IDKey approach should be fine.

But it would seem to me to be safer to have a call to registry.remove()
to ensure that the threadlocal object is completely cleared, rather than
just relying on the code to correctly balance adds/removes on the
HashSet. Or a call to getRegistry().clear(). Or at least a comment on
the registry stating that it is always empty on return from the
reflectionHashCode method etc.

It's not quite clear to me why a threadlocal is being used here. I'm
guessing it is for performance to avoid recreating the HashSet object.
Or is it to "tunnel" the registry set between cooperating classes? If
the latter, then registry.remove() seems a good idea when it is no
longer used, to avoid leaking an object per thread.

Sorry for the vague comments; I don't have time to really get to grips
with the code ATM..

Regards,
Simon


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: [LANG] - suggestions for new Identity Hash code classes

sebb-2-2
In reply to this post by Jörg Schaible-3
On 16/09/2008, Jörg Schaible <[hidden email]> wrote:

> Hi Simon,
>
>  Simon Kitching wrote:
>  [snip]
>
> > One other concern is regarding garbage collection. From a
>  > brief look at
>  > the code, this thread-local registry object contains a set of Integer
>  > hashcodes. With this change, the set will now contain real hard
>  > references to objects, preventing them from being garbage-collected
>  > while they are in the set. Does this matter?
>
>
> Good catch. I am quite sure this will matter, unfortunately. HashCodeBuilder.reflectionHashCode(Object) is widely used and if a new lang version suddenly keeps those values, we might prevent redeployments that worked so far and causing memory leaks. Using a WeakReference instead? We're getting slow :-/
>
>  - Jörg
>
>  BTW: The current implementation of HashCodeBuilder.reflectionHashCode(Object) is not wrong, its the class implementation that uses EqualsBuilder.reflectionEquals(o, this) together with it.

I disagree: the current implementation assumes that it can uniquely
map objects to Integers. This assumption does not depend on the
definition of equals.

> We might therefore also create a new method HashCodeBuilder.reflectionHashCodeRespectingEquals(Object) ...
>
>  ---------------------------------------------------------------------
>  To unsubscribe, e-mail: [hidden email]
>  For additional commands, e-mail: [hidden email]
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: [LANG] - suggestions for new Identity Hash code classes

sebb-2-2
In reply to this post by Simon Kitching
On 16/09/2008, Simon Kitching <[hidden email]> wrote:

> sebb schrieb:
>
> > On 16/09/2008, Simon Kitching <[hidden email]> wrote:
> >
> >
> > >  One other concern is regarding garbage collection. From a brief look at
> the
> > > code, this thread-local registry object contains a set of Integer
> hashcodes.
> > > With this change, the set will now contain real hard references to
> objects,
> > > preventing them from being garbage-collected while they are in the set.
> Does
> > > this matter?
> > >
> > >
> >
> > The entries are only retained in the registry during the hashcode
> > calculation - if any were left behind they could mess up subsequent
> > calls in the same thread.
> >
> > As far as I can see, the code always removes any items that are added,
> > but it would be worth double-checking that.
> >
> >
>
>  I did suspect that the objects are removed from the registry before normal
> return. So then the IDKey approach should be fine.
>
>  But it would seem to me to be safer to have a call to registry.remove() to
> ensure that the threadlocal object is completely cleared, rather than just
> relying on the code to correctly balance adds/removes on the HashSet.

The add and remove are in the same try block; remove is in the finally clause.

> Or a call to getRegistry().clear().

Could do, but seems like overkill.

> Or at least a comment on the registry stating
> that it is always empty on return from the reflectionHashCode method etc.

OK.

>  It's not quite clear to me why a threadlocal is being used here. I'm
> guessing it is for performance to avoid recreating the HashSet object.

I don't know for sure, but that is probably the case.

> Or is
> it to "tunnel" the registry set between cooperating classes?

Not at present, although the registy access methods are package protected.

That may be a mistake...

> If the latter,
> then registry.remove() seems a good idea when it is no longer used, to avoid
> leaking an object per thread.
>
>  Sorry for the vague comments; I don't have time to really get to grips with
> the code ATM..

NP, it's always useful to thrash such things out at the design stage
rather than after deployment ...

>  Regards,
>
>  Simon
>
>
> ---------------------------------------------------------------------
>  To unsubscribe, e-mail: [hidden email]
>  For additional commands, e-mail: [hidden email]
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

RE: [LANG] - suggestions for new Identity Hash code classes

Jörg Schaible-3
sebb wrote:
> On 16/09/2008, Simon Kitching <[hidden email]> wrote:
[snip]

>>  It's not quite clear to me why a threadlocal is being used here. I'm
>> guessing it is for performance to avoid recreating the HashSet
>> object.
>
> I don't know for sure, but that is probably the case.
>
>> Or is
>> it to "tunnel" the registry set between cooperating classes?
>
> Not at present, although the registy access methods are
> package protected.
>
> That may be a mistake...

My guess: It's used in case of a circular object graph ...

- Jörg

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: [LANG] - suggestions for new Identity Hash code classes

sebb-2-2
On 16/09/2008, Jörg Schaible <[hidden email]> wrote:

> sebb wrote:
>  > On 16/09/2008, Simon Kitching <[hidden email]> wrote:
>
> [snip]
>
> >>  It's not quite clear to me why a threadlocal is being used here. I'm
>  >> guessing it is for performance to avoid recreating the HashSet
>  >> object.
>  >
>  > I don't know for sure, but that is probably the case.
>  >
>  >> Or is
>  >> it to "tunnel" the registry set between cooperating classes?
>  >
>  > Not at present, although the registy access methods are
>  > package protected.
>  >
>  > That may be a mistake...
>
>
> My guess: It's used in case of a circular object graph ...
>

Yes, they are used in the enclosing class for just that purpose.

But as far as I can see there is no need for the methods to be
anything other than private - and the contents of the HashMap only
contain anything during the calculation of the hashcode, so it's not
obvious how the methods could be called anyway.

I meant it was probably a mistake for the methods to be package protected.

>  - Jörg
>
>
>  ---------------------------------------------------------------------
>  To unsubscribe, e-mail: [hidden email]
>  For additional commands, e-mail: [hidden email]
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: [LANG] - suggestions for new Identity Hash code classes

Simon Kitching
sebb schrieb:

> On 16/09/2008, Jörg Schaible <[hidden email]> wrote:
>  
>> sebb wrote:
>>  > On 16/09/2008, Simon Kitching <[hidden email]> wrote:
>>
>> [snip]
>>
>>    
>>>>  It's not quite clear to me why a threadlocal is being used here. I'm
>>>>        
>>  >> guessing it is for performance to avoid recreating the HashSet
>>  >> object.
>>  >
>>  > I don't know for sure, but that is probably the case.
>>  >
>>  >> Or is
>>  >> it to "tunnel" the registry set between cooperating classes?
>>  >
>>  > Not at present, although the registy access methods are
>>  > package protected.
>>  >
>>  > That may be a mistake...
>>
>>
>> My guess: It's used in case of a circular object graph ...
>>
>>    
>
> Yes, they are used in the enclosing class for just that purpose.
>
> But as far as I can see there is no need for the methods to be
> anything other than private - and the contents of the HashMap only
> contain anything during the calculation of the hashcode, so it's not
> obvious how the methods could be called anyway.
>
> I meant it was probably a mistake for the methods to be package protected.
>  
In that case, I wonder why it bothers to use a thread-local at all,
rather than just passing the registry object around as a parameter...

Is the performance benefit of avoiding creation of a HashSet object
really measurable given that this class is doing significant amounts of
reflection?

Because this class never calls ThreadLocal.remove(), the created HashSet
object will remain attached to every thread that ever was active when a
HashCodeBuilder call was made. That's not *too* bad, ie it seems to me
that it would be acceptable as long as there was a corresponding benefit
somewhere. But I can't see the benefit ATM..


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: [LANG] - suggestions for new Identity Hash code classes

sebb-2-2
In reply to this post by sebb-2-2
On 16/09/2008, sebb <[hidden email]> wrote:

> I think I've found a fix to LANG-459 - which is to borrow some code
>  from Apache Axis.
>
>  The required code is a single short class - IDKey (e.g.
>  http://www.jdocs.com/axis/1.4/org/apache/axis/utils/IDKey.html)
>
>  It could be added as a private class of HashCodeBuilder, but it has
>  wider use, so should probably be added as a public class.
>
>  Seems to me it would also be useful to add IdentityHashMap and
>  IdentityHashSet classes as well, given that LANG still targets Java
>  1.3.

Just noticed that Commons Collection already has IdentityMap, so no
point adding it to Lang.

>  WDYT?
>
>  And do these classes need a new package, or would the top-level package be OK?
>
>
>  S///
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: [LANG] - suggestions for new Identity Hash code classes

sebb-2-2
In reply to this post by Simon Kitching
On 16/09/2008, Simon Kitching <[hidden email]> wrote:

> sebb schrieb:
>
> > On 16/09/2008, Jörg Schaible <[hidden email]> wrote:
> >
> >
> > > sebb wrote:
> > >  > On 16/09/2008, Simon Kitching <[hidden email]> wrote:
> > >
> > > [snip]
> > >
> > >
> > >
> > > >
> > > > >  It's not quite clear to me why a threadlocal is being used here.
> I'm
> > > > >
> > > > >
> > > >
> > >  >> guessing it is for performance to avoid recreating the HashSet
> > >  >> object.
> > >  >
> > >  > I don't know for sure, but that is probably the case.
> > >  >
> > >  >> Or is
> > >  >> it to "tunnel" the registry set between cooperating classes?
> > >  >
> > >  > Not at present, although the registy access methods are
> > >  > package protected.
> > >  >
> > >  > That may be a mistake...
> > >
> > >
> > > My guess: It's used in case of a circular object graph ...
> > >
> > >
> > >
> >
> > Yes, they are used in the enclosing class for just that purpose.
> >
> > But as far as I can see there is no need for the methods to be
> > anything other than private - and the contents of the HashMap only
> > contain anything during the calculation of the hashcode, so it's not
> > obvious how the methods could be called anyway.
> >
> > I meant it was probably a mistake for the methods to be package protected.
> >
> >
>  In that case, I wonder why it bothers to use a thread-local at all, rather
> than just passing the registry object around as a parameter...
>
>  Is the performance benefit of avoiding creation of a HashSet object really
> measurable given that this class is doing significant amounts of reflection?
>
>  Because this class never calls ThreadLocal.remove(), the created HashSet
> object will remain attached to every thread that ever was active when a
> HashCodeBuilder call was made. That's not *too* bad, ie it seems to me that
> it would be acceptable as long as there was a corresponding benefit
> somewhere. But I can't see the benefit ATM..

Ditto.

I think it would make also sense to privatise the add/remove etc methods.

However, this would change the API.
But as that part of the API seems to be useless, I would not object to
it being removed.

>
> ---------------------------------------------------------------------
>  To unsubscribe, e-mail: [hidden email]
>  For additional commands, e-mail: [hidden email]
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]